Search Results: 1 - 10 of 100 matches
All listed articles are free for downloading (OA Articles)
An OAIS Based Approach to Effective Long-term Digital Metadata Curation
Arif Shaon, Andrew Woolf
Computer and Information Science, 2009, DOI: 10.5539/cis.v1n2p2
Abstract: Metadata has the proven ability to provide the information necessary for the successful long-term curation of digital objects. However, without curation, metadata itself may deteriorate in quality and integrity over time. A digital curation process therefore needs to incorporate the curation of metadata along with that of data in order to ensure the accurate description of data over time. Unfortunately, no comprehensive method for the effective long-term curation of metadata is known to exist at present. Even the Reference Model for an Open Archival Information System (OAIS), despite being the most comprehensive and widely adopted framework for long-term data preservation, fails to address the requirements of long-term metadata curation in a comprehensive and unambiguous manner. This paper presents an approach to efficiently curating digital metadata over the long term, achieved by resolving the metadata-curation ambiguities of the OAIS Reference Model. The approach centres on a “Metadata Curation Model”, a specialised edition of the “Data Management” module of the OAIS Reference Model dedicated to the purpose of long-term metadata curation.
Feng Luan, Mads Nygard
International Journal of Digital Information and Wireless Communications, 2011
Abstract: Preservation has become an important infrastructural service for information systems, and much research has been done on it in past decades. The most popular preservation approach is migration, which transfers and/or transforms digital objects between two computers or two generations of computer technology. However, it is difficult for custodians to decide which migration solution to choose, because the selection depends on both the old situation (e.g., the digital objects, technical infrastructure and restriction rules) and the current situation (e.g., system requirements and organizational requirements). Therefore, in order to recover the old situation of an information system, in this paper we design a new solution that retrieves information about the old situation from stored metadata. The viability and efficiency of our approach are evaluated in an experiment in which several sets of image files are migrated.
Data and metadata management automation for an effective approach to sharing environmental data
D’Amore F., Cinnirella S., Pirrone N.
E3S Web of Conferences, 2013, DOI: 10.1051/e3sconf/20130118003
Abstract: The market of geospatial systems offers several applications that handle data. Some of these components are oriented towards managing and storing data, map visualizations and data streams, while others are oriented towards data description and the creation of metadata. None of these applications offers users an overview of the whole process, from the creation of the data up to its export to the end user. In the existing literature there are attempts to automate metadating, that is, the creation of descriptions of data. Some companies are also trying an approach that uses workflow systems to automate the creation of a geospatial dataset. As a result, users of geospatial data are increasingly looking for a more structured process for managing geospatial data and metadata, and any tool that handles this process is likely to find reasonable success within this community. Public entities, mainly local ones, are often called on to comply with regulations that require web systems exposing parameters of a given territory to users/citizens. Likewise, research organizations, especially those dealing with the environment, increasingly find themselves analyzing spatial data. The difficulties that arise when handling such data can be overcome using the approach we propose, which involves a single tool that handles all the steps needed to export spatial data. In this paper we present several methodologies used to manage geospatial data and metadata by means of GeoInt, a middleware tool developed at CNR-IIA that manages geospatial data produced in different research projects. GeoInt offers basic services that permit users to define both data and metadata, to manage map servers, and to control the download and sharing processes. This research illustrates the ways in which GeoInt has been improved to minimize metadata editing by its users.
A Description Driven Approach for Flexible Metadata Tracking
Andrew Branson, Jetendr Shamdasani, Richard McClatchey
Computer Science, 2014
Abstract: Evolving user requirements present a considerable software engineering challenge, all the more so in an environment where data will be stored for a very long time and must remain usable as the system specification evolves around it. Capturing the description of the system addresses this issue, since a description-driven approach enables new versions of data structures and processes to be created alongside the old, thereby providing a history of changes to the underlying data models and enabling the capture of provenance data. This description-driven approach is advocated in this paper, in which a system called CRISTAL is presented. CRISTAL is based on description-driven principles; it can use previous versions of stored descriptions to define various versions of data, which can be stored in various forms. To demonstrate the efficacy of this approach, we present the history of the CRISTAL project at CERN, where it was used to track data and process definitions and their associated provenance in the construction of the CMS ECAL detector; how it was applied to handle analysis tracking and data-index provenance in the neuGRID and N4U projects; and how it will be matured further in the CRISTAL-ISE project. We believe that the CRISTAL approach could be invaluable in handling the evolution, indexing and tracking of large datasets, and we are keen to apply it further in this direction.
A metadata approach to manage and organize electronic documents and collections on the web
Moura, Ana Maria de Carvalho; Pereira, Genelice da Costa; Campos, María Luiza Machado
Journal of the Brazilian Computer Society, 2002, DOI: 10.1590/S0104-65002002000100003
Abstract: In recent years, the number of information sources offered on the web has grown tremendously. Support for accessing these information sources has mostly been concentrated on browsing and search tools. Digital libraries and web directories constitute important initiatives to improve information access, creating and organizing document collections hierarchically according to different criteria. Search tools, on the other hand, offer a more comprehensive coverage of resources, using robot-based services to collect and index documents that can later be accessed using information retrieval techniques. However, the technologies applied to search mechanisms on the web still offer little support for managing document collections, as the associations between documents cannot be explicitly identified, either by their formats or by their types. This paper presents a formal structure for organizing and describing collections and their documents on the web. It is based on a metadata conceptual model which explores relationships between information resources at different levels of granularity. To validate this model, a prototype has been implemented using both a semi-structured and an object-relational database (DB) approach.
Towards Cleaning-up Open Data Portals: A Metadata Reconciliation Approach
Alan Tygel, Sören Auer, Jeremy Debattista, Fabrizio Orlandi, Maria Luiza Machado Campos
Computer Science, 2015
Abstract: This paper presents an approach for metadata reconciliation, curation and linking for Open Governmental Data Portals (ODPs). ODPs have lately become the standard solution for governments willing to make their public data available to society. Portal managers use several types of metadata to organize the datasets, one of the most important being tags. However, the tagging process is subject to many problems, such as synonyms, ambiguity and incoherence, among others. As our empirical analysis of ODPs shows, these issues are currently prevalent in most ODPs and effectively hinder the reuse of Open Data. In order to address these problems, we develop and implement an approach for tag reconciliation in Open Data Portals, encompassing local actions related to individual portals and global actions for adding a semantic metadata layer above individual portals. The local part aims to enhance the quality of tags in a single portal, and the global part is meant to interlink ODPs by establishing relations between tags.
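As an illustration of the local reconciliation step described in the abstract, one simple strategy is to fold case variants, typos and near-duplicates of tags into a canonical form using string similarity. This sketch is hypothetical and is not the paper's actual method; the tag values and similarity cutoff are invented for illustration.

```python
import difflib

def reconcile_tags(tags, cutoff=0.8):
    """Group near-duplicate tags (typos, case variants) under one
    canonical lowercase form using fuzzy string matching."""
    canonical = {}  # canonical tag -> list of raw variants
    for tag in tags:
        key = tag.strip().lower()
        # Look for an already-seen canonical tag similar to this one.
        match = difflib.get_close_matches(key, list(canonical), n=1, cutoff=cutoff)
        if match:
            canonical[match[0]].append(tag)
        else:
            canonical[key] = [tag]
    return canonical

raw = ["Environment", "environment", "enviroment", "Health", "healt", "budget"]
print(reconcile_tags(raw))
# {'environment': ['Environment', 'environment', 'enviroment'],
#  'health': ['Health', 'healt'], 'budget': ['budget']}
```

A real portal would still need a global step (e.g., linking canonical tags across portals), which this local grouping does not attempt.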
A Bayesian Approach to Classifying Supernovae With Color
Natalia Connolly, Brian Connolly
Physics, 2009
Abstract: Upcoming large-scale ground- and space-based supernova surveys will face the challenge of identifying supernova candidates largely without the use of spectroscopy. Over the past several years, a number of supernova identification schemes have been proposed that rely on photometric information only. Some of these schemes use color-color or color-magnitude diagrams; others simply fit supernova data to models. Both of these approaches suffer from a number of drawbacks that are partially addressed by the so-called Bayesian supernova classification techniques. However, Bayesian techniques are also problematic in that they typically require the supernova candidate to be one of a known set of supernova types. This presents a number of problems, the most obvious of which is that large supernova candidate samples are bound to contain objects that do not conform to any presently known model. We propose a new photometric classification scheme that uses a Bayes factor based on color in order to identify supernovae by type. This method does not require knowledge of the complete set of possible astronomical objects that could mimic a supernova signal. Further, as a Bayesian approach, it accounts for all systematic and statistical uncertainties of the measurements in a single step. To illustrate the technique, we apply it to a simulated dataset for a possible future large-scale space-based Joint Dark Energy Mission and demonstrate how it could be used to identify Type Ia supernovae. The method's utility in pre-selecting and ranking supernova candidates for possible spectroscopic follow-up, i.e., its usage as a supernova trigger, is briefly discussed.
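The Bayes-factor idea in the abstract can be illustrated with a toy sketch: compare the evidence for an observed set of colors under a Type Ia color template against an alternative template, assuming independent Gaussian measurement errors. The templates and numbers below are invented for illustration and do not reproduce the paper's actual models or priors.

```python
import numpy as np

def log_likelihood(colors, errors, template):
    """Gaussian log-likelihood of observed colors given a template's
    predicted colors and per-point measurement errors."""
    resid = (colors - template) / errors
    return -0.5 * np.sum(resid**2 + np.log(2 * np.pi * errors**2))

def bayes_factor(colors, errors, template_ia, template_alt):
    """Evidence ratio for the Type Ia template over an alternative;
    values >> 1 favour the Ia hypothesis."""
    return np.exp(log_likelihood(colors, errors, template_ia)
                  - log_likelihood(colors, errors, template_alt))

# Toy data: observed colors that happen to sit on the Ia template.
obs = np.array([0.5, 0.3, 0.1])   # e.g. g-r, r-i, i-z at some epoch
err = np.array([0.05, 0.05, 0.05])
ia = np.array([0.5, 0.3, 0.1])    # hypothetical Ia template colors
alt = np.array([0.9, 0.7, 0.4])   # hypothetical non-Ia template colors

print(bayes_factor(obs, err, ia, alt))  # >> 1, favouring Type Ia
```

A full treatment would marginalize over template parameters (redshift, epoch, extinction) rather than comparing single fixed templates.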
A metadata approach for clinical data management in translational genomics studies in breast cancer
Irene Papatheodorou, Charles Crichton, Lorna Morris, Peter Maccallum, METABRIC Group, Jim Davies, James D Brenton, Carlos Caldas
BMC Medical Genomics, 2009, DOI: 10.1186/1755-8794-2-66
Abstract: In this work we employ semantic web techniques developed within the CancerGrid project, in particular the use of metadata elements and logic-based inference, to annotate heterogeneous clinical information and to integrate and query it. We show how this integration can be achieved automatically, following the declaration of appropriate metadata elements for each clinical data set; we demonstrate the practicality of this approach through application to experimental results and clinical data from five hospitals in the UK and Canada, collected as part of the METABRIC project (Molecular Taxonomy of Breast Cancer International Consortium). We describe a metadata approach for managing similarities and differences in clinical datasets in a standardized way that uses Common Data Elements (CDEs), and we apply and evaluate the approach by integrating the five different clinical datasets of METABRIC. The METABRIC study is an example of the molecular profiling studies on cancer patients that aim to associate experimental results with clinical datasets in order to understand the clinical heterogeneity of the disease. The patient cohorts used are large, and the clinical information is consolidated from a number of hospital databases that use different data definitions and often hold incomplete datasets. Patient information is often scattered across different databases within a hospital, or even between hospitals, as patients are not necessarily treated by the same hospital throughout the course of their disease and/or relapse. Moreover, patient cohorts usually span a long period of time and, depending on when each hospital started to record patient data electronically, this can result in incomplete clinical datasets. In addition, standard treatment and diagnosis procedures have changed over the last three decades, resulting in different types of information being accumulated over time.
An example is the HER2 biomarker.
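A minimal sketch of the Common Data Element idea described above: each source declares how its local fields map onto shared CDEs, together with a value-level recoding, so that integration reduces to a lookup. The hospital names, field names and codings here are invented for illustration and are not taken from the METABRIC datasets.

```python
# Per-source declarations: local field -> (CDE name, value recoding).
# All identifiers below are hypothetical examples.
CDE_MAPPINGS = {
    "hospital_a": {
        "er_status": ("ER_STATUS", {"pos": "positive", "neg": "negative"}),
    },
    "hospital_b": {
        "oestrogen_receptor": ("ER_STATUS", {"1": "positive", "0": "negative"}),
    },
}

def to_common(record, source):
    """Translate one patient record from a source's local vocabulary
    into CDE-keyed, recoded form; unmapped fields are dropped."""
    out = {}
    for field, value in record.items():
        if field in CDE_MAPPINGS[source]:
            cde, recode = CDE_MAPPINGS[source][field]
            out[cde] = recode.get(str(value), value)
    return out

print(to_common({"er_status": "pos"}, "hospital_a"))
print(to_common({"oestrogen_receptor": 1}, "hospital_b"))
# both yield {'ER_STATUS': 'positive'}
```

Declaring the mapping as data rather than code is what lets new sources be added without touching the integration logic, which is the essence of the metadata-driven approach.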
A Heuristic Text Analytic Approach for Classifying Research Articles
Steven Walczak, Deborah L. Kellogg
Intelligent Information Management (IIM), 2015, DOI: 10.4236/iim.2015.71002
Abstract: Classification of research articles is fundamental to analyzing and understanding research literature. Underlying concepts from both text analytics and concept mining form the foundation for the development of a quantitative heuristic methodology, the Scale of Theoretical and Applied Research (STAR), for classifying research. STAR demonstrates how concept mining may be used to classify research with respect to its theoretical and applied emphases. This research reports on an evaluation of the STAR heuristic classifier in the Business Analytics domain, classifying 774 Business Analytics articles from 23 journals. The results indicate that STAR's evaluation of the overall article content of journals is consistent with the expert opinion of journal editors regarding the research-type disposition of the respective journals.
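The concept-mining idea behind a heuristic like STAR can be caricatured with a keyword-count score that places an article on a theoretical-to-applied scale. The keyword sets and the scale below are invented for illustration and are far simpler than the published methodology.

```python
# Hypothetical concept keyword sets; STAR's actual concept sets differ.
THEORETICAL = {"theorem", "model", "framework", "proof", "taxonomy"}
APPLIED = {"case study", "implementation", "deployment", "industry"}

def star_score(text):
    """Score an article on a [-1, 1] scale: -1 purely theoretical,
    +1 purely applied, 0 when no concept keywords occur."""
    t = text.lower()
    theo = sum(t.count(k) for k in THEORETICAL)
    appl = sum(t.count(k) for k in APPLIED)
    total = theo + appl
    return 0.0 if total == 0 else (appl - theo) / total

print(star_score("We prove a theorem and propose a model."))   # -1.0
print(star_score("An industry case study of a deployment."))   # 1.0
```

A production classifier would weight concepts, handle negation and stemming, and calibrate the scale against expert labels, as the article's evaluation against journal editors suggests.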
Quantitative Network Measures as Biomarkers for Classifying Prostate Cancer Disease States: A Systems Approach to Diagnostic Biomarkers
Matthias Dehmer, Laurin A. J. Mueller, Frank Emmert-Streib
PLOS ONE, 2013, DOI: 10.1371/journal.pone.0077602
Abstract: Identifying diagnostic biomarkers based on genomic features for an accurate disease classification is a problem of great importance for both basic medical research and clinical practice. In this paper, we introduce quantitative network measures as structural biomarkers and investigate their ability to classify disease states inferred from gene expression data from prostate cancer. We demonstrate the utility of our approach by using eigenvalue- and entropy-based graph invariants and compare the results with a conventional biomarker analysis of the underlying gene expression data.
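Eigenvalue- and entropy-based graph invariants of the kind the abstract mentions can be sketched as follows. The paper's specific measures are not reproduced here; graph energy and spectral entropy stand in as representative examples, computed on a toy network.

```python
import numpy as np

def spectral_features(adj):
    """Two eigenvalue-based invariants of an undirected graph, given
    its symmetric adjacency matrix: graph energy (sum of absolute
    eigenvalues) and the Shannon entropy of the normalized spectrum."""
    eig = np.linalg.eigvalsh(adj)
    energy = np.sum(np.abs(eig))
    p = np.abs(eig) / energy          # spectrum as a probability vector
    p = p[p > 0]                      # avoid log(0)
    entropy = -np.sum(p * np.log2(p))
    return energy, entropy

# Toy example: a triangle (3-cycle), eigenvalues 2, -1, -1.
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]], dtype=float)
energy, entropy = spectral_features(triangle)
print(energy, entropy)  # 4.0 and 1.5 bits
```

In a classification setting, such invariants computed on inferred gene-coexpression networks would serve as feature vectors for a standard classifier, which is the general scheme the abstract describes.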

Copyright © 2008-2017 Open Access Library. All rights reserved.