全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

A graph-search framework for associating gene identifiers with documents

DOI: 10.1186/1471-2105-7-440

Full-Text   Cite this paper   Add to My Lib

Abstract:

We show that named entity recognition (NER) systems with similar F-measure performance can have significantly different performance when used with a soft dictionary for geneId-ranking. The graph-based approach can outperform any of its component NER systems, even without learning, and learning can further improve the performance of the graph-based ranking approach.The utility of a named entity recognition (NER) system for geneId-finding may not be accurately predicted by its entity-level F1 performance, the most common performance measure. GeneId-ranking systems are best implemented by combining several NER systems. With appropriate combination methods, usefully accurate geneId-ranking systems can be constructed based on easily-available resources, without resorting to problem-specific, engineered components.Curators of biological databases, such as MGI [1] or Flybase [2], must read and analyze research papers in order to identify the experimental results that should be included in a database. As one example of this analysis process, Cohen and Hersh [3] describe the curation process for MGI as consisting of three stages. First, papers are selected according to whether or not they use mouse as a model organism. Second, the selected papers are triaged to determine if they contain curatable information. For MGI, papers must be curated when they contain evidence sufficient to assign a Gene Ontology term [4] to a specific gene. (These terms indicate some aspect of the molecular function, biological process, or cellular localization for a gene product.) Third, for articles that do contain such information, terms from Gene Ontology are assigned to specific genes, along with the appropriate evidence code.To accomplish the final stage of curation, a curator must find, for each article being curated, the unique database identifier of every gene that will be annotated. Because the criteria for annotation is different for every model-organism database, it is useful to consider

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133