Human Genomics 2011

In-silico human genomics with GeneCards

DOI: 10.1186/1479-7364-5-6-709

Gil Stelzer, Irina Dalah, Tsippi Stein, Yigeal Satanower, Naomi Rosen, Noam Nativ, Danit Oz-Levi, Tsviya Olender, Frida Belinky, Iris Bahir, Hagit Krug, Paul Perco, Bernd Mayer, Eugene Kolker, Marilyn Safran, Doron Lancet

Keywords: GeneCards, GeneDecks, Partner Hunter, Set Distiller, omics, genomics, human genes, database, synthetic lethality, genetic variations

Abstract:

From the very beginning, the core GeneCards features included two important components: the capability to view integrated details about a gene in 'card' format, and a full text-based search engine. GeneCards has evolved by constantly adding new data sources and data types (e.g., protein expression and gene networks), revamping the search engine to improve results and performance, and expanding the original gene-centric dogma to encompass sets of genes.

Currently, GeneCards automatically mines over 90 sources in an offline process and constructs a consolidated gene list. First, the complete current snapshot of the HUGO Gene Nomenclature Committee (HGNC)-approved symbols[1] is used as the core gene list. Next, human Entrez Gene[2] entries that are different from the HGNC genes are added. Finally, human Ensembl[3] records are matched against the emerging gene list via GeneLoc's exon-based unification algorithm;[4] those that are not found to be equivalent to others in the set are included as novel Ensembl-based GeneCards gene entries. These primary sources provide annotations for aliases, descriptions, previous symbols, gene category, location, summaries, paralogues and non-coding RNA (ncRNA) details. Once the gene list is in place with these significant annotations, over 90 data sources (including those noted above and others[4-9]) are mined for thousands of additional descriptors.

The data for each gene are collected into a text file, which is used to display the web-card. In addition to the legacy text file format, the complex data model of GeneCards version 3 is stored in relational databases[10]. One database ('by resource') stores the data largely in the originally mined architecture, and another database ('by function') supports the website and has over 130 tables and views, with an average volume of hundreds of thousands of records. The largest table has over 6.5 million rows. This compendium is modelled into 40 entities, with hundreds of hierarchical relationships.
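
The three-step construction of the consolidated gene list described above can be illustrated with a minimal Python sketch. It is not the GeneCards implementation: the `GeneRecord` type, the `build_gene_list` and `shares_exon` functions, and all identifiers in the demo data are hypothetical. In particular, `shares_exon` is only a simple stand-in (two records count as equivalent if they have an identical exon) for GeneLoc's exon-based unification algorithm, whose details are not given here, and Entrez Gene entries are compared with HGNC entries by symbol purely as a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    """Hypothetical minimal gene record used only for this illustration."""
    source: str                      # e.g. "HGNC", "Entrez", "Ensembl"
    identifier: str                  # source-specific identifier (made up below)
    symbol: str                      # gene symbol
    exons: frozenset = frozenset()   # set of (chromosome, start, end) tuples


def shares_exon(a: GeneRecord, b: GeneRecord) -> bool:
    # ASSUMPTION: a stand-in for GeneLoc's exon-based unification.
    # Two records are treated as equivalent if they share an identical exon.
    return bool(a.exons & b.exons)


def build_gene_list(hgnc, entrez, ensembl):
    """Consolidate records from three sources in the order the text describes."""
    # Step 1: HGNC-approved symbols form the core gene list.
    genes = list(hgnc)
    known_symbols = {g.symbol for g in genes}

    # Step 2: add Entrez Gene entries that differ from the HGNC genes
    # (ASSUMPTION: "differ" is approximated here by symbol comparison).
    for rec in entrez:
        if rec.symbol not in known_symbols:
            genes.append(rec)
            known_symbols.add(rec.symbol)

    # Step 3: add Ensembl records with no equivalent in the emerging list.
    for rec in ensembl:
        if not any(shares_exon(rec, g) for g in genes):
            genes.append(rec)
    return genes


# Made-up demo data; none of these identifiers refer to real genes.
hgnc = [GeneRecord("HGNC", "hgnc-demo-1", "GENEA", frozenset({("chr1", 100, 200)}))]
entrez = [
    GeneRecord("Entrez", "entrez-demo-1", "GENEA", frozenset({("chr1", 100, 200)})),
    GeneRecord("Entrez", "entrez-demo-2", "GENEB", frozenset({("chr2", 50, 80)})),
]
ensembl = [
    GeneRecord("Ensembl", "ensembl-demo-1", "GENEB-like", frozenset({("chr2", 50, 80)})),
    GeneRecord("Ensembl", "ensembl-demo-2", "NOVEL1", frozenset({("chr3", 10, 40)})),
]

consolidated = build_gene_list(hgnc, entrez, ensembl)
print([g.identifier for g in consolidated])
# prints ['hgnc-demo-1', 'entrez-demo-2', 'ensembl-demo-2']
```

In this toy run, the duplicate Entrez entry and the Ensembl record that overlaps an already listed gene are dropped, while the unmatched Ensembl record is kept as a novel entry, mirroring the order of precedence described in the text.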
