Arabidopsis chromosome 4 sequence

DOI: 10.1186/gb-2000-1-1-reports030

Abstract:

The paper summarizes years of work by hundreds (if not thousands) of people in dozens of labs spread over three continents. The key features of chromosome 4 are as follows. The long arm of chromosome 4 is 14.5 Mb and the short arm is 3.0 Mb (plus nearly 3.5 Mb of ribosomal DNA repeats). Nearly 50% of the sequence encodes protein, for a total of 3,744 predicted proteins. Genes average about 4.6 kb in length and contain an average of 5.2 exons. The actual or potential cellular function of approximately 60% of the genes can be predicted on the basis of similarity to other characterized proteins.

Only 33% of the predicted genes are represented among the 45,000 available Arabidopsis expressed sequence tags (ESTs). Of these, 6% of the genes match 75% of the ESTs. Note that it is not clear whether the authors are referring at this point only to the chromosome 4 sequence or to all available Arabidopsis sequence. In either case, it is clearly important to sequence normalized EST libraries in order to maximize the amount of non-redundant sequence gathered. Almost 8% of the predicted genes have no ESTs and no similarity to other proteins; these may represent spurious gene predictions or plant-specific genes expressed at low levels.

The authors give some statistics on various motifs and structural topologies found in the predicted proteins. They also attempt to classify the proteins into major functional categories (such as metabolism and transcription). The only major surprise is the large number of genes involved in disease and defense responses. This is largely due to several large clusters of leucine-rich repeat genes, including one family of 15 contiguous genes. A surprisingly large number of genes are arranged in tandem copies. Of the genes whose products have significant similarity to other proteins in Arabidopsis, 12% are arrayed in tandem clusters, ranging from pairs of genes to the 15 leucine-rich repeat genes. This hints at the underlying mechanism of how plants generate sequence diversi
