Improving draft assemblies by iterative mapping and assembly of short reads to eliminate gaps

DOI: 10.1186/gb-2010-11-4-r41

Abstract:

The complete genome sequence of an organism provides an invaluable resource to the wider research community and is the foundation for comparative and evolutionary genomics studies. With the recent advances in second-generation sequencing technologies (454 pyrosequencing, Illumina, SOLiD, and Helicos), genome projects have seen an explosion of sequence data production at a fraction of the per-base cost. However, this cost reduction is offset by typically shorter sequence lengths and by sequencing error profiles that differ from those of conventional capillary reads [1]. Each of these differences poses new computational challenges for assembly and for subsequent downstream analyses.

The performance of de novo assembly software depends heavily on the sequence length, the depth of sequence coverage (genome equivalents, or fold coverage), the fragment size of the sequenced templates, and the types of sequence errors specific to each technology. The situation is complicated by the range of assembly software that exists for use with second-generation technologies. For example, Newbler, produced by Roche, specifically addresses 454 read-specific error profiles. A range of assemblers is available for de novo assembly of Illumina reads, including Velvet [2], ABySS [3], SOAPdenovo [4] and ALLPATHS2 [5], each of which is designed with a different aim and functionality. As second-generation sequencing technologies are improving at different paces, both in error rate and sequence length, assembling a mixture of sequences from different technologies remains a viable strategy for sequencing genomes de novo.

Currently, few assemblers (for example, Newbler and Velvet) are able to incorporate mixtures of read types, and their accuracy remains to be assessed. An alternative approach is to combine sequence information from different technologies by using bioinformatics pipelines to assemble contigs from each sequencing technology separately, before treating them as