
-  2018 

An integrated detection method for long genomic insertion variants based on sequence assembly

DOI: 10.13543/j.bhxbzr.2018.06.014

Keywords: high-throughput sequencing, long insertions, sequence assembly, integrated detection


Abstract:

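To address the poor performance of current sequencing-based structural variant detection methods, a sequence-assembly-based method for detecting long insertion variants (ISALins) is proposed. The initial detection results of three different detection tools are first merged. High-quality soft-clipped reads and reads that failed to align, which are the reads most likely to contain insertion information, are then analyzed and extracted near the initially predicted suspect breakpoints. Finally, these candidate reads are assembled with a De Bruijn graph based method to complete the detection of long insertion variants. Experimental results on simulated and real data show that, compared with directly merging the detection results of multiple single tools, the proposed method substantially improves the detection precision for long insertion variants while maintaining detection sensitivity.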
Abstract: With the development and widespread application of high-throughput sequencing technology, sequencing-based methods for detecting structural variants (SVs) have emerged. However, because high-throughput sequencing reads are short compared with reads from earlier sequencing technologies, long insertions are difficult to detect. Assembly-based approaches can in principle recover long insertions, but assembly is computationally expensive and complex, which leads to poor assembly results and, in turn, poor detection of long insertions. To address this, this work proposes ISALins, an integrated approach based on sequence assembly. First, the initial results of three different detection tools are merged. Next, high-quality soft-clipped reads and unmapped reads, which are the reads most likely to carry information about the insertion, are analyzed and extracted around the initially predicted suspect SV breakpoints. Finally, these reads are assembled using an assembly tool based on De Bruijn graphs. Experiments on both simulated and real data show that the method outperforms each single tool in both precision and sensitivity of detection. Compared with directly combining the call results of multiple tools, ISALins significantly improves detection accuracy.
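
The pipeline described above (merge calls from several tools, collect candidate reads near the suspect breakpoints, assemble them on a De Bruijn graph) can be illustrated with a minimal Python sketch. This is not the ISALins implementation: the abstract names neither the three detection tools nor the assembler, so the tool names, the 50 bp merge window, the value of k, and every function name below are hypothetical. Read extraction from an alignment file is omitted, and the reads are given directly as strings.

```python
from collections import defaultdict


def merge_breakpoints(calls, window=50):
    """Cluster breakpoint calls that lie within `window` bp of each other.

    `calls` holds (chrom, pos, tool_name) tuples from the individual tools.
    Returns one (chrom, mean_pos, tool_names) tuple per cluster.
    The window size is an assumption, not a value taken from the paper.
    """
    merged, cluster = [], []
    for chrom, pos, tool in sorted(calls):
        # Start a new cluster on a chromosome change or a gap larger than window.
        if cluster and (chrom != cluster[-1][0] or pos - cluster[-1][1] > window):
            merged.append(_summarize(cluster))
            cluster = []
        cluster.append((chrom, pos, tool))
    if cluster:
        merged.append(_summarize(cluster))
    return merged


def _summarize(cluster):
    chrom = cluster[0][0]
    mean_pos = round(sum(pos for _, pos, _ in cluster) / len(cluster))
    tools = sorted({tool for _, _, tool in cluster})
    return chrom, mean_pos, tools


def build_de_bruijn(reads, k):
    """Build a De Bruijn graph: nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def assemble_unbranched(graph):
    """Spell the contig along the unbranched path from the unique start node.

    Returns None when there is no unique start node; a real assembler would
    also handle branches, cycles, and sequencing errors.
    """
    indegree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1
    starts = [node for node in graph if indegree[node] == 0]
    if len(starts) != 1:
        return None
    node = starts[0]
    contig, seen = node, {node}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop rather than loop forever on a cycle
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


if __name__ == "__main__":
    # Hypothetical calls from three unnamed tools.
    calls = [("chr1", 1000, "toolA"), ("chr1", 1020, "toolB"),
             ("chr1", 5000, "toolC"), ("chr2", 1010, "toolA")]
    print(merge_breakpoints(calls))

    # Toy overlapping reads standing in for soft-clipped and unmapped reads.
    reads = ["ACGTTG", "GTTGCA", "TGCAAT"]
    print(assemble_unbranched(build_de_bruijn(reads, k=4)))
```

Restricting assembly to reads gathered around merged breakpoints keeps each graph small, which is one plausible reading of how a local approach avoids the cost of whole-genome assembly mentioned in the abstract.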
