%0 Journal Article %T Reconstruction of the Three-Dimensional Structure of Single-Cell Chromosomes Based on Surface Fitting %A 刘立伟 %A 张琦 %A 白凤兰 %J Hans Journal of Computational Biology %P 1-7 %@ 2164-5434 %D 2019 %I Hans Publishing %R 10.12677/HJCB.2019.91001 %X
Genomics is one of the core areas of bioinformatics. It has two main research directions: structural genomics, which targets whole-genome sequencing, and functional genomics, which targets the interpretation of gene function. Genomics has developed considerably over the past few decades, and predicting the three-dimensional structure of chromosomes is of great significance to the field. Reconstructing the three-dimensional structure of a chromosome means predicting its conformation in three-dimensional space from one-dimensional and two-dimensional genomic data, and then using data analysis to assess the reliability of the reconstructed structure. This paper uses single-cell chromosome Hi-C technology and Hi-3C derived data to capture the interaction data of individual cells, writes the contact frequency matrix, and then converts the contact frequency matrix into a distance matrix from which the three-dimensional structure of the chromosome is obtained. The contact matrix of single-cell Hi-C data is sparse and noisy, and information on many contact sites is missing. We refer to such a matrix as a low-rank matrix. The first problem to solve is therefore the processing of low-rank matrices, also called the completion of low-rank distance matrices. This paper introduces several common low-rank matrix completion methods, including the optimization method and the shortest distance method, and describes in detail the method adopted here, which differs from those of previous work. Finally, the method is implemented in MATLAB, and the results are compared with those of previous studies.
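The pipeline summarized in the abstract (contact frequency matrix, then distance matrix, then completion by shortest distances) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation. The function names, the toy matrix, and the conversion rule d = 1 / f**alpha are assumptions made only for illustration, and the completion step is the standard Floyd-Warshall shortest-path algorithm, used here as one plausible reading of the "shortest distance method".

```python
import math


def contacts_to_distances(freq, alpha=1.0):
    """Map contact counts to distances with d = 1 / f**alpha (assumed rule).

    Pairs with zero contacts are treated as unobserved and set to infinity.
    """
    n = len(freq)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j in range(n):
            if i != j and freq[i][j] > 0:
                dist[i][j] = 1.0 / freq[i][j] ** alpha
    return dist


def complete_by_shortest_paths(dist):
    """Replace every entry with its shortest-path length (Floyd-Warshall).

    Unobserved pairs become finite whenever a chain of observed contacts
    connects them; an observed entry shrinks only if a detour is shorter.
    """
    n = len(dist)
    full = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = full[i][k] + full[k][j]
                if via_k < full[i][j]:
                    full[i][j] = via_k
    return full


# Hypothetical 4-bin contact matrix: only neighbouring bins are in contact.
freq = [
    [0, 2, 0, 0],
    [2, 0, 1, 0],
    [0, 1, 0, 4],
    [0, 0, 4, 0],
]
completed = complete_by_shortest_paths(contacts_to_distances(freq))
print(completed[0][3])  # distance between bins 0 and 3: 0.5 + 1.0 + 0.25 = 1.75
```

In this toy case the unobserved pair (0, 3) receives the sum of the three observed distances along the chain 0-1-2-3. Real single-cell Hi-C matrices are much larger and noisier, so the conversion exponent and the treatment of disconnected bins would need to follow the paper itself.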
%K Hi-C Data %K Low-Rank Matrix %K Completion of Low-Rank Matrices %K Shortest Distance Method %U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=29394