%0 Journal Article
%T 基于曲面拟合重建单细胞染色体三维结构
Reconstruction of Single-Cell Chromosome
%A 刘立伟
%A 张琦
%A 白凤兰
%J Hans Journal of Computational Biology
%P 1-7
%@ 2164-5434
%D 2019
%I Hans Publishing
%R 10.12677/HJCB.2019.91001
%X
基因组学是当今生物信息学的核心领域之一,基因组学的两个主要研究方向是:以全基因组测序为目标的结构基因组学和以基因功能解读为目标的功能基因组学。在过去几十年,基因组学经历了长足的发展。而染色体三维结构的预测对基因组学的研究有重大意义。染色体三维结构的重建问题,就是从基因组的一维和二维数据出发预测其在三维空间中的构像,再利用数据分析等方法判断重建后染色体三维结构的可靠性。单细胞的Hi-C数据的接触矩阵是稀疏且含有噪声的,缺失很多接触位点的信息,我们把这样的矩阵称作低秩矩阵。我们首先要解决的问题就是对于低秩矩阵的处理,也叫低秩距离矩阵的完备化。本文介绍了包括最优化方法、最短距离法在内的几种常见的低秩矩阵完备化的方法,也详细介绍了本文采用的与前人不同的方法,最后通过MATLAB实现得到最终结论并与前人研究成果形成对比。
Genomics is one of the core areas of bioinformatics. It has two main research
directions: structural genomics, which targets whole-genome sequencing, and
functional genomics, which targets the interpretation of gene function. Over
the past few decades, genomics has developed considerably, and predicting the
three-dimensional structure of chromosomes is of great significance to the
field. Reconstructing the three-dimensional structure of a chromosome means
predicting its conformation in three-dimensional space from one-dimensional
and two-dimensional genomic data, and then using data analysis to assess the
reliability of the reconstructed structure. Based on single-cell chromosome
Hi-C technology and Hi-3C derived data, which capture the interaction data of
individual cells, this paper constructs the contact frequency matrix and then
converts it into a distance matrix, from which the three-dimensional structure
of the chromosome is obtained. The contact matrix of single-cell Hi-C data is
sparse and noisy, and lacks information for many contact sites. We refer to
such a matrix as a low-rank matrix. The first problem to solve is therefore
the processing of low-rank matrices, also called the completion of low-rank
distance matrices. This paper introduces several common low-rank matrix
completion methods, including the optimization method and the shortest
distance method, and describes in detail the method adopted here, which
differs from those of previous studies. Finally, results are obtained with a
MATLAB implementation and compared with previous research results.
%K Hi-C数据,低秩矩阵,低秩矩阵的完备化,最短距离法
%K Hi-C Data
%K Low-Rank Matrix
%K Completion of Low-Rank Matrices
%K Shortest Distance Method
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=29394