%0 Journal Article
%T Cholesky decomposition and parallel structure design based on matrix triangularization decomposition
%A 刘书勇
%A 林俊宇
%A 吴艳霞
%A 张博为
%J 清华大学学报(自然科学版)
%D 2016
%R 10.16511/j.cnki.qhdxxb.2016.21.060
%X Matrix computation is one of the core problems in high-performance computing, and matrix decomposition is an important way to improve the parallelism of matrix computations. Rapidly developing FPGAs provide a powerful environment for parallel computing structures. This study implements Cholesky decomposition using a hardware-adaptive parallel sub-matrix identity updating algorithm and designs the corresponding parallel structure on an FPGA. Experimental results show that, compared with a software implementation on a general-purpose processor, the FPGA parallel structure achieves a speedup of more than 10 times in core computational performance, because the algorithm has higher data parallelism and pipeline parallelism during matrix triangularization.
%K matrix triangularization decomposition
%K Cholesky decomposition
%K parallel structure
%K field programmable gate array
%U http://jst.tsinghuajournals.com/CN/Y2016/V56/I9/963