%0 Journal Article
%T Optimization of LAPACK Based on Loongson 3A
%A ZHANG Bin
%A GU Nai-Jie
%A HE Song-Song
%A LIU Bin-Bin
%A 张斌
%A 顾乃杰
%A 何颂颂
%A 刘斌斌
%J Computer Systems & Applications (计算机系统应用)
%D 2012
%X Based on the characteristics of the Loongson 3A architecture, this paper presents three ways to improve the performance of LAPACK: optimizing the underlying BLAS library, selecting the best block size for the blocked algorithms in LAPACK, and optimizing specific LAPACK functions. Experimental results obtained by running the LAPACK Timing Programs show that the performance of 240 LAPACK functions, which account for 81% of those in the LAPACK Timing Programs, increases by more than 30%.
%K LAPACK
%K BLAS
%K Loongson 3A
%K optimization
%K paired single
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=D4F6864C950C88FFCE5B6C948A639E39&aid=2F008F04A30914BAE8F50AB522A65B66&yid=99E9153A83D4CB11&vid=659D3B06EBF534A7&iid=708DD6B15D2464E8&sid=E84BBBDDD74F497C&eid=5D71B28100102720&journal_id=1003-3254&journal_name=计算机系统应用&referenced_num=0&reference_num=8