%0 Journal Article
%T An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture
%A Alexander Moskovsky
%A Michael D’Mello
%A Vladimir Mironov
%A Yuri Alexeev
%J The International Journal of High Performance Computing Applications
%@ 1741-2846
%D 2019
%R 10.1177/1094342017732628
%X The Hartree–Fock method in the General Atomic and Molecular Structure System (GAMESS) quantum chemistry package represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals and the building of the Fock matrix. These are the central components of the main self-consistent field (SCF) loop, the key hot spot in electronic structure codes. By threading the Message Passing Interface (MPI) ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4× to 6× for large systems) but also achieve a significant (>2×) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Scaling numbers are reported on up to 7680 cores on Intel Xeon Phi coprocessors.
%K Parallel Hartree–Fock–Roothaan
%K OpenMP
%K MPI
%K quantum chemistry
%K GAMESS
%K Intel Xeon Phi
%K irregular computation
%K integral computation
%U https://journals.sagepub.com/doi/full/10.1177/1094342017732628