Hartree-Fock Calculations (HF)

Introduction

Ab initio chemistry calculations are key to a detailed understanding of bond strengths and reaction energies for chemical species. They also allow chemists to study reaction pathways that would be too hazardous or too expensive to explore experimentally. This version of the Hartree-Fock algorithm calculates the non-relativistic interactions among atomic nuclei, among electrons in the presence of other electrons, and between electrons and nuclei. The initial inputs are basis sets derived from the atoms and the relative geometry of the atomic centers. Atomic integrals are calculated over these basis functions and are used to approximate the molecular density. This density and the previously calculated integrals are then used to compute the interactions and to form a Fock matrix. A self-consistent field (SCF) method is iterated until the molecular density converges to within an acceptable threshold.
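The SCF cycle described above (build a Fock matrix from the current density, solve for the orbitals, rebuild the density, repeat until the density stops changing) can be illustrated with a minimal sketch. This is not the code studied here. It assumes a toy two-site model with an on-site repulsion term in place of real basis functions and two-electron integrals, and the names scf, site_energies, hopping, repulsion, and mixing are all hypothetical.

```python
import math


def lowest_eigenpair_2x2(a, b, d):
    """Return (eigenvalue, unit eigenvector) for the lowest state of [[a, b], [b, d]]."""
    energy = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    # Two algebraically equivalent eigenvector formulas; keep the better-conditioned one.
    candidates = [(b, energy - a), (energy - d, b)]
    x, y = max(candidates, key=lambda v: math.hypot(v[0], v[1]))
    norm = math.hypot(x, y)
    if norm == 0.0:  # fully degenerate diagonal matrix: any unit vector works
        return energy, (1.0, 0.0)
    return energy, (x / norm, y / norm)


def scf(site_energies=(0.0, 1.0), hopping=1.0, repulsion=2.0,
        tol=1e-10, max_iter=200, mixing=0.5):
    """Toy closed-shell SCF loop: two electrons in a two-site model (an assumption)."""
    density = [1.0, 1.0]  # initial guess: one electron per site
    for iteration in range(1, max_iter + 1):
        # "Fock matrix": one-electron part plus a mean-field repulsion that
        # depends on the current density.
        f11 = site_energies[0] + 0.5 * repulsion * density[0]
        f22 = site_energies[1] + 0.5 * repulsion * density[1]
        energy, (c1, c2) = lowest_eigenpair_2x2(f11, -hopping, f22)
        # Doubly occupy the lowest orbital to obtain the new density.
        new_density = [2.0 * c1 * c1, 2.0 * c2 * c2]
        change = max(abs(n - o) for n, o in zip(new_density, density))
        if change < tol:
            return {"converged": True, "iterations": iteration,
                    "density": new_density, "orbital_energy": energy}
        # Damped update: mix old and new densities to stabilize the iteration.
        density = [(1.0 - mixing) * o + mixing * n
                   for o, n in zip(density, new_density)]
    return {"converged": False, "iterations": max_iter,
            "density": density, "orbital_energy": energy}


if __name__ == "__main__":
    print(scf())
```

The damping step is a choice made for this sketch only. The convergence test plays the role of the "acceptable threshold" on the density mentioned above.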

The Hartree-Fock implementation we studied consists of three distinct programs totaling roughly 25K lines of Fortran. The three programs operate as a logical pipeline, with the second and third each accepting file input from the previous one. The first program, psetup, reads the initial input, performs any transformations needed by the later computational phases, and writes its result to disk. The next program, pargos, calculates the one- and two-electron integrals and writes them to disk. The final program, pscf, reads the integral files multiple times (they are too large to retain in memory) and solves the SCF equations. In subsequent sections, we refer to these three programs as initialization, integral calculation, and self-consistent field calculation. For an MPEG video of the self-consistent field calculations, click here.

Platforms

The uninstrumented application is portable across a number of platforms, including Sun, RS/6000, and the Intel Paragon. Portability issues are handled by the cmdc code. The resulting utility, cmdc.x, also tailors the Hartree-Fock code to a number of possible options, including sequential execution, parallel execution using one output file for each processor, and parallel execution using a single data file. Flags in the Makefile for each phase control which version is generated.

Code Access

The Hartree-Fock code is available for distribution.

Source Code

Three separate programs totaling approximately 25,000 lines of highly portable FORTRAN code perform the entire task. The first, psetup, parses the initial input, performs any transformations needed by the later computational phases, and writes the result to a single file on disk. All processing and I/O in this phase are performed by node 0. The next program, pargos, calculates the one- and two-electron integrals and writes them to disk. Again, node 0 is responsible for reading the initial input file, and the data read from this file are broadcast to the other participating nodes. In the version of the program used for this investigation, each node creates a separate disk file of integrals and writes its data there. Other versions use only a single file. The final program, pscf, reads the integrals and solves the SCF equations. All file access is sequential.
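The access pattern described above (one integral file per node, written once and then read sequentially from start to finish on every pass) can be sketched as follows. This is only an illustration under stated assumptions: the record layout (four indices plus one value), the file naming scheme, and the helper names write_node_file, read_sequential, and run_passes are hypothetical and are not taken from the application.

```python
import os
import struct
import tempfile

# Hypothetical record layout: four basis-function indices and one integral value.
RECORD = struct.Struct("<4id")


def write_node_file(directory, node, records):
    """Write one node's records to its own file (the naming scheme is an assumption)."""
    path = os.path.join(directory, f"integrals.{node:03d}")
    with open(path, "wb") as handle:
        for record in records:
            handle.write(RECORD.pack(*record))
    return path


def read_sequential(path, chunk_records=1024):
    """Yield records front to back in fixed-size chunks, never seeking."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(RECORD.size * chunk_records)
            if not block:
                break
            for offset in range(0, len(block), RECORD.size):
                yield RECORD.unpack_from(block, offset)


def run_passes(path, passes):
    """Re-read the whole file once per pass, as when the data do not fit in memory."""
    totals = []
    for _ in range(passes):
        total = 0.0
        for _i, _j, _k, _l, value in read_sequential(path):
            total += value
        totals.append(total)
    return totals


def demo(nodes=4, records_per_node=5, passes=3):
    """Simulate each node writing its own file and then reading it several times."""
    results = {}
    with tempfile.TemporaryDirectory() as directory:
        for node in range(nodes):
            records = [(node, r, 0, 0, float(r)) for r in range(records_per_node)]
            path = write_node_file(directory, node, records)
            results[node] = run_passes(path, passes)
    return results


if __name__ == "__main__":
    print(demo())
```

Because every pass re-opens the file and reads it in order, the total read volume grows with the number of passes while the write volume stays fixed.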

Libraries

An application-specific library, putility.a, is included with the code. This library handles many of the I/O operations, so a number of its subprograms were instrumented to capture I/O traces. The TCGMSG toolkit is used for message passing, and its source is also included in the distribution. Portions of TCGMSG were also instrumented for Pablo I/O tracing.
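The general idea of instrumenting library I/O routines to capture a trace can be sketched with a thin wrapper that records each operation, its size, and its duration. This is not the Pablo trace capture library or its API. The class TracedFile, the function demo_trace, and the event fields are hypothetical names chosen for illustration.

```python
import io
import time


class TracedFile:
    """Wrap a file-like object and append one event per I/O call to a trace list."""

    def __init__(self, raw, trace):
        self._raw = raw
        self._trace = trace

    def _record(self, op, nbytes, start):
        self._trace.append({"op": op, "bytes": nbytes,
                            "seconds": time.perf_counter() - start})

    def write(self, data):
        start = time.perf_counter()
        written = self._raw.write(data)
        self._record("write", written, start)
        return written

    def read(self, size=-1):
        start = time.perf_counter()
        data = self._raw.read(size)
        self._record("read", len(data), start)  # actual bytes returned, not requested
        return data

    def seek(self, offset, whence=0):
        start = time.perf_counter()
        position = self._raw.seek(offset, whence)
        self._record("seek", 0, start)
        return position


def demo_trace():
    """Run a few operations on an in-memory file and return (operation, bytes) pairs."""
    trace = []
    handle = TracedFile(io.BytesIO(), trace)
    handle.write(b"x" * 24)
    handle.seek(0)
    handle.read(16)
    handle.read(16)  # only 8 bytes remain, so the trace records a short read
    return [(event["op"], event["bytes"]) for event in trace]


if __name__ == "__main__":
    print(demo_trace())
```

Placing the wrapper inside a shared library routine, rather than at every call site, means the application code itself does not have to change to be traced.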

Acknowledgements

The I/O characterization presented here is based on the Messkit program, which performs Hartree-Fock calculations. The application was provided by Rick Kendall from the Molecular Science Software (MSS) Group at the Molecular Science Research Center (MSRC) of the Pacific Northwest Laboratory (PNL). Funding for their exploration has been provided by the U.S. Department of Energy and the High Performance Computing and Communication Program. Trace data were obtained using the Pablo trace capture library.

This research is supported in part by the National Science Foundation.
