
U.S. Lattice Gauge Theory Computational Program
Overview
Lattice gauge theory calculations address fundamental problems in particle and nuclear physics through large-scale computation. Their goal is to understand the physical phenomena described by quantum chromodynamics (QCD) and to make precise calculations of the theory's predictions. QCD describes the strong interactions that bind protons and neutrons together to form the nuclei of atoms; its elementary entities are quarks and gluons. The LQCD research team, with members at institutions worldwide, conducts many large-scale calculations and terascale simulations within the framework of lattice gauge theory. Such simulations are necessary to solve the fundamental problems in high energy and nuclear physics that are at the heart of the Department of Energy's (DOE) large experimental efforts in these fields. Under the DOE Scientific Discovery through Advanced Computing (SciDAC) initiative, the QCD research community and computational scientists are constructing a distributed computing facility for lattice gauge theory and QCD calculations and simulations. Major hardware for this facility is located at Brookhaven National Laboratory, Fermi National Accelerator Laboratory, and Thomas Jefferson National Accelerator Facility. Under the SciDAC program, the group is also developing the software infrastructure needed to achieve very high efficiency on these computing platforms and on future systems.
The RENCI Contribution
To obtain high efficiency on terascale and petascale HPC systems, it is essential to understand the runtime characteristics of QCD applications through detailed performance analysis. RENCI is extending the existing performance infrastructure and developing new software tools for performance studies aimed at future computing environments. Using the improved tools, RENCI research scientists have collected communication, computation, and memory performance data for QCD codes. The researchers have also conducted performance analyses and comparisons on and across all QCD platforms. These in-depth performance assessments, together with performance tuning of QCD applications, will enable significantly higher performance on today's very large-scale systems.
Funding
U.S. Department of Energy under Award No. DE-FC02-04ER41205 and Award No. DE-FC02-04ER41340
Physics:
Richard Brower, Boston University
Michael Creutz, Brookhaven National Laboratory
Norman Christ, Columbia University
Paul Mackenzie, Fermi National Accelerator Laboratory
Steven Gottlieb, Indiana University
John Negele, Massachusetts Institute of Technology
David Richards, Thomas Jefferson National Accelerator Facility
Doug Toussaint, University of Arizona
Robert Sugar, University of California, Santa Barbara
Carleton DeTar, University of Utah
Stephen Sharpe, University of Washington
Computer Science:
Massimo DiPierro, DePaul University
Xian-He Sun, Illinois Institute of Technology
Theodore Bapty, Vanderbilt University
Publications
Y. Zhang, R. Fowler, K. Huck, A. Malony, A. Porterfield, D. Reed, S. Shende, V. Taylor, and X. Wu, "US QCD Computational Performance Studies with PERI," SciDAC 2007, Boston, MA, June 2007.
Daniel A. Reed, Charng-da Lu, and Celso L. Mendes, "Reliability Challenges in Large Systems," Future Generation Computer Systems, Spring 2005.
Charng-da Lu, Daniel A. Reed, "Assessing Fault Sensitivity in MPI Applications," Proceedings of SC2004 (Supercomputing 2004), Pittsburgh, PA, November 2004. SC2004 Best Technical Paper Award.
Celso Mendes, Daniel A. Reed, "Monitoring Large Systems via Statistical Sampling," Proceedings of the LACSI Symposium, Santa Fe, NM, October 2002.
Presentations
Ying Zhang, SvPablo: Performance Analysis on BlueGene/L, SIAM PP2006, San Francisco, CA, February 2006.
Daniel A. Reed, Computing - An Intellectual Lever for Multidisciplinary Discovery, Supercomputing 2004, Pittsburgh, PA, November 2004.
Ying Zhang, SvPablo: A Toolkit for Performance Analysis and Visualization, demonstration at Supercomputing 2004, Pittsburgh, PA, November 2004.
Celso Mendes, Daniel A. Reed, QCD Software and Performance Optimization, QCD All Hands meeting, April 2004.
Partners
Brookhaven National Laboratory
Thomas Jefferson National Accelerator Facility
Fermi National Accelerator Laboratory
University of California, Santa Barbara
University of Arizona
University of Utah
Boston University
Columbia University
DePaul University
Illinois Institute of Technology
Massachusetts Institute of Technology
RENCI