The Center for Exascale Radiation Transport (CERT) was created at Texas A&M University through a research grant from the National Nuclear Security Administration (NNSA) under the Predictive Science Academic Alliance Program (PSAAP-II).
CERT is focused on the development of computational techniques for efficiently simulating thermal radiation transport (propagation) using extreme-scale or exascale computers together with the development of predictive science techniques to quantify uncertainty in simulated results. Radiation propagation plays a major role in high-energy density laboratory physics (HEDLP) experiments of the type carried out at the NNSA's National Ignition Facility at Lawrence Livermore National Laboratory as well as several other NNSA facilities. Exascale computers are planned for the future and will consist of many millions of processors and be capable of executing on the order of 10^18 floating point operations per second. The fastest computers currently in existence execute roughly 10^16 floating point operations per second and use enormous amounts of power.
In order to achieve affordable operating costs, exascale computers must consume far less energy per processor than current computers. Computing on exascale machines will be very different from computing on existing machines because of this low-power requirement. For instance, data movement will be far more expensive in terms of energy consumption than floating-point operations, and erroneous computations will routinely occur during the course of a simulation. Thus the entire concept of the "cost" of computational algorithms will change and algorithms must have the capability to detect erroneous computation and either correct it or tolerate it in some quantifiable manner.
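The shift in what "cost" means can be illustrated with a toy energy model. The sketch below is only an assumption-laden illustration, not a CERT performance model: the class `EnergyCosts`, the function names, and the numerical values (arbitrary units in which moving one byte costs ten times as much as one floating-point operation) are all hypothetical placeholders rather than measured figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnergyCosts:
    """Hypothetical per-operation energy costs, in arbitrary units."""
    per_flop: float
    per_byte_moved: float


def kernel_energy(flops: float, bytes_moved: float, costs: EnergyCosts) -> float:
    """Total energy of a kernel under a simple linear cost model."""
    return flops * costs.per_flop + bytes_moved * costs.per_byte_moved


def data_movement_fraction(flops: float, bytes_moved: float, costs: EnergyCosts) -> float:
    """Fraction of the kernel's energy that is spent moving data."""
    total = kernel_energy(flops, bytes_moved, costs)
    return bytes_moved * costs.per_byte_moved / total


# Placeholder assumption: a byte moved costs 10x a flop (not a measurement).
costs = EnergyCosts(per_flop=1.0, per_byte_moved=10.0)

# Two made-up algorithms: B does twice the arithmetic but moves 10x less data.
energy_a = kernel_energy(flops=2e9, bytes_moved=1e9, costs=costs)
energy_b = kernel_energy(flops=4e9, bytes_moved=1e8, costs=costs)

print(f"A: {energy_a:.3g} units, B: {energy_b:.3g} units")
```

Under these assumed costs, the algorithm that performs more floating-point operations is nevertheless the cheaper one in energy, which is the sense in which a flop count alone stops being a useful measure of cost.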
Thermal radiation transport is generally far more computationally expensive than the other physics components in HEDLP simulations, chiefly because the transport equation is seven-dimensional. Thus it is critical to develop efficient algorithms for radiation transport. CERT will develop exascale computer science algorithms and parallel performance models, exascale adaptive transport algorithms for both improved accuracy and numerical error estimation, multiscale physics models relating to transport in embedded voids and small cracks, and exascale multilevel preconditioning techniques. Our computer science research will include the development of methods for fault tolerance and the development of performance models that include the impact of iterative methods on parallel efficiency. The latter will enable us to choose an optimal solution technique based upon characteristics of the problem. Below is a graph comparing the weak scaling performance of our latest transport solution algorithm with the predictions of the associated performance model. It can be seen that our algorithm scales with roughly 60% efficiency from 1 to 384,000 cores of the Sequoia machine at LLNL, and that the performance model agrees very well with the actual performance.
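The efficiency quoted above follows the usual weak-scaling convention: the work per core is held fixed as cores are added, and efficiency is the ratio of the run time on the base core count to the run time on the larger count. A minimal sketch of that calculation, together with a relative-error comparison against a performance model, is given here. The function names and all timings are hypothetical values chosen for illustration; they are not CERT's measured Sequoia data.

```python
def weak_scaling_efficiency(base_time: float, scaled_time: float) -> float:
    """Weak-scaling efficiency: T(base cores) / T(P cores) at fixed work per core."""
    if base_time <= 0 or scaled_time <= 0:
        raise ValueError("run times must be positive")
    return base_time / scaled_time


def model_relative_error(measured: dict, predicted: dict) -> dict:
    """Relative error of model-predicted run time at each measured core count."""
    return {
        cores: (predicted[cores] - measured[cores]) / measured[cores]
        for cores in measured
    }


# Hypothetical run times in seconds, keyed by core count (illustrative only).
measured = {1: 10.0, 384_000: 16.0}
predicted = {1: 10.0, 384_000: 16.8}

efficiency = weak_scaling_efficiency(measured[1], measured[384_000])
errors = model_relative_error(measured, predicted)

print(f"weak-scaling efficiency: {efficiency:.1%}")
print(f"model error at 384,000 cores: {errors[384_000]:+.1%}")
```

With these made-up timings the efficiency is 62.5%, in the same range as the roughly 60% quoted above, and the model overpredicts the run time at the largest core count by about 5%.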
CERT will also perform transport experiments with quantified uncertainties for the purposes of validating multiscale models, yielding experimental estimates of our numerical errors, and providing high-fidelity benchmark results to the thermal radiation transport community. Experiments in the HEDLP regime require multiphysics modeling, which makes it difficult to discern the origins of disagreements between simulation and experiment. CERT will avoid the need for HEDLP experiments by performing surrogate neutron transport experiments, exploiting the strong analogy between thermal radiation transport and neutron transport. The single-physics nature of these experiments enables the use of hierarchical uncertainty quantification techniques to cleanly differentiate sources of error and uncertainty in a way that is not possible with HEDLP experiments.
The CERT team consists of researchers from Texas A&M as well as the University of Colorado (UC) and Simon Fraser University (SFU). Texas A&M provides expertise in radiation transport theory and discretization methods; massively parallel transport solution algorithms; computer science; verification, validation, and uncertainty quantification (VVUQ); and neutron experimentation. UC provides expertise in multigrid methods for both diffusion and transport. Diffusion matters alongside transport because diffusion equations are used to precondition the transport equation in highly diffusive problems.