Present and Future Leadership Computers at OLCF
Jack Wells, Director of Science
Oak Ridge Leadership Computing Facility, Oak Ridge National Laboratory
Join the conversation at #OpenPOWERSummit

Oak Ridge Leadership Computing Facility (OLCF)
Mission: Deploy and operate the computational resources required to tackle global challenges.
• Providing world-leading computational and data resources and specialized services for the most computationally intensive problems
• Providing a stable hardware/software path of increasing scale to maximize productive application development
• Providing the resources to investigate otherwise inaccessible systems at every scale: from galaxy formation to supernovae to earth systems to automobiles to nanomaterials
• With our partners, delivering transforming discoveries in materials, biology, climate, energy technologies, and basic science

Our science requires that we continue to advance OLCF's computational capability over the next decade on the roadmap to exascale. Since clock-rate scaling ended in 2003, HPC performance has been achieved through increased parallelism; Jaguar scaled to 300,000 cores. Titan and beyond deliver hierarchical parallelism with very powerful nodes: MPI plus thread-level parallelism through OpenACC or OpenMP, plus vectors (a minimal sketch of this model follows the science highlights at the end of this section).
• Jaguar (2010): 2.3 PF, multi-core CPU, 7 MW
• Titan (2012): 27 PF, hybrid GPU/CPU, 9 MW
• Summit (2017, CORAL system): 5-10x Titan, hybrid GPU/CPU, 10 MW
• OLCF5 (2022): 5-10x Summit, ~20 MW

Science Accomplishments: Highlights from Titan
• Cosmology (Salman Habib, Argonne National Laboratory): Habib and collaborators used the HACC code on Titan's CPU-GPU system to conduct today's largest cosmological structure simulation at the resolutions needed for modern-day galactic surveys. [K. Heitmann, 2014. arXiv:1411.3396]
• Combustion (Jacqueline Chen, Sandia National Laboratories): Chen and collaborators for the first time performed direct numerical simulation of a jet flame burning dimethyl ether (DME) at new turbulence scales over space and time. [A. Bhagatwala et al. 2014. Proc. Combust. Inst. 35]
• Superconducting Materials (Paul Kent, ORNL): Kent and collaborators performed the first ab initio simulation of a cuprate, and were the first team to validate quantum Monte Carlo simulations for high-temperature superconductors. [K. Foyevtsova et al. 2014. Phys. Rev. X 4]
• Molecular Science (Michael Klein, Temple University): Researchers at Procter & Gamble (P&G) and Temple University delivered a comprehensive picture, in full atomistic detail, of the molecular properties that drive skin barrier disruption. [C. M. MacDermaid et al. 2014. J. Chem. Phys. 141; M. Paloncyova et al. 2014. Langmuir 30]
• Fusion (C. S. Chang, PPPL): Chang and collaborators used the XGC1 code on Titan to obtain a fundamental understanding of the divertor heat-load width physics and its dependence on the plasma current in present-day tokamak devices. [C. S. Chang et al. 2014. Proceedings of the 25th Fusion Energy Conference, IAEA, October 13-18, 2014]
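The hierarchical parallelism named on the roadmap above (MPI across nodes, OpenACC or OpenMP threads within a node, vectors within each thread) can be illustrated with a minimal sketch. The code below is not from the OLCF slides; it is a generic, hedged example of the programming model, using a simple dot product as an assumed stand-in workload.

```c
/* Minimal sketch of hierarchical parallelism: MPI between nodes,
   OpenMP threads within a node, SIMD vectors within each thread. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0, nranks = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const int n = 1 << 20;                /* local problem size per MPI rank */
    double *x = malloc(n * sizeof *x);
    double *y = malloc(n * sizeof *y);

    double local_dot = 0.0;

    /* Thread-level parallelism within the node, plus vectorization
       within each thread, via a combined OpenMP construct. */
    #pragma omp parallel for simd reduction(+:local_dot)
    for (int i = 0; i < n; ++i) {
        x[i] = 1.0;
        y[i] = 2.0;
        local_dot += x[i] * y[i];
    }

    /* Node-level parallelism: combine per-rank partial sums with MPI. */
    double global_dot = 0.0;
    MPI_Reduce(&local_dot, &global_dot, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("dot product across %d ranks: %.1f\n", nranks, global_dot);

    free(x);
    free(y);
    MPI_Finalize();
    return 0;
}
```

Built with an MPI compiler wrapper and OpenMP enabled (for example `mpicc -fopenmp`), the same source scales from one node to many by adding ranks, which is the essence of the model described above.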
Industrial Partnerships: Accelerating Competitiveness through Computational Sciences
[Bar chart: number of OLCF industry projects operational per calendar year, 2005-2014, rising from a handful of projects in 2005 to more than 30 in 2014.]

Accelerating Future DOE Leadership Systems ("CORAL"): the "Summit" and "Sierra" Systems
• 5X-10X higher application performance
• IBM POWER CPUs, NVIDIA Tesla GPUs, Mellanox EDR 100 Gb/s InfiniBand
• Paving the road to exascale performance

Summit: 2017 OLCF Leadership System
• Hybrid CPU/GPU architecture
• Vendors: IBM (prime), NVIDIA, Mellanox Technologies
• At least 5X Titan's application performance
• Approximately 3,400 nodes, each with:
  – Multiple IBM POWER9 CPUs and multiple NVIDIA Tesla GPUs using the NVIDIA Volta architecture
  – CPUs and GPUs completely connected with high-speed NVLink
  – Large coherent memory: over 512 GB (HBM + DDR4), all directly addressable from the CPUs and GPUs (see the offload sketch at the end of this section)
  – An additional 800 GB of NVRAM, which can be configured as either a burst buffer or as extended memory
  – Over 40 TF peak performance
• Dual-rail Mellanox EDR InfiniBand full, non-blocking fat-tree interconnect
• IBM Elastic Storage (GPFS): 1 TB/s I/O and 120 PB disk capacity

How does Summit compare to Titan?
• Application performance: Summit 5-10x Titan; Titan baseline
• Number of nodes: Summit ~3,400; Titan 18,688
• Node performance: Summit > 40 TF; Titan 1.4 TF
• Memory per node: Summit > 512 GB (HBM + DDR4); Titan 38 GB (GDDR5 + DDR3)
• NVRAM per node: Summit 800 GB; Titan none
• Node interconnect: Summit NVLink (5-12x PCIe 3); Titan PCIe 2
• System interconnect (node injection bandwidth): Summit dual-rail EDR-IB (23 GB/s); Titan Gemini (6.4 GB/s)
• Interconnect topology: Summit non-blocking fat tree; Titan 3D torus
• Processors: Summit IBM POWER9 and NVIDIA Volta; Titan AMD Opteron and NVIDIA Kepler
• File system: Summit 120 PB at 1 TB/s (GPFS); Titan 32 PB at 1 TB/s (Lustre)
• Peak power consumption: Summit 10 MW; Titan 9 MW

CAAR projects on Titan
Performance ratios compare Titan's Cray XK7 nodes (NVIDIA Kepler GPU plus 16-core AMD Opteron CPU) with Cray XE6 nodes (two 16-core AMD Opteron CPUs).
• WL-LSMS: Illuminating the role of material disorder, statistics, and fluctuations in nanoscale materials and systems. 3.8x XK7/XE6
• LAMMPS: A molecular description of membrane fusion, one of the most common ways for molecules to enter or exit living cells. 7.4x XK7/XE6
• S3D: Understanding turbulent combustion through direct numerical simulation with complex chemistry. 2.2x XK7/XE6
• Denovo: Discrete ordinates radiation transport calculations that can be used in a variety of nuclear energy and technology applications. 3.8x XK7/XE6

The Complexities of Combustion
Jackie Chen, Sandia National Laboratories
• Use Titan's computing power to develop predictive models of turbulence in combustion.
• Employ the direct numerical simulation code S3D to simulate a jet flame burning dimethyl ether (DME) and compare with experiment.
Impact: Americans use two-thirds of their petroleum for transportation and one-third for heating buildings and generating electricity. If low-temperature compression-ignition concepts are widely adopted in next-generation automobiles, fuel efficiency could increase by as much as 25 to 50 percent.
[Figure: the logarithm of the scalar dissipation rate (that is, the local mixing rate), where white denotes high mixing rates and red lower mixing rates; blue indicates regions in which the flame is extinguished. Visualization credit: Hongfeng Yu.]
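As a companion to the Summit node description above (GPUs attached by NVLink, with a large coherent memory visible to both CPUs and GPUs), here is a hedged sketch of directive-based GPU offload of the kind used on OLCF's hybrid CPU-GPU nodes. It is not taken from the slides or from any of the named applications; the loop and array names are illustrative only. On Titan the data clauses stage the arrays in the GPU's GDDR5 memory explicitly; on a node whose CPU and GPU memories are coherently addressable, as described for Summit, a compiler or runtime may be able to satisfy them with less explicit copying.

```c
/* Hedged sketch of OpenACC offload on a hybrid CPU-GPU node.
   Requires an OpenACC-capable compiler; arrays and sizes are illustrative. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const int n = 1 << 24;
    double *x = malloc(n * sizeof *x);
    double *y = malloc(n * sizeof *y);
    for (int i = 0; i < n; ++i) { x[i] = 1.0; y[i] = 2.0; }

    /* Offload the loop to the GPU; copyin/copy describe which data the
       accelerator needs and what must be visible to the host afterwards. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = 3.0 * x[i] + y[i];

    printf("y[0] = %.1f\n", y[0]);   /* expected 5.0 */

    free(x);
    free(y);
    return 0;
}
```

The directive-based style keeps the loop body readable by domain scientists while leaving the mapping onto GPU threads, and the data staging, to the compiler and runtime.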
CAAR in preparation for Summit
• Application developer team involvement
  – Knowledge of the application
  – Work on an application that is itself a moving target, still under development
  – Optimizations included in the application release
• Demonstration of science capabilities at scale
  – Shake-down on the new system hardware and software
  – A large-scale science project is a strong incentive to participate
• Vendor technical support is crucial
  – Programming environment often not mature
  – Best source of information on new hardware features
• Access to multiple resources, including early hardware
• Portability is a critical concern (a minimal single-source sketch follows at the end of this section)
  – Application developers target a wide range of architectures
  – Maintaining multiple code versions is difficult
  – Porting to different architectures is time-consuming
  – Many principal investigators have allocations on multiple resources
  – Applications far outlive any computer system
• Experience benefits other developers and users
  – Coverage of scientific domains
  – Coverage of algorithmic methods and programming models
• Persistent culture of application readiness
  – Experience of science liaisons and catalysts for user programs
  – Synergy with libraries and tools projects

CAAR Timeline
1. November 2014: Call for CAAR applications
2. February 20, 2015: CAAR proposal deadline
3. March 2015: Selection of CAAR application teams
4. April 2015: CAAR application training workshop
5. April 2015: CAAR application teams start
6. June 2016: CAAR project review
7. September 2017: Call for Early Science projects
8. November 2017: Selection of Early Science projects
9. December 2017: Early Science projects start
10. June 2019: Early Science projects end

Want to join our team? ORNL is hiring. Contact us at https://jobs.ornl.gov or [email protected].

This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.
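The portability concern raised in the CAAR slide above (many target architectures, and the cost of maintaining multiple code versions) is often addressed by keeping a single annotated source. The sketch below is a hypothetical illustration, not OLCF code: the same daxpy loop carries an OpenACC directive, an OpenMP target-offload directive, or a host OpenMP directive, chosen at build time with preprocessor macros (USE_OPENACC and USE_OMP_TARGET are made-up names).

```c
/* Hypothetical single-source portability sketch: one loop, one file,
   with the architecture-specific directive selected at build time. */
#include <stdio.h>
#include <stdlib.h>

void daxpy(int n, double a, const double *restrict x, double *restrict y)
{
#if defined(USE_OPENACC)
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
#elif defined(USE_OMP_TARGET)
    #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
#else
    /* CPU fallback: host threads plus vectorization. */
    #pragma omp parallel for simd
#endif
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    double *x = malloc(n * sizeof *x);
    double *y = malloc(n * sizeof *y);
    for (int i = 0; i < n; ++i) { x[i] = 1.0; y[i] = 2.0; }

    daxpy(n, 3.0, x, y);
    printf("y[0] = %.1f\n", y[0]);   /* expected 5.0 */

    free(x);
    free(y);
    return 0;
}
```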