End-to-end performance modeling of distributed GPU applications
Abstract
With the growing number of GPU-based supercomputing platforms and GPU-enabled applications, the ability to accurately model the performance of such applications is becoming increasingly important. Most current performance models for GPU-enabled applications are limited to single-node performance. In this work, we propose a methodology for end-to-end performance modeling of distributed GPU applications. Our work strives to create performance models that are both accurate and easily applicable to any distributed GPU application. We combine trace-driven simulation of MPI communication using the TraceR-CODES framework with a profiling-based roofline model for GPU kernels. We make substantial modifications to these models to capture the complex effects of both on-node and off-node networks in today's multi-GPU supercomputers. We validate our model against empirical data from GPU platforms and also vary tunable parameters of our model to observe how they might affect application performance.
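To make the roofline idea concrete, the sketch below estimates a kernel's run time as the larger of its compute time and its memory-traffic time, then adds a simple latency-plus-bandwidth cost for each message. This is an illustration only, not the method of the paper: the abstract describes trace-driven simulation with TraceR-CODES for communication, whereas this sketch substitutes a closed-form per-message cost and assumes no overlap between computation and communication. All names (`Gpu`, `Kernel`, `roofline_time`, `message_time`, `step_time`) and all hardware numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    peak_flops: float      # peak compute rate in FLOP/s (hypothetical value)
    mem_bandwidth: float   # peak memory bandwidth in bytes/s (hypothetical value)


@dataclass
class Kernel:
    flops: float           # floating-point operations, e.g. taken from a profiler
    bytes_moved: float     # bytes read and written to device memory


def roofline_time(kernel: Kernel, gpu: Gpu) -> float:
    """Roofline estimate: the kernel is bound by compute or by memory traffic."""
    compute_time = kernel.flops / gpu.peak_flops
    memory_time = kernel.bytes_moved / gpu.mem_bandwidth
    return max(compute_time, memory_time)


def message_time(size_bytes: float, latency: float, bandwidth: float) -> float:
    """Latency-plus-bandwidth cost of one message (assumed stand-in for simulation)."""
    return latency + size_bytes / bandwidth


def step_time(kernels, messages, gpu, latency, bandwidth) -> float:
    """Time for one step, assuming compute and communication do not overlap."""
    compute = sum(roofline_time(k, gpu) for k in kernels)
    comm = sum(message_time(m, latency, bandwidth) for m in messages)
    return compute + comm


if __name__ == "__main__":
    gpu = Gpu(peak_flops=1e12, mem_bandwidth=1e11)
    kernels = [Kernel(flops=2e9, bytes_moved=1e9),    # memory-bound in this setup
               Kernel(flops=4e10, bytes_moved=1e9)]   # compute-bound in this setup
    messages = [1e6, 1e6]                             # message sizes in bytes
    print(step_time(kernels, messages, gpu, latency=1e-6, bandwidth=1e10))
```

Changing `latency` or `bandwidth` in such a sketch mimics, in a very coarse way, varying tunable network parameters to see how the predicted time responds.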