I/O-Aware Batch Scheduling for Petascale Computing Systems
Top 10% of 2015 papers
Abstract
In the Big Data era, the gap between storage performance and applications' I/O requirements is widening. I/O congestion caused by concurrent storage accesses from multiple applications is inevitable and severely degrades performance. Conventional approaches either optimize each application's access pattern individually or handle I/O requests at a low-level storage layer without any knowledge of the upper-level applications. In this paper, we present a novel I/O-aware batch scheduling framework to coordinate ongoing I/O requests on petascale computing systems. The motivation is that the batch scheduler has a holistic view of both the system state and jobs' activities and can control jobs' status on the fly during their execution. We treat a job's I/O requests as periodic subjobs within its lifecycle and transform the I/O congestion issue into a classical scheduling problem. We design two scheduling policies with different objectives, focusing on either user-oriented metrics or system performance. We conduct extensive trace-based simulations using real job traces and I/O traces from a production IBM Blue Gene/Q system. Experimental results demonstrate that our design can improve job performance by more than 30% while also improving system performance.
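To make the idea of treating I/O requests as schedulable subjobs concrete, the following is a minimal sketch and not the framework evaluated in the paper. It assumes, purely for illustration, that the storage system serves one I/O subjob at a time at its full bandwidth and that all requests are already pending. The names `IORequest`, `io_completion_times`, `fcfs`, and `shortest_first` are hypothetical, and the two orderings are generic textbook policies rather than the two policies the abstract refers to.

```python
from dataclasses import dataclass


@dataclass
class IORequest:
    """One I/O phase of a running job, viewed as a schedulable subjob (hypothetical model)."""
    job_id: str
    arrival: int       # order in which the job entered its I/O phase
    volume_gb: float   # amount of data to transfer, in GB


def io_completion_times(requests, storage_bw_gbps, key):
    """Serve pending I/O subjobs one at a time, ordered by `key`.

    Assumption: the granted subjob gets the full storage bandwidth while
    the others are held back, so congestion becomes a single-machine
    sequencing problem. Returns {job_id: completion time in seconds}.
    """
    clock = 0.0
    done = {}
    for req in sorted(requests, key=key):
        clock += req.volume_gb / storage_bw_gbps
        done[req.job_id] = clock
    return done


def fcfs(req):
    """First come, first served: order by arrival."""
    return req.arrival


def shortest_first(req):
    """Smallest transfer first: favours mean completion time."""
    return req.volume_gb


def mean(values):
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    pending = [
        IORequest("A", arrival=0, volume_gb=400.0),
        IORequest("B", arrival=1, volume_gb=100.0),
        IORequest("C", arrival=2, volume_gb=200.0),
    ]
    for name, policy in (("FCFS", fcfs), ("shortest-first", shortest_first)):
        done = io_completion_times(pending, storage_bw_gbps=100.0, key=policy)
        print(name, done, "mean:", round(mean(done.values()), 2))
```

In this toy setting both orderings finish the last transfer at the same time, but serving the smallest transfer first lowers the mean completion time. That illustrates how the choice of ordering can favour a user-oriented metric without changing the total work done by the storage system.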