Introducing kernel-level page reuse for high performance computing
Citations over time: top 20% of 2013 papers
Abstract
Due to the evolution of computer architectures, more and more HPC applications have to rely on thread-based parallelism and keep memory consumption under control. These changes demand closer attention to the whole memory management chain, which is particularly stressed in multi-threaded contexts. Several memory allocators provide better scalability on the user-space side, but with the steadily increasing number of cores, the impact of the operating system can no longer be neglected. We measured that the OS memory subsystem accounts for up to one third of the total execution time of a real application on 128 cores. On modern architectures, we also measured that up to 40% of the page fault time is spent in page zeroing. In this paper, we detail a proposal to improve paging performance by removing the need for this unproductive page zeroing through an extension of the mmap semantics. To this end, we added a per-process, kernel-level memory page pool that locally reuses free pages without resetting their content. Our experiments show significant performance improvements, especially for huge pages.
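The reuse idea can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not the kernel implementation described in the paper: the class name `PagePool`, the `allow_reuse` parameter, and the counters are hypothetical names introduced for illustration, and the abstract does not say how the mmap extension is actually exposed. The default path takes a fresh anonymous mapping, which the OS hands out zero-filled; the opt-in path returns a page previously freed by the same process with its old content left in place.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE


class PagePool:
    """Toy model of a per-process pool of freed pages (hypothetical names)."""

    def __init__(self):
        self._free = []   # pages released by this process, content untouched
        self.zeroed = 0   # pages obtained zero-filled from the OS
        self.reused = 0   # pages handed back out without a content reset

    def alloc(self, allow_reuse=False):
        # Opt-in path: reuse a locally freed page and skip zeroing.
        if allow_reuse and self._free:
            self.reused += 1
            return self._free.pop()
        # Default path: a fresh anonymous mapping, zero-filled by the OS.
        self.zeroed += 1
        return mmap.mmap(-1, PAGE_SIZE)

    def free(self, page):
        # Keep the page in the process-local pool instead of releasing it.
        self._free.append(page)
```

In this model a page that is written, freed, and then reallocated with `allow_reuse=True` still holds the bytes written earlier. That is acceptable only because the page never leaves the process, which is presumably why the pool is kept per process; a request made without the opt-in always receives zeroed memory.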