Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations
Proceedings of the IEEE, 2018, Vol. 106(11), pp. 1902–1920
Prashant Singh Rawat, Miheer Vaidya, Aravind Sukumaran-Rajam, M. Ravishankar, Vinod Grover, Atanas Rountev, Louis-Noël Pouchet, P. Sadayappan
Abstract
Stencil computations arise in a number of computational domains. They exhibit significant data parallelism and are thus well suited for execution on graphical processing units (GPUs), but can be memory-bandwidth limited unless temporal locality is utilized via tiling. This paper describes how effective tiled code can be generated for GPUs from a domain-specific language (DSL) for stencils. Experimental results demonstrate the benefits of such a domain-specific optimization approach over state-of-the-art general-purpose compiler optimizations.
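To make the abstract's point about temporal locality concrete, the following is a minimal sketch of overlapped time tiling for a 1-D 3-point averaging stencil. It is not the paper's DSL or its generated GPU code. The function names, the choice of stencil, and the fixed-boundary handling are all assumptions made for illustration, and plain Python is used only for readability. Each tile reads its cells plus a halo of `steps` cells per side once, then advances several time steps on that local copy, which stands in for what a GPU thread block would do in fast on-chip memory.

```python
def jacobi_step(src):
    """One sweep of a 3-point averaging stencil; the two end cells are held fixed."""
    dst = list(src)
    for i in range(1, len(src) - 1):
        dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return dst


def jacobi_naive(grid, steps):
    """Baseline: every time step sweeps the whole grid, re-reading it from memory."""
    for _ in range(steps):
        grid = jacobi_step(grid)
    return grid


def jacobi_overlapped_tiles(grid, steps, tile):
    """Overlapped time tiling (illustrative): fuse `steps` sweeps per tile.

    Each tile copies its cells plus up to `steps` halo cells per side, then
    runs all sweeps on the local copy. Halo cells become stale by one more
    cell per sweep, so only the tile interior is written back.
    """
    n = len(grid)
    out = list(grid)
    for start in range(0, n, tile):
        end = min(start + tile, n)
        lo = max(start - steps, 0)   # halo clipped at the grid boundary
        hi = min(end + steps, n)
        local = grid[lo:hi]          # single read of the input per tile
        for _ in range(steps):
            local = jacobi_step(local)
        out[start:end] = local[start - lo:end - lo]
    return out
```

The tiled version performs redundant computation in the halo regions in exchange for reading each input cell far fewer times, which is the trade-off that matters when a kernel is memory-bandwidth limited.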