TRACO: An Automatic Loop Nest Parallelizer for Numerical Applications
Annals of Computer Science and Information Systems, 2015, Vol. 5, pp. 681–686
Abstract
We present TRACO, a source-to-source compiler that increases program locality and parallelizes arbitrarily nested loop sequences in numerical applications. Algorithms for generating tiled code and for extracting synchronization-free slices composed of tiles are presented. Parallelism in arbitrarily nested loops is obtained by creating a kernel of computations, expressed in the OpenMP standard, that can be executed independently on many CPUs. We consider benchmarks typical of compute-intensive sequences of algebra operations and of numerical computations from industry and engineering. The speed-up of programs generated by TRACO is discussed, related compilers and techniques are considered, and future work is outlined.
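To make the terms tiling and synchronization-free slices concrete, the following is a minimal hand-written sketch in C with OpenMP. It is not output produced by TRACO and does not reproduce its algorithms. The matrix-multiply kernel, the names `matmul_ref` and `matmul_tiled`, and the constants `N` and `TILE` are all assumptions chosen for illustration. Each `(ii, jj)` pair selects a group of tiles that writes a disjoint block of `C`, so the groups can run on different threads without synchronization, while the `kk` tiles inside a group run in order.

```c
#include <assert.h>

#define N 64     /* matrix dimension (illustrative assumption) */
#define TILE 16  /* tile edge length (illustrative assumption) */

/* The tiled loops below assume the tile size divides the matrix size. */
static_assert(N % TILE == 0, "TILE must divide N");

/* Reference version: untiled, sequential C += A * B. */
void matmul_ref(double A[N][N], double B[N][N], double C[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                C[i][j] += A[i][k] * B[k][j];
}

/* Tiled version: the three outer loops enumerate tiles, the three
 * inner loops walk the iterations inside one tile. All tiles sharing
 * the same (ii, jj) update the same block of C and no other block,
 * so distinct (ii, jj) pairs are independent of each other. */
void matmul_tiled(double A[N][N], double B[N][N], double C[N][N])
{
    /* Distribute the independent (ii, jj) tile groups over threads.
     * The pragma is ignored when OpenMP is not enabled at compile time. */
    #pragma omp parallel for collapse(2)
    for (int ii = 0; ii < N; ii += TILE)
        for (int jj = 0; jj < N; jj += TILE)
            /* kk carries the reduction over k, so it stays sequential. */
            for (int kk = 0; kk < N; kk += TILE)
                for (int i = ii; i < ii + TILE; i++)
                    for (int j = jj; j < jj + TILE; j++)
                        for (int k = kk; k < kk + TILE; k++)
                            C[i][j] += A[i][k] * B[k][j];
}
```

With GCC, a sketch like this can be built with `gcc -O2 -fopenmp`. Without `-fopenmp` the pragma is ignored and the tiled loop nest runs sequentially, which still shows the locality-oriented loop order.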
Related Papers
- On the optimality of Allen and Kennedy's algorithm for parallelism extraction in nested loops (1997), 18 citations
- Finding Synchronization-Free Slices of Operations in Arbitrarily Nested Loops (2008), 15 citations
- On the optimality of Allen and Kennedy's algorithm for parallelism extraction in nested loops (1996), 9 citations
- Data locality optimization of interference graphs based on polyhedral computations (2011), 1 citation
- Synchronization-Free Automatic Parallelization for Arbitrarily Nested Affine Loops (2016)