
Static placement of computation on heterogeneous devices

Proceedings of the ACM on Programming Languages, 2017, Vol. 1 (OOPSLA), pp. 1–28


Gabriel Poesia, Breno Campos Ferreira Guimarães, Fabrício Ferracioli, Fernando Magno Quintão Pereira

Abstract

Heterogeneous architectures characterize today's hardware, from supercomputers to smartphones. However, in spite of this importance, programming such systems is still challenging. In particular, it is hard to map computations to the different processors of a heterogeneous device. In this paper, we provide a static analysis that mitigates this problem. Our contributions are twofold. First, we provide a semi-context-sensitive algorithm that analyzes the program's call graph to determine the best processor for each calling context. This algorithm is parameterized by a cost model that takes into consideration the processors' characteristics and data transfer time. Second, we show how to use simulated annealing to calibrate this cost model for a given heterogeneous architecture. We have used our ideas to build Etino, a tool that annotates C programs with OpenACC or OpenMP 4.0 directives. Etino generates code for a CPU-GPU architecture without user intervention. Experiments on classic benchmarks reveal speedups of up to 75x. Moreover, our calibration process lets us avoid slowdowns of up to 720x that trivial parallelization approaches would yield.
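To make the role of the cost model concrete, the following is a minimal sketch in C of how a placement decision might weigh compute time against data transfer time for one calling context. It is not Etino's actual model: the `CostModel` fields, the function names, and the linear form of the estimates are all assumptions chosen for illustration, and the constants such a model needs are the kind of values a calibration step would have to supply.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost-model parameters (illustrative, not from the paper). */
typedef struct {
    double cpu_sec_per_op;        /* estimated CPU time per operation */
    double gpu_sec_per_op;        /* estimated GPU time per operation */
    double gpu_launch_sec;        /* fixed overhead of offloading a region */
    double transfer_sec_per_byte; /* host <-> device copy cost per byte */
} CostModel;

typedef enum { DEVICE_CPU, DEVICE_GPU } Device;

/* Estimated time to run `ops` operations on the CPU. */
double estimate_cpu(const CostModel *m, double ops) {
    assert(m != NULL);
    return m->cpu_sec_per_op * ops;
}

/* Estimated time to run `ops` operations on the GPU when `bytes`
 * bytes must be copied between host and device in this context. */
double estimate_gpu(const CostModel *m, double ops, double bytes) {
    assert(m != NULL);
    return m->gpu_launch_sec
         + m->gpu_sec_per_op * ops
         + m->transfer_sec_per_byte * bytes;
}

/* Pick the device with the lower estimate; ties go to the CPU. */
Device choose_device(const CostModel *m, double ops, double bytes) {
    return estimate_gpu(m, ops, bytes) < estimate_cpu(m, ops)
               ? DEVICE_GPU
               : DEVICE_CPU;
}
```

Under this sketch, the same function can receive different placements in different calling contexts: a call whose operands already reside on the device (`bytes` equal to zero) may favor the GPU, while a call with the same operation count that must first copy its data may favor the CPU.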
