A Comparative Evaluation of Parallel Programming Models for Shared-Memory Architectures | doi.page

0 citations0 references

A Comparative Evaluation of Parallel Programming Models for Shared-Memory Architectures

2012Vol. 23, pp. 363–370

Citations Over Time

Luis Miguel Sánchez, Javier Fernández, Rafael Sotomayor, J. Daniel García

Abstract

Nowadays, most computers that are commercially available off-the-shelf (COTS) include hardware features that increase the performance of parallel general-purpose threads (hyper threading, multicore, ccNUMA architectures) or SIMD kernels (CPU vector instructions, GPUs). The purpose of this paper is to perform a compared evaluation of several parallel programming models where each one is fitted to exploit some of these features but also each one requires a different level of programming skills. Four parallel programming models (OpenMP, Intel TBB, Intel ArBB, and CUDA) have been selected. The idea is to cover a wide spectrum of programming models and most of the parallel hardware features included in modern computers. On one hand, OpenMP and TBB platforms, that exploits parallel threads running on multicore systems. On the other hand, ArBB, that combines multicore parallel threads and multicore SIMD features with a simpler programming model, and CUDA that exploits SIMD features of the GPU hardware. Our results obtained with the benchmarks used on this paper suggest that OpenMP and TBB have a lower performance compared to ArBB and CUDA. But also that ArBB performance tends to be comparable with CUDA performance in most cases (although it is normally lower). Thus, there are evidences that a careful designed top range multicore and multisocket architecture, can be comparable in terms of performance with top range GPU cards for many applications, with the advantage of a simpler programming model.

Citations Over Time

Abstract

Related Papers