UMH

ACM Transactions on Architecture and Code Optimization, 2016, Vol. 13(4), pp. 1–25

Amir Kavyan Ziabari, Yifan Sun, Yenai Ma, Dana Schaa, José Luis Abellán, Rafael Ubal, John Kim, Ajay Joshi, David Kaeli

Abstract

In this article, we describe how to ease memory management between a Central Processing Unit (CPU) and one or more discrete Graphics Processing Units (GPUs) by architecting a novel hardware-based Unified Memory Hierarchy (UMH). With UMH, a GPU accesses CPU memory only if it does not find the required data in the directories associated with its high-bandwidth memory, or if the NMOESI coherency protocol limits access to that data. Using UMH with NMOESI improves the performance of a CPU-multiGPU system by at least 1.92× in comparison to alternative software-based approaches. It also allows the CPU to access GPU-modified data at least 13× faster.
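
The access rule stated in the abstract can be illustrated with a minimal sketch. This is only a paraphrase of that one sentence, not the hardware design from the article: the names `Source` and `choose_source` are hypothetical, and the NMOESI check is reduced to a single boolean because the abstract does not describe the protocol's states or transitions.

```python
from enum import Enum


class Source(Enum):
    """Where a GPU memory request is served from (hypothetical labels)."""
    GPU_HBM = "gpu_hbm"        # the GPU's own high-bandwidth memory
    CPU_MEMORY = "cpu_memory"  # the CPU's memory


def choose_source(directory_hit: bool, protocol_allows: bool) -> Source:
    """Paraphrase of the abstract's rule, under simplifying assumptions.

    directory_hit:   the data was found in the directories associated
                     with the GPU's high-bandwidth memory.
    protocol_allows: the coherency protocol does not limit access to
                     that data (an assumed one-bit summary of NMOESI).
    """
    if directory_hit and protocol_allows:
        return Source.GPU_HBM
    # Either the directories missed or the protocol limits the access,
    # so the request goes to CPU memory.
    return Source.CPU_MEMORY
```

Under these assumptions, a request stays in GPU memory only when both conditions hold; any other combination falls back to CPU memory.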
