NDC: Analyzing the impact of 3D-stacked memory+logic devices on MapReduce workloads
Top 1% of 2014 papers (by citations)
Abstract
While Processing-in-Memory has been investigated for decades, it has not been embraced commercially. A number of emerging technologies have renewed interest in this topic. In particular, the emergence of 3D stacking and the imminent release of Micron's Hybrid Memory Cube device have made it more practical to move computation near memory. However, the literature is missing a detailed analysis of a killer application that can leverage a Near Data Computing (NDC) architecture. This paper focuses on in-memory MapReduce workloads that are commercially important and are especially suitable for NDC because they are embarrassingly parallel and their memory accesses are largely localized. The NDC architecture incorporates several simple processing cores on a separate, non-memory die in a 3D-stacked memory package; these cores can perform Map operations with efficient memory access and without hitting the bandwidth wall. This paper describes and evaluates a number of key elements necessary to realize efficient NDC operation: (i) low-EPI (energy-per-instruction) cores, (ii) long daisy chains of memory devices, and (iii) dynamic activation of cores and SerDes links. Compared to a baseline that is heavily optimized for MapReduce execution, the NDC design yields up to a 15X reduction in execution time and an 18X reduction in system energy.
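The workload property the abstract leans on can be made concrete with a small sketch (illustrative only, not code from the paper): in a Map phase, each near-memory core touches only the data shard resident in its own memory stack, so the shards can be processed independently with no cross-stack traffic until the Reduce step. The shard layout and word-count task below are assumptions chosen for illustration.

```python
from collections import Counter

def map_shard(shard):
    """Map step: each core counts words only in its memory-local shard."""
    counts = Counter()
    for line in shard:
        counts.update(line.split())
    return counts

def reduce_counts(partials):
    """Reduce step: merge the per-shard partial counts on the host."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

# Each shard stands in for the DRAM vault local to one stacked core;
# the Map calls below could run fully in parallel.
shards = [["a b a"], ["b c"], ["a c c"]]
result = reduce_counts(map_shard(s) for s in shards)
```

Because each `map_shard` call reads only its own shard, this mirrors the localized access pattern that lets NDC cores avoid the bandwidth wall: only the small partial counts, not the raw data, cross the memory channel.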