Scaling Datacenter Accelerators with Compute-Reuse Architectures
Citations over time: top 12% of 2018 papers
Abstract
Hardware specialization is commonly used in datacenters to mitigate the approaching end of CMOS technology scaling. While offering superior performance and energy-efficiency returns compared to general-purpose processors, specialized accelerators are bound by the same device technology constraints, and are thus prone to similar limitations in the future. Once technology scaling plateaus, accelerator and application tuning will reach a near-optimal point, with no clear direction for further improvement. Emerging non-volatile memory (NVM) technologies follow different scaling trends because of their different physical properties and manufacturing techniques. NVMs have inspired recent innovation in computer systems, as they offer appealing qualities such as high capacity and low energy. We present the COmpute-REuse Accelerators (COREx) architecture, which shifts computations from the scalability-hindered transistor-based logic towards the continuing-to-scale storage domain. COREx leverages datacenter redundancy by integrating a storage layer with the accelerator processing layer. The added layer stores the outcomes of previous accelerated computations. When a computation recurs, its previously computed result is reused, eliminating the need to recompute it. We designed COREx as a combination of an accelerator and a specialized storage layer using emerging memory technologies, and evaluated it on a set of datacenter workloads. Our results show that, when integrated with a well-tuned accelerator, COREx achieves an average speedup of 6.4x and average savings of 50% in energy and 63% in energy-delay product. We expect the gains to increase further in the future, as memory technologies continue to improve steadily.
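The reuse idea described in the abstract (store the outcome of an accelerated computation, then return the stored outcome when the same computation recurs) can be illustrated with a minimal software sketch. This is not the COREx design: the class name `ReuseLayer`, the toy kernel, the SHA-256 input key, and the in-memory dictionary standing in for the NVM storage layer are all assumptions made for illustration only.

```python
import hashlib
from typing import Callable, Dict


class ReuseLayer:
    """Hypothetical stand-in for a storage layer that keeps past accelerator results."""

    def __init__(self, accelerator: Callable[[bytes], bytes]) -> None:
        self.accelerator = accelerator      # the (expensive) accelerated computation
        self.table: Dict[bytes, bytes] = {}  # assumption: a dict models the storage layer
        self.hits = 0
        self.misses = 0

    def run(self, data: bytes) -> bytes:
        # Assumption: inputs are identified by a digest of their contents.
        key = hashlib.sha256(data).digest()
        cached = self.table.get(key)
        if cached is not None:
            # Recurring computation: reuse the stored result instead of recomputing.
            self.hits += 1
            return cached
        # First occurrence: compute on the accelerator and store the outcome.
        self.misses += 1
        result = self.accelerator(data)
        self.table[key] = result
        return result


def toy_kernel(data: bytes) -> bytes:
    """Placeholder for an accelerated kernel; it simply reverses the input bytes."""
    return bytes(reversed(data))


layer = ReuseLayer(toy_kernel)
outputs = [layer.run(x) for x in (b"abc", b"xyz", b"abc")]
print(outputs, layer.hits, layer.misses)
```

In this sketch the third request repeats the first input, so it is served from the table and the kernel runs only twice. Any benefit of such a scheme depends on how often computations recur and on the cost of a lookup relative to a recomputation, which the sketch does not model.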