Enabling Scientific Computing on Memristive Accelerators
Abstract
Linear algebra is ubiquitous across virtually every field of science and engineering, from climate modeling to macroeconomics. This ubiquity makes linear algebra a prime candidate for hardware acceleration, which can improve both the run time and the energy efficiency of a wide range of scientific applications. Recent work on memristive hardware accelerators shows significant potential to speed up matrix-vector multiplication (MVM), a critical linear algebra kernel at the heart of neural network inference tasks. Regrettably, the proposed hardware is constrained to a narrow range of workloads: although the eight- to 16-bit computations afforded by memristive MVM accelerators are acceptable for machine learning, they are insufficient for scientific computing, where high-precision floating point is the norm. This paper presents the first proposal to enable scientific computing on memristive crossbars. Three techniques are explored: reducing overheads by exploiting exponent range locality, early termination of fixed-point computation, and static operation scheduling. Together, these techniques enable a fixed-point memristive accelerator to perform high-precision floating-point arithmetic without the exorbitant cost of naïve floating-point emulation on fixed-point hardware. A heterogeneous collection of crossbars with varying sizes is proposed to efficiently handle sparse matrices, and an algorithm for mapping the dense subblocks of a sparse matrix to an appropriate set of crossbars is investigated. The accelerator can be combined with existing GPU-based systems to handle datasets that the memristive accelerator alone cannot process efficiently. The proposed optimizations allow the memristive MVM concept to be applied to a wide range of problem domains, improving the execution time and energy dissipation of sparse linear solvers by 10.3x and 10.9x, respectively, over a purely GPU-based system.
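The floating-point-on-fixed-point idea lends itself to a brief illustration. Below is a minimal Python sketch, under stated assumptions, of one way a fixed-point MVM engine could emulate floating-point MVM: quantize each operand block against a shared scale (the intuition behind exploiting exponent range locality), perform an integer MVM as the crossbar would, and rescale the result. The function name, the 16-bit mantissa width, and the use of NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mvm_block_fixed_point(A, x, mantissa_bits=16):
    """Hypothetical sketch: emulate floating-point MVM on fixed-point
    hardware by quantizing each operand block against a shared scale,
    computing an integer MVM, and rescaling the result."""
    # Shared scales: size each block so its largest magnitude fills the
    # fixed-point range. When nearby values have similar exponents
    # (exponent range locality), little precision is lost in alignment.
    scale_A = np.max(np.abs(A)) / (2**(mantissa_bits - 1) - 1)
    scale_x = np.max(np.abs(x)) / (2**(mantissa_bits - 1) - 1)

    # Quantize to signed fixed-point integers (what the crossbar stores).
    A_fx = np.round(A / scale_A).astype(np.int64)
    x_fx = np.round(x / scale_x).astype(np.int64)

    # Integer MVM stands in for the analog crossbar operation.
    y_fx = A_fx @ x_fx

    # Rescale back to floating point using the shared scales.
    return y_fx * (scale_A * scale_x)

# Usage: compare the emulated result against a native floating-point MVM.
A = np.random.randn(64, 64)
x = np.random.randn(64)
print(np.max(np.abs(mvm_block_fixed_point(A, x) - A @ x)))
```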
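The abstract also mentions mapping the dense subblocks of a sparse matrix onto a heterogeneous pool of crossbars. As a rough sketch of that idea (not the paper's algorithm), a greedy scheme might assign each block to the smallest crossbar that can hold it and tile any block that exceeds even the largest crossbar; `map_blocks_to_crossbars` and its arguments are hypothetical names chosen for this example.

```python
def map_blocks_to_crossbars(block_dims, crossbar_sizes):
    """Hypothetical greedy mapping: place each dense subblock of a sparse
    matrix on the smallest available crossbar that fits it, splitting
    oversized blocks into tiles of the largest crossbar size."""
    sizes = sorted(crossbar_sizes)
    assignment = []
    for rows, cols in block_dims:
        side = max(rows, cols)
        # Smallest crossbar that fits the whole block, if any.
        fit = next((s for s in sizes if s >= side), None)
        if fit is not None:
            assignment.append(((rows, cols), fit, 1))
        else:
            # Tile the block with the largest crossbar size
            # (ceiling division on each dimension).
            big = sizes[-1]
            tiles = (-(-rows // big)) * (-(-cols // big))
            assignment.append(((rows, cols), big, tiles))
    return assignment

# Example: dense subblocks mapped onto a heterogeneous pool of
# 32x32, 64x64, and 128x128 crossbars.
print(map_blocks_to_crossbars([(20, 30), (100, 90), (300, 260)],
                              [32, 64, 128]))
```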