Strassen's Algorithm Reloaded
Citations over time: top 21% of 2016 papers
Abstract
We dispel “street wisdom” regarding the practical implementation of Strassen's algorithm for matrix-matrix multiplication (DGEMM). Conventional wisdom: it is only practical for very large matrices. Our implementation is practical for small matrices. Conventional wisdom: the matrices being multiplied should be relatively square. Our implementation is practical for rank-k updates, where k is relatively small (a shape of importance for libraries like LAPACK). Conventional wisdom: it inherently requires substantial workspace. Our implementation requires no workspace beyond buffers already incorporated into conventional high-performance DGEMM implementations. Conventional wisdom: a Strassen DGEMM interface must pass in workspace. Our implementation requires no such workspace and can be plug-compatible with the standard DGEMM interface. Conventional wisdom: it is hard to demonstrate speedup on multi-core architectures. Our implementation demonstrates speedup over conventional DGEMM even on an Intel® Xeon Phi™ coprocessor using 240 threads. We show how a distributed memory matrix-matrix multiplication also benefits from these advances.
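For readers unfamiliar with the underlying idea, the following is a minimal sketch of the textbook one-level Strassen scheme, which replaces the eight block multiplications of a 2×2 block product with seven. It is not the implementation described in the abstract: it allocates temporaries freely, uses plain Python lists, and assumes all matrix dimensions are even. The helper names (`mat_add`, `mat_mul`, `split`, `strassen_one_level`) are hypothetical and chosen only for illustration.

```python
def mat_add(X, Y, sign=1):
    """Return X + sign * Y for equally sized matrices stored as lists of rows."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_mul(X, Y):
    """Conventional matrix product, used here for the seven block multiplications."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def split(M):
    """Split M into four quadrants (assumes even row and column counts)."""
    r, c = len(M) // 2, len(M[0]) // 2
    return ([row[:c] for row in M[:r]], [row[c:] for row in M[:r]],
            [row[:c] for row in M[r:]], [row[c:] for row in M[r:]])


def strassen_one_level(A, B):
    """One level of Strassen: seven block products instead of eight."""
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    M1 = mat_mul(mat_add(A11, A22), mat_add(B11, B22))
    M2 = mat_mul(mat_add(A21, A22), B11)
    M3 = mat_mul(A11, mat_add(B12, B22, -1))
    M4 = mat_mul(A22, mat_add(B21, B11, -1))
    M5 = mat_mul(mat_add(A11, A12), B22)
    M6 = mat_mul(mat_add(A21, A11, -1), mat_add(B11, B12))
    M7 = mat_mul(mat_add(A12, A22, -1), mat_add(B21, B22))

    C11 = mat_add(mat_add(M1, M4), mat_add(M7, M5, -1))  # M1 + M4 - M5 + M7
    C12 = mat_add(M3, M5)                                # M3 + M5
    C21 = mat_add(M2, M4)                                # M2 + M4
    C22 = mat_add(mat_add(M1, M2, -1), mat_add(M3, M6))  # M1 - M2 + M3 + M6

    # Reassemble the four quadrants into the full result.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


# A rank-k shaped product (k = 2): A is 4x2, B is 2x4.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
B = [[1, 0, 2, -1], [3, 1, 0, 2]]
C = strassen_one_level(A, B)
```

The temporaries created for each `M` term in this sketch are the kind of extra workspace that the conventional wisdom assumes is unavoidable; the abstract's claim is that the authors' implementation needs none beyond the buffers a high-performance DGEMM already has.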