Overlap Communication with Dependent Computation via Decomposition in Large Deep Learning Models
2022, pp. 93–106
Shibo Wang, Jinliang Wei, Amit Sabne, Andy Davis, Berkin Ilbeyi, Blake A. Hechtman, Dehao Chen, Karthik Murthy, Marcello Maggioni, Qiao Zhang, Sameer Kumar, Tongfei Guo, Yuanzhong Xu, Zongwei Zhou
Abstract
Large deep learning models have shown great potential, achieving state-of-the-art results on many tasks. However, running these models on a single accelerator (GPU or TPU) is challenging because on-device memory is too limited for their size. Intra-layer model parallelism addresses this issue by partitioning individual layers or operators across multiple devices in a distributed accelerator cluster. However, the data communication generated by intra-layer model parallelism can account for a significant fraction of the overall execution time and severely hurt computational efficiency.
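For concreteness, here is a minimal sketch (not from the paper) of intra-layer model parallelism using JAX's sharding API, in the same TPU/XLA spirit as the authors' setting. The mesh axis name "model" and all shapes are illustrative assumptions; the compiler inserts the collective communication that the paper seeks to overlap with computation.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D device mesh; with N devices the weight below is split N ways.
mesh = Mesh(np.array(jax.devices()), axis_names=("model",))

# Activations replicated, weight column-sharded across the "model" axis:
# each device computes one slice of the layer's output.
x = jax.device_put(jnp.ones((8, 1024)), NamedSharding(mesh, P()))
w = jax.device_put(jnp.ones((1024, 4096)), NamedSharding(mesh, P(None, "model")))

@jax.jit
def layer(x, w):
    # Each device multiplies against its local weight shard. Materializing
    # the full output (e.g., for a differently sharded next layer) forces a
    # collective whose cost is the overhead described in the abstract.
    return jnp.dot(x, w)

y = layer(x, w)
print(y.shape, y.sharding)  # (8, 4096), sharded over the "model" axis
```

On a single device this runs with a trivial one-element mesh; with multiple devices, the dependent collectives serialize with the matmuls unless they are decomposed and overlapped as the paper proposes.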