Reza Yazdani Aminabadi
Microsoft (United States)
Research Areas
Advanced Neural Network Applications, Topic Modeling, Ferroelectric and Negative Capacitance Devices, Parallel Computing and Optimization Techniques, Natural Language Processing Techniques
Most-Cited Works
- Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, a Large-Scale Generative Language Model (2022), 299 citations
- DeepSpeed-Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale (2022), 208 citations
- ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers (2022), 72 citations
- ZeRO-Offload: Democratizing Billion-Scale Model Training (2021), 61 citations
- DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale (2022), 55 citations
- System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models (2024), 12 citations
- DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales (2023), 10 citations
- DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale (2022), 8 citations
- Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases (2023), 6 citations
- SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Networks (2022), 6 citations