Yinmin Zhong
Research Areas
Topic Modeling, Natural Language Processing Techniques, Parallel Computing and Optimization Techniques, Speech Recognition and Synthesis, Advanced Neural Network Applications
Most-Cited Works
- ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning (2023), 48 citations
- LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism (2024), 29 citations
- MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs (2024), 24 citations
- Fast Distributed Inference Serving for Large Language Models (2023), 21 citations
- AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (2023), 20 citations
- DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving (2024), 14 citations
- DistMind: Efficient Resource Disaggregation for Deep Learning Workloads (2024), 5 citations
- FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion (2024), 3 citations
- DistTrain: Addressing Model and Data Heterogeneity with Disaggregated Training for Multimodal Large Language Models (2025), 3 citations
- Aquifer: Transparent Microsecond-Scale Scheduling for vRAN Workloads (2024), 3 citations