Sebastian Jaszczur
Research Areas
Topic Modeling, Natural Language Processing Techniques, Speech Recognition and Synthesis, Machine Learning and Algorithms, Generative Adversarial Networks and Image Synthesis
Most-Cited Works
- Sparse is Enough in Scaling Transformers (2021), 3 citations
- Scaling Laws for Fine-Grained Mixture of Experts (2024), 3 citations
- Mixture of Tokens: Continuous MoE through Cross-Example Aggregation (2023), 1 citation
- Projected Compression: Trainable Projection for Efficient Transformer Compression (2025)
- Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient (2025)
- Decoupled Relative Learning Rate Schedules (2025)