Tristan Thrush
Research Areas
Natural Language Processing Techniques, Advanced Clustering Algorithms Research, Image Retrieval and Classification Techniques, AI-based Problem Solving and Planning, Biomedical Text Mining and Ontologies
Most-Cited Works
- Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection (2021)
- The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset (2022)
- MixMin: Finding Data Mixtures via Convex Minimization (2025)
- DataPerf: Benchmarks for Data-Centric AI Development (2023)