Dirk Groeneveld
Allen Institute (US)
Research Areas
Topic Modeling, Natural Language Processing Techniques, Multimodal Machine Learning Applications, Text Readability and Simplification, Semantic Web and Ontologies
Most-Cited Works
- Construction of the Literature Graph in Semantic Scholar (2018), 323 cited
- From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project (2019), 71 cited
- OLMo: Accelerating the Science of Language Models (2024), 46 cited
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research (2024), 35 cited
- From F to A on the New York Regents Science Exams — An Overview of the Aristo Project (2020), 24 cited
- Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models (2025), 23 cited
- Documenting the English Colossal Clean Crawled Corpus (2021)
- IKE - An Interactive Tool for Knowledge Extraction (2016), 18 cited