Benoît Sagot
Institut national de recherche en informatique et en automatique (FR); Institut national de recherche en sciences et technologies du numérique (FR)
Research Areas
Natural Language Processing Techniques, Topic Modeling, Text Readability and Simplification, Linguistics and Discourse Analysis, Semantic Web and Ontologies
Most-Cited Works
- What Does BERT Learn about the Structure of Language? (2019), 1,190 citations
- CamemBERT: a Tasty French Language Model (2020), 702 citations
- Asynchronous pipelines for processing huge corpora on medium to low resource infrastructures (2019), 238 citations
- Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets (2022), 163 citations
- The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French (2010)
- Building a free French wordnet from multilingual resources (2008), 137 citations
- Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP (2021), 103 citations