Understanding large text corpora via sparse machine learning