Contextualizing Misinformation: A User-Centric Approach to Linguistic and Topical Patterns in News Consumption
Abstract
Exposure to misinformation poses significant challenges to democratic processes and public health, particularly during critical events such as elections. This study adopts a user-centric approach to analyze the linguistic features of the misinformation that individuals actually consume while browsing the web. Using web-browsing data (21M URL visits) from a nationally representative panel of 1,240 American adults during the 2020 U.S. Presidential Election, we examine linguistic and topical differences in the content of 91K unique misinformation and hard news webpages using natural language processing techniques and Large Language Models. We find that the misinformation users consume is generally easier to read, exhibits more negative sentiment, and employs more moral language than hard news. We also find significant linguistic variation across topics: the linguistic features of misinformation depend on its subject matter. Finally, we identify heterogeneity across key user characteristics. Older adults consume more misinformation about COVID-19 and health, with content showing more negative sentiment and fewer moral terms than expected. Republicans engage with misinformation characterized by more negative sentiment and more moral language, focusing less on health topics and more on social and political issues. These results highlight the importance of a user-centric approach and suggest that interventions to combat misinformation should be tailored to specific topics and user characteristics for greater effectiveness.