Comparison of fake and real news based on morphological analysis
Abstract
Easy access to information has made it possible for false news to be spread intentionally through social networks in order to manipulate people's opinions. Fake news detection has recently attracted growing interest from both the general public and researchers. This paper presents a morphological analysis of two datasets containing 28 870 news articles in total. The results were verified on a third dataset consisting of 402 news articles. The datasets were analysed using lemmatization and POS tagging. Morphological analysis is the process of classifying words into grammatical-semantic classes and assigning grammatical categories to them. Individual words from the articles were annotated, and the word classes found in fake and real news articles were examined for statistically significant differences. The results show that the statistically significant differences lie mainly in the verb and noun word classes. Identifying which categories of word classes differ significantly is important input for a future fake news classifier, because it guides the selection of appropriate variables for classification.
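The comparison described in the abstract can be illustrated with a minimal sketch. It assumes that the articles have already been lemmatized and POS-tagged, and it compares the share of one word class in fake and real news with a Pearson chi-square test on a 2x2 table. The abstract does not name the statistical test or the tagger used, so the test, the toy tokens, the tag names and the function names below are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Hypothetical (lemma, POS) pairs standing in for real tagger output.
fake_tokens = [
    ("say", "VERB"), ("claim", "VERB"), ("shock", "VERB"),
    ("reveal", "VERB"), ("truth", "NOUN"), ("secret", "NOUN"),
]
real_tokens = [
    ("report", "VERB"), ("government", "NOUN"), ("minister", "NOUN"),
    ("budget", "NOUN"), ("year", "NOUN"), ("report", "NOUN"),
]


def pos_counts(tokens):
    """Count how often each POS tag occurs in a list of (lemma, tag) pairs."""
    return Counter(tag for _, tag in tokens)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]].

    No continuity correction is applied. Returns 0.0 when a row or
    column total is zero, because the statistic is undefined there.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def compare_class(tag, fake, real):
    """Chi-square statistic for one word class: fake vs. real news."""
    fake_c, real_c = pos_counts(fake), pos_counts(real)
    a = fake_c[tag]                   # tag occurrences in fake news
    b = sum(fake_c.values()) - a      # all other tokens in fake news
    c = real_c[tag]                   # tag occurrences in real news
    d = sum(real_c.values()) - c      # all other tokens in real news
    return chi_square_2x2(a, b, c, d)


# Critical value of the chi-square distribution, alpha = 0.05, 1 degree of freedom.
CRITICAL_005_DF1 = 3.841

if __name__ == "__main__":
    for tag in ("VERB", "NOUN"):
        stat = compare_class(tag, fake_tokens, real_tokens)
        print(tag, round(stat, 3), "significant" if stat > CRITICAL_005_DF1 else "not significant")
```

With only a dozen toy tokens the chi-square approximation is unreliable. On corpora of the size the abstract describes, the same counting step would be run per word class, or per grammatical category within a class, over the full tagger output.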