
An N-gram-Based BERT model for Sentiment Classification Using Movie Reviews

2022, pp. 41–46

Citations Over Time: Top 19% of 2022 papers

Tina Esther Trueman, Ashok Kumar Jayaraman, Erik Cambria, Gayathri Ananthakrishnan, Satanik Mitra

Abstract

An abundance of product reviews and opinions is produced every day across the internet and other media. Sentiment analysis processes such data and classifies each item as positive or negative. In this paper, a classification model is proposed for n-gram sentiment analysis using BERT. Specifically, the large IMDB movie review dataset, which contains 50K instances, is used. This dataset is tokenized and encoded into unigrams, bigrams, and trigrams, and into their combinations: unigram and bigram; bigram and trigram; and unigram, bigram, and trigram. The proposed BERT model is applied to these extracted features. The model is then evaluated using the F1 score and its micro, macro, and weighted averages. The model shows results comparable to state-of-the-art methods for all n-gram features. In particular, it achieves its highest accuracies with the combined features: 94.64% for bigram and trigram features, and 94.68% for unigram, bigram, and trigram features.
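The n-gram combinations and the three F1 averages named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say how n-grams are encoded for BERT, so the whitespace tokenization, the helper names `ngrams`, `combined_ngrams`, and `f1_averages`, and the toy labels below are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def combined_ngrams(tokens, orders):
    """Concatenate the n-grams for each order, e.g. (2, 3) for bigram + trigram."""
    features = []
    for n in orders:
        features.extend(ngrams(tokens, n))
    return features


def f1_averages(y_true, y_pred):
    """Return (micro, macro, weighted) F1 over all labels seen in the data."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true instances per label
    per_label = {}
    tp_sum = fp_sum = fn_sum = 0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        per_label[label] = 2 * tp / denom if denom else 0.0
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    micro_denom = 2 * tp_sum + fp_sum + fn_sum
    micro = 2 * tp_sum / micro_denom if micro_denom else 0.0
    macro = sum(per_label.values()) / len(labels)
    weighted = sum(per_label[l] * support[l] for l in labels) / len(y_true)
    return micro, macro, weighted


# Assumption: naive whitespace tokenization of a toy review (BERT itself
# uses subword tokenization, which is not modeled here).
tokens = "the movie was great".split()
print(combined_ngrams(tokens, (2, 3)))

# Hypothetical labels (1 = positive, 0 = negative), for illustration only.
print(f1_averages([1, 1, 0, 0], [1, 0, 0, 0]))
```

Micro-averaging pools true positives, false positives, and false negatives across classes before computing F1, macro-averaging takes the unweighted mean of the per-class F1 scores, and weighted averaging scales each per-class F1 by that class's share of the true labels.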
