Features Extraction Effect on the Accuracy of Sentiment Classification Using Ensemble Models

Faiza Mohammad Al-kharboush; Mohammed Abdullah Al-Hagery

doi:10.21275/SR21303123511

Abstract: Many approaches to sentiment classification have been developed, most of them based on machine learning algorithms. Ensemble classification is a machine learning approach that combines several base classifiers into a single, more powerful classifier. In text classification, an ensemble classifier cannot process text directly; it requires a feature extraction technique to convert the text into numeric form, and the choice of technique strongly affects classification accuracy. The purpose of this paper is to improve the accuracy of the ensemble sentiment classifier by identifying the feature extraction technique that suits it best. To that end, the accuracy of an ensemble model is evaluated and analyzed on four experimental datasets with three well-known feature extraction techniques: Bag of Words (BOW), Term Frequency-Inverse Document Frequency (TF-IDF), and Word2vec. The ensemble classifier was composed of Support Vector Machine (SVM), Logistic Regression (LR), k-Nearest Neighbor (KNN), and Random Forest (RF) base classifiers. The analysis indicates that the ensemble classifier delivered better classification accuracy with TF-IDF than with BOW or Word2vec, and that it usually reported its lowest accuracy with Word2vec.
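To make the text-to-numeric conversion concrete, the following is a minimal sketch of BOW and TF-IDF vectors in plain Python, followed by a hard majority vote over base-classifier predictions. It is illustrative only: the whitespace tokenizer, the particular TF-IDF weighting (term count divided by document length, times ln(N/df)), the toy corpus, and all function names are assumptions, and the abstract does not state how the paper combines its base classifiers, so majority voting is shown as one common option rather than the authors' method.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def build_vocabulary(docs):
    """Sorted list of every distinct token in the corpus."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})

def bow_vector(doc, vocab):
    """Bag of words: raw count of each vocabulary term in the document."""
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]

def tfidf_vectors(docs, vocab):
    """TF-IDF with tf = count / document length and idf = ln(N / df).

    This is one of several common weightings (an assumption, not taken
    from the paper); vocab must be built from the same docs so df >= 1.
    """
    n_docs = len(docs)
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents containing each term.
    df = {term: sum(1 for toks in tokenized if term in toks) for term in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        length = max(len(toks), 1)  # avoid division by zero on empty documents
        vectors.append([
            (counts[term] / length) * math.log(n_docs / df[term])
            for term in vocab
        ])
    return vectors

def majority_vote(predictions):
    """Hard voting: return the label predicted by the most base classifiers.

    Ties are resolved in favor of the label that appears first.
    """
    return Counter(predictions).most_common(1)[0][0]

# Toy corpus (hypothetical, not one of the paper's datasets).
docs = ["good movie", "bad movie", "good good plot"]
vocab = build_vocabulary(docs)
bow = [bow_vector(d, vocab) for d in docs]
tfidf = tfidf_vectors(docs, vocab)
```

BOW keeps raw counts, so a frequent word such as "movie" weighs as much as a rare one, whereas TF-IDF scales each count down by how many documents share the term. With an even number of base classifiers, as in the four-classifier ensemble described here, a hard vote can tie, so a real implementation would need an explicit tie-breaking rule or probability-based (soft) voting.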

Keywords: Feature selection, Sentiment analysis, Ensemble models, Classification accuracy
