Classification of Textual Sentiment Using Ensemble Technique

Mamun, Md. Mashiur Rahaman; Sharif, Omar; Hoque, Mohammed Moshiul

doi:10.1007/s42979-021-00922-z

Original Research
Published: 05 November 2021

Volume 3, article number 49 (2022)

SN Computer Science

Abstract

In recent years, the widespread use of the Internet has given people a revolutionary way to share their feelings or sentiment on blogs, social media, e-commerce sites, and other online platforms. Most of the feelings expressed on online platforms are in textual form (such as statuses, tweets, comments, and reviews). These textual expressions are unstructured, and organizing, manipulating, or efficiently storing them is laborious and time-consuming because of their messy forms. Textual sentiment analysis refers to the automatic process of assigning an expression or text to an appropriate polarity (positive, negative, or neutral). Although Bengali is ranked the seventh most popular language globally and the second most popular Indic language, the development of language processing tools for it remains minimal to date. This paper proposes an ensemble-based technique to classify Bengali textual sentiment into two categories: positive and negative. Due to the unavailability of a Bengali sentiment corpus, this work also developed a dataset (called the ‘Bengali Sentiment Analysis Dataset’, or BSaD) containing 8122 text expressions. This work investigates eight popular baseline classifiers [Logistic Regression (LR), Random Forest (RF), Decision Tree (DT), K-nearest Neighbor (KNN), Support Vector Machine (SVM), Multinomial Naive Bayes (MNB), Stochastic Gradient Descent, and AdaBoost] with Term frequency-Inverse document frequency (TF-IDF) and Bag-of-words (BoW) features for textual sentiment analysis on three datasets. This work also investigates four ensemble methods (LR + RF, RF + SVM, LR + SVM, and LR + RF + SVM) developed by combining the three best-performing base classifiers (LR, RF, and SVM). Experimental results show that the ensemble approach (i.e., LR + RF + SVM) with TF-IDF (uni-gram + bi-gram + tri-gram) features outperformed the other classifier models, achieving the highest accuracy of 82% on the developed dataset.
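
The two ingredients named in the abstract, TF-IDF weights over uni-, bi-, and tri-grams and an ensemble of three base classifiers, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the base classifiers are combined, so hard majority voting is assumed here, along with whitespace tokenisation and a plain tf × log(N/df) weighting. All function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(text, n_min=1, n_max=3):
    """Return word n-grams (uni-, bi-, and tri-grams by default) of a text."""
    tokens = text.split()  # assumption: simple whitespace tokenisation
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs):
    """Return one {n-gram: weight} dict per document, using tf * log(N / df)."""
    counts_per_doc = [Counter(ngrams(d)) for d in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter()
    for counts in counts_per_doc:
        df.update(counts.keys())
    n_docs = len(docs)
    vectors = []
    for counts in counts_per_doc:
        total = sum(counts.values())
        vectors.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return vectors


def majority_vote(predictions):
    """Hard voting (assumed): return the most common label among base outputs."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of LR, RF, and SVM for a single text.
label = majority_vote(["positive", "negative", "positive"])
```

In practice, a library such as scikit-learn offers comparable building blocks (`TfidfVectorizer` with `ngram_range=(1, 3)` and `VotingClassifier`), although its TF-IDF formula applies smoothing and normalisation, so the weights differ from this sketch.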

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, 4349, Bangladesh
Md. Mashiur Rahaman Mamun, Omar Sharif & Mohammed Moshiul Hoque

Corresponding author

Correspondence to Mohammed Moshiul Hoque.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Enabling Innovative Computational Intelligence Technologies for IOT” guest edited by Omer Rana, Rajiv Misra, Alexander Pfeiffer, Luigi Troiano, and Nishtha Kesswani.

About this article

Cite this article

Mamun, M.M.R., Sharif, O. & Hoque, M.M. Classification of Textual Sentiment Using Ensemble Technique. SN COMPUT. SCI. 3, 49 (2022). https://doi.org/10.1007/s42979-021-00922-z

Received: 30 July 2021
Accepted: 03 October 2021
Published: 05 November 2021
DOI: https://doi.org/10.1007/s42979-021-00922-z
