A Hybrid Learning Approach for Text Classification Using Natural Language Processing

El Mir, Iman; El Kafhali, Said; Haqiq, Abdelkrim

doi:10.1007/978-3-031-07969-6_32

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 489)

Included in the following conference series:

International Conference On Big Data and Internet of Things


Abstract

Text classification and categorization is an active research topic that involves assigning tags or categories to a text based on its content. It is one of the key tasks of natural language processing (NLP), with applications such as topic tagging, sentiment analysis, intent detection, spam filtering, and email routing. Machine-learning text classification can help businesses analyze and structure their textual documents automatically, promptly, and inexpensively, so that they can automate processes and improve data-driven decisions. In this article, we propose a new algorithm that classifies textual documents using a hybrid approach: it combines a set of given algorithms and uses the best one for each class. Documents are classified into a set of possible class labels given a priori. Two machine learning algorithms are used to evaluate the proposed approach: Naive Bayes (NB) and Logistic Regression (LR). The results show that the proposed hybrid algorithm outperforms the NB and LR algorithms, reaching an accuracy of 91.86%.
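
The abstract does not spell out how the hybrid chooses "the best for each class", so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes each base classifier (for example NB and LR) has already produced predictions on a held-out validation set, scores each classifier by its per-class precision there, and at prediction time trusts whichever classifier was most precise for the label it proposes. The function names, the choice of precision as the selection criterion, and the toy labels are all hypothetical.

```python
from collections import defaultdict


def per_class_precision(y_true, y_pred):
    """Precision per predicted label: correct predictions / all predictions of that label."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[pred] += 1
        if truth == pred:
            hits[pred] += 1
    return {label: hits[label] / totals[label] for label in totals}


def fit_selector(y_val, val_preds):
    """Score every model per class on validation data.

    val_preds maps a model name to its list of validation predictions.
    Returns {model_name: {label: precision}}.
    """
    return {name: per_class_precision(y_val, preds) for name, preds in val_preds.items()}


def hybrid_predict(scores, preds):
    """Combine the models' predictions for one document.

    preds maps a model name to the label it predicts. The label returned is
    the one proposed by the model with the highest validation precision for
    that label (assumed selection rule; ties go to the first model listed).
    """
    best_model, best_label = max(
        preds.items(), key=lambda item: scores[item[0]].get(item[1], 0.0)
    )
    return best_label


# Toy validation data (hypothetical labels, not from the paper).
y_val = ["sport", "sport", "tech", "tech", "world", "world"]
val_preds = {
    "NB": ["sport", "sport", "tech", "world", "world", "tech"],
    "LR": ["sport", "tech", "tech", "tech", "world", "world"],
}
scores = fit_selector(y_val, val_preds)

# NB is only 50% precise on "tech"; LR is 100% precise on "world".
first = hybrid_predict(scores, {"NB": "tech", "LR": "world"})   # "world"
# NB is 100% precise on "sport"; LR is about 67% precise on "tech".
second = hybrid_predict(scores, {"NB": "sport", "LR": "tech"})  # "sport"
```

Other selection criteria (per-class recall or F1) or a different tie-breaking rule would fit the same structure; which one the paper uses cannot be determined from the abstract alone.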



Author information

Authors and Affiliations

Hassan First University of Settat, Institute of Sports Sciences, Computer, Networks, Modeling, and Mobility Laboratory (IR2M), B.P. 539, 26000, Settat, Morocco
Iman El Mir
Hassan First University of Settat, Faculty of Sciences and Techniques, Computer, Networks, Modeling, and Mobility Laboratory (IR2M), B.P. 577, 26000, Settat, Morocco
Said El Kafhali & Abdelkrim Haqiq


Corresponding author

Correspondence to Said El Kafhali.

Editor information

Editors and Affiliations

ENSIAS, Mohammed V University, Rabat, Morocco
Mohamed Lazaar
UNILEHAVRE, UNIROUEN, Normandie Université, Le Havre, France
Claude Duvallet
Vrije Universiteit Brussel, Brussels, Belgium
Abdellah Touhafi
ENSA, Abdelmalek Essaâdi University, Tetuan, Morocco
Mohammed Al Achhab


About this paper

Cite this paper

El Mir, I., El Kafhali, S., Haqiq, A. (2022). A Hybrid Learning Approach for Text Classification Using Natural Language Processing. In: Lazaar, M., Duvallet, C., Touhafi, A., Al Achhab, M. (eds) Proceedings of the 5th International Conference on Big Data and Internet of Things. BDIoT 2021. Lecture Notes in Networks and Systems, vol 489. Springer, Cham. https://doi.org/10.1007/978-3-031-07969-6_32


DOI: https://doi.org/10.1007/978-3-031-07969-6_32
Published: 03 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07968-9
Online ISBN: 978-3-031-07969-6
eBook Packages: Intelligent Technologies and Robotics (R0)
