Abstract
This paper describes the development of statistical classifiers to help diagnose meningococcal meningitis, the most severe, infectious and deadly form of the disease. The goal is to find a mechanism that can determine whether a patient has this type of meningitis from a set of symptoms that can be directly observed in the earliest stages of the pathology. Currently in Brazil, a country heavily affected by meningitis, all suspected cases require immediate hospitalization and the start of treatment involving invasive tests and medication. This procedure entails expensive treatments that are unaffordable in less developed regions. To address this, we gathered a dataset of 22,602 records of suspected meningitis cases from the Brazilian state of Bahia. Seven classification techniques were applied to input data comprising nine symptoms and other patient information such as age, sex and area of residence, and 10-fold cross-validation was performed. The results show that the techniques applied are suitable for diagnosing meningococcal meningitis. Several indexes, such as precision, recall and ROC area, were computed to assess the accuracy of the models. All of them provide good results, but the best corresponds to the J48 classifier, with a precision of 0.942 and a ROC area over 0.95. These results indicate that our model can help lead to a non-invasive and early diagnosis of this pathology. This is especially useful in less developed areas, where the epidemiological risk is usually high and medical expenses are sometimes unaffordable.
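As a rough illustration of the evaluation terms used in the abstract, the following Python sketch shows how a dataset can be split into 10 folds and how precision, recall and ROC area can be computed from a classifier's outputs. It is not the authors' implementation: the function names are hypothetical, the labels and scores are invented toy values, and no classifier (J48 or otherwise) is trained here.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k contiguous folds over n records.

    Assumption: records are already shuffled; real toolkits usually also
    stratify folds by class, which this sketch does not do.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_area(y_true, scores):
    """ROC area as the probability that a random positive case receives a
    higher score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = meningococcal meningitis, 0 = other diagnosis.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold is an assumption

precision, recall = precision_recall(y_true, y_pred)
auc = roc_area(y_true, scores)

# Fold sizes for a dataset of the size reported in the abstract.
fold_sizes = [len(test) for _, test in k_fold_indices(22602, k=10)]
```

In a cross-validation run, each of the 10 test folds is scored by a model trained on the other nine, and the indexes are then aggregated over all folds.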
Acknowledgements
We thank Prof. Maria Rita Donalisio, M.D. Ph.D., Associate Professor at the Faculty of Medical Sciences (State University of Campinas), for her assistance during the first part of this research, and Prof. José-Luis Pérez-de-la-Cruz, Ph.D., Full Professor at the University of Málaga, for his comments on an early version of this paper. We would also like to express our gratitude to the National Health Service of Brazil for providing us with the data used in this study.
Additional information
This article is part of the Topical Collection on Patient Facing Systems
About this article
Cite this article
Lélis, VM., Guzmán, E. & Belmonte, MV. A Statistical Classifier to Support Diagnose Meningitis in Less Developed Areas of Brazil. J Med Syst 41, 145 (2017). https://doi.org/10.1007/s10916-017-0785-5
DOI: https://doi.org/10.1007/s10916-017-0785-5