Abstract
The classification problems described in the Machine Learning literature usually involve data in which each example is associated with a class belonging to a finite set of classes, all at the same level. However, some classification problems are hierarchical in nature: classes can be subclasses or superclasses of other classes. In many hierarchical problems, an example may be associated with more than one class simultaneously. These are known as hierarchical multi-label classification (HMC) problems. In this work, the ML-KNN algorithm was applied to hierarchical multi-label problems in order to determine the number of classes that can be assigned to an example. Experiments on 10 protein function databases, together with a statistical analysis of the results, show that the adaptations made to the ML-KNN algorithm brought significant performance improvements as measured by the hierarchical precision and hierarchical recall metrics.
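The abstract names hierarchical precision and recall without defining them. As a minimal sketch, the Python below uses the commonly cited ancestor-augmented definition: each true and predicted label set is extended with all ancestor classes, and the overlaps are summed over examples. The `parent` mapping, the class identifiers, and the function names are hypothetical, chosen for illustration only; the paper's exact formulation may differ.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def with_ancestors(labels: Iterable[str], parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return the given labels together with all of their ancestor classes."""
    expanded: Set[str] = set()
    for label in labels:
        node: Optional[str] = label
        # Walk up the hierarchy; stop at a top-level class or at a node
        # already visited (its ancestors were added earlier).
        while node is not None and node not in expanded:
            expanded.add(node)
            node = parent.get(node)
    return expanded


def hierarchical_precision_recall(
    true_sets: List[Set[str]],
    pred_sets: List[Set[str]],
    parent: Dict[str, Optional[str]],
) -> Tuple[float, float]:
    """Compute (hP, hR) summed over all examples (assumed definition)."""
    overlap = pred_total = true_total = 0
    for true_labels, pred_labels in zip(true_sets, pred_sets):
        t_ext = with_ancestors(true_labels, parent)
        p_ext = with_ancestors(pred_labels, parent)
        overlap += len(t_ext & p_ext)
        pred_total += len(p_ext)
        true_total += len(t_ext)
    hp = overlap / pred_total if pred_total else 0.0
    hr = overlap / true_total if true_total else 0.0
    return hp, hr


# Hypothetical two-level hierarchy: "1" and "2" are top-level classes.
parent = {"1": None, "1.1": "1", "1.2": "1", "2": None, "2.1": "2"}
true_sets = [{"1.1"}, {"2.1"}]
pred_sets = [{"1"}, {"2.1"}]  # the first prediction stops one level too early

hp, hr = hierarchical_precision_recall(true_sets, pred_sets, parent)
print(hp, hr)
```

In this toy case the first prediction is correct but too shallow, so every predicted class is right (precision stays at 1.0) while one true class is missed (recall drops to 0.75). This is why predicting how many classes to assign to an example affects the two metrics differently.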
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Almeida, T.B., Borges, H.B. (2017). An Adaptation of the ML-kNN Algorithm to Predict the Number of Classes in Hierarchical Multi-label Classification. In: Torra, V., Narukawa, Y., Honda, A., Inoue, S. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2017. Lecture Notes in Computer Science, vol 10571. Springer, Cham. https://doi.org/10.1007/978-3-319-67422-3_8
DOI: https://doi.org/10.1007/978-3-319-67422-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67421-6
Online ISBN: 978-3-319-67422-3
eBook Packages: Computer Science, Computer Science (R0)