Abstract
Multi-label learning has attracted widespread attention in machine learning, and many multi-label learning algorithms have been proposed. However, two main challenges remain: the high dimensionality of the data and the correlation among labels. In this paper, a new classification method, called penalized partial least squares discriminant analysis for multi-label learning (PPML), is proposed. It aims to perform dimension reduction and capture label correlations simultaneously. Specifically, PPML first identifies a latent space for the variable and label spaces via partial least squares discriminant analysis (PLS-DA). To tackle the problem of high dimensionality in solving PLS-DA, a ridge penalty is imposed on the optimization problem. The latent space is then used to construct the learning model. Experimental results on standard public data sets indicate that PPML outperforms state-of-the-art approaches.
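The pipeline outlined in the abstract (project the variables onto a latent space informed by the labels, apply a ridge penalty, then learn on the latent scores) can be illustrated with a minimal sketch. It is not the authors' formulation, which the abstract does not spell out. The sketch assumes a PLS-SVD style projection (leading left singular vectors of the centered cross-covariance between variables and labels) and places the ridge penalty on the regression from latent scores to labels, which may differ from where PPML applies it. The names `fit_ridge_pls`, `predict_labels`, `n_components`, `ridge`, and `threshold` are hypothetical.

```python
import numpy as np


def fit_ridge_pls(X, Y, n_components=2, ridge=1.0):
    """Illustrative ridge-penalized PLS-style fit (an assumption, not PPML itself).

    X: (n_samples, n_features) variable matrix.
    Y: (n_samples, n_labels) binary label indicator matrix.
    n_components must not exceed min(n_features, n_labels).
    """
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - y_mean

    # Leading left singular vectors of the cross-covariance X^T Y give
    # directions in variable space that covary most with the labels.
    U, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    W = U[:, :n_components]

    # Latent scores: the reduced representation of the samples.
    T = Xc @ W

    # Ridge regression from latent scores to labels:
    # B = (T^T T + ridge * I)^{-1} T^T Yc
    B = np.linalg.solve(T.T @ T + ridge * np.eye(n_components), T.T @ Yc)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "B": B}


def predict_labels(model, X, threshold=0.5):
    """Project new samples to the latent space and threshold the label scores."""
    T = (X - model["x_mean"]) @ model["W"]
    scores = T @ model["B"] + model["y_mean"]
    return (scores >= threshold).astype(int)


# Usage on synthetic data (made up purely for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
Y = (X[:, :3] > 0).astype(int)  # three labels tied to the first three variables
model = fit_ridge_pls(X, Y, n_components=2, ridge=1.0)
pred = predict_labels(model, X)
```

Because every label is predicted from the same latent scores, the labels share one low-dimensional representation, which is one simple way dimension reduction and label correlation can be handled together.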
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ma, Z., Liu, H., Su, K., Zheng, Z. (2014). PPML: Penalized Partial Least Squares Discriminant Analysis for Multi-Label Learning. In: Li, F., Li, G., Hwang, Sw., Yao, B., Zhang, Z. (eds) Web-Age Information Management. WAIM 2014. Lecture Notes in Computer Science, vol 8485. Springer, Cham. https://doi.org/10.1007/978-3-319-08010-9_69
DOI: https://doi.org/10.1007/978-3-319-08010-9_69
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08009-3
Online ISBN: 978-3-319-08010-9
eBook Packages: Computer Science, Computer Science (R0)