Abstract
Many real-world applications require representing complex entities and the relations among them. Networks are frequently the data structure of choice because they highlight topological and qualitative characteristics. In this work, we are interested in supervised classification models for data in the form of networks. Given two or more classes whose members are networks, we build mathematical models that classify them on the basis of various graph distances. Because the models are complex, comprising tens of thousands of nodes and edges, we focus on model simplification solutions that reduce execution times while maintaining high accuracy. Experimental results on three datasets of biological interest show the achieved performance improvements.
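To make the idea of classifying networks through a graph distance concrete, the following is a minimal sketch and not the method of the paper. It assumes that each network is stored as a set of undirected edges over a shared node vocabulary, uses the Jaccard distance between edge sets as a stand-in for the graph distances mentioned above (this section does not specify which ones are used), and applies a nearest-neighbour rule as a stand-in classifier. The names `make_graph`, `jaccard_distance`, and `classify_1nn` and the toy data are all hypothetical.

```python
from typing import FrozenSet, Iterable, List, Tuple

Edge = Tuple[str, str]
Graph = FrozenSet[Edge]


def make_graph(edges: Iterable[Edge]) -> Graph:
    """Store an undirected graph as a frozenset of sorted node pairs."""
    return frozenset(tuple(sorted(edge)) for edge in edges)


def jaccard_distance(g: Graph, h: Graph) -> float:
    """Illustrative graph distance (an assumption): 1 - |E_g & E_h| / |E_g | E_h|."""
    union = g | h
    if not union:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - len(g & h) / len(union)


def classify_1nn(query: Graph, training: List[Tuple[Graph, str]]) -> str:
    """Return the label of the training network closest to the query."""
    _, label = min(training, key=lambda item: jaccard_distance(query, item[0]))
    return label


# Toy training set with two classes of small networks (hypothetical data).
train = [
    (make_graph([("a", "b"), ("b", "c")]), "A"),
    (make_graph([("a", "b"), ("a", "c")]), "A"),
    (make_graph([("c", "d"), ("d", "e")]), "B"),
    (make_graph([("c", "e"), ("d", "e")]), "B"),
]

query = make_graph([("a", "b"), ("b", "c"), ("a", "c")])
predicted = classify_1nn(query, train)
print(predicted)
```

The query shares two of its three edges with each class "A" network and none with the class "B" networks, so the nearest-neighbour rule assigns it to class "A". In this sketch the cost of each comparison grows with the number of edges, which gives a rough sense of why reducing network size can shorten execution times.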
Acknowledgements
This work was also carried out within the activities of M.R.G. and L.M. as members of the INdAM Research group GNCS. The authors would like to thank G. Trerotola for technical support.
This work was supported by the PON MISE project BigImaging and by the National Research University Higher School of Economics, RSF grant n. 14-41-00039.
Cite this article
Granata, I., Guarracino, M.R., Kalyagin, V.A. et al. Model simplification for supervised classification of metabolic networks. Ann Math Artif Intell 88, 91–104 (2020). https://doi.org/10.1007/s10472-019-09640-y
DOI: https://doi.org/10.1007/s10472-019-09640-y