Abstract
Identifying protein-protein interactions (PPIs) helps reveal protein function and is critical for understanding the mechanisms of the proteome. Recently, many computational methods, such as domain-based approaches, have been developed for predicting PPIs. Conventional domain-based methods usually first infer interacting domain pairs from sets of proteins already known to interact, and then use those pairs to predict PPIs. However, it is difficult to determine which domain pairs actually interact. It is therefore important to develop a computational model that does not need to know whether a given domain pair interacts. In this paper, we propose a novel method that uses multi-instance learning (MIL) to predict PPIs from domain information. First, the domain pairs of two proteins are composed. Then, the domain pairs are encoded with the amino acid composition feature encoding method. Finally, two multi-instance learning methods are used to train on the data. The experimental results demonstrate that the proposed method is effective.
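The first two steps described in the abstract can be illustrated with a minimal sketch. It treats each protein pair as a bag and each domain pair as one instance, which is the usual multi-instance framing. The function names `aa_composition` and `make_bag`, the toy sequences, and the choice to concatenate the two 20-dimensional composition vectors are assumptions made for illustration; the abstract does not specify how the paper combines the two domains' features.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Return the fraction of each standard amino acid in seq (20 values).

    Non-standard characters are ignored when computing the fractions.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def make_bag(domains_a, domains_b):
    """Build one bag for a protein pair: one instance per domain pair.

    Assumption: an instance is the concatenation of the two domains'
    composition vectors (40 values). The paper may encode pairs differently.
    """
    return [
        aa_composition(da) + aa_composition(db)
        for da, db in product(domains_a, domains_b)
    ]


# Hypothetical toy domain sequences, not taken from the paper.
protein_a = ["ACDA", "GGH"]
protein_b = ["KKLM"]
bag = make_bag(protein_a, protein_b)
print(len(bag), len(bag[0]))
```

Only the bag carries a label (interacting or not), so a multi-instance learner trained on such bags never needs to be told which individual domain pair is responsible for the interaction.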
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, YP., Zha, Y., Li, X., Zhao, S., Du, X. (2014). Using the Multi-instance Learning Method to Predict Protein-Protein Interactions with Domain Information. In: Miao, D., Pedrycz, W., Ślȩzak, D., Peters, G., Hu, Q., Wang, R. (eds) Rough Sets and Knowledge Technology. RSKT 2014. Lecture Notes in Computer Science, vol 8818. Springer, Cham. https://doi.org/10.1007/978-3-319-11740-9_24
DOI: https://doi.org/10.1007/978-3-319-11740-9_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11739-3
Online ISBN: 978-3-319-11740-9
eBook Packages: Computer Science, Computer Science (R0)