Novel Gene Ontology Based Distance Metric for Function Prediction via Clustering in Protein Interaction Networks

  • Conference paper
ICT Innovations 2014

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 311)

Abstract

The increased availability of large-scale protein-protein interaction (PPI) data has made a network-level understanding of the basic components and organization of the cell machinery possible. A significant number of proteins in protein interaction networks (PINs) remain uncharacterized, and predicting their function is still a major challenge. We propose a novel distance metric for PIN clustering. First, we augment the graph representing the PIN with weights derived from Gene Ontology (GO) semantic similarity, and we use this augmented representation in a random walk with restarts (RWR) process. The distance between a pair of proteins is then calculated from the steady-state distribution of the RWR. We validate our approach by function prediction via clustering in a purified and reliable Saccharomyces cerevisiae PIN. We show that the novel distance metric significantly improves function prediction performance compared to traditional approaches.
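The pipeline outlined in the abstract (GO-weighted graph, RWR, distance from the steady-state distribution) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so several choices here are assumptions made only for illustration: edge weights are taken as the element-wise product of the adjacency matrix and a GO similarity matrix, the restart probability is 0.5, and the distance between two proteins is the L1 distance between their steady-state vectors. The function names `rwr_steady_state` and `rwr_distance` and the toy similarity values are hypothetical.

```python
import numpy as np

def rwr_steady_state(W, restart=0.5, tol=1e-10, max_iter=1000):
    """Column i of the result is the steady-state distribution of a
    random walk on the weighted graph W that restarts at node i."""
    W = np.asarray(W, dtype=float)
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # guard against division by zero for isolated nodes
    T = W / col_sums                   # column-normalized transition matrix
    E = np.eye(W.shape[0])             # one restart vector per node
    P = E.copy()
    for _ in range(max_iter):
        P_next = (1 - restart) * (T @ P) + restart * E
        if np.abs(P_next - P).max() < tol:
            return P_next
        P = P_next
    return P

def rwr_distance(P):
    """Assumed distance: L1 distance between steady-state columns."""
    diff = P[:, :, None] - P[:, None, :]   # diff[k, i, j] = P[k, i] - P[k, j]
    return np.abs(diff).sum(axis=0)

# Toy PIN: a path 0 - 1 - 2 - 3 (no isolated nodes).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Hypothetical GO semantic similarity scores for each protein pair.
S = np.array([[1.0, 0.9, 0.1, 0.1],
              [0.9, 1.0, 0.2, 0.1],
              [0.1, 0.2, 1.0, 0.8],
              [0.1, 0.1, 0.8, 1.0]])

W = A * S                      # assumed augmentation: weight each edge by GO similarity
P = rwr_steady_state(W)
D = rwr_distance(P)
print(np.round(D, 3))
```

The resulting matrix `D` is symmetric with a zero diagonal, so it could be passed to any clustering method that accepts a precomputed distance matrix. In this toy graph, proteins 0 and 1 (directly linked, high similarity) end up closer than proteins 0 and 3 (opposite ends of the path).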

Author information

Corresponding author

Correspondence to Kire Trivodaliev.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Trivodaliev, K., Ivanoska, I., Kalajdziski, S., Kocarev, L. (2015). Novel Gene Ontology Based Distance Metric for Function Prediction via Clustering in Protein Interaction Networks. In: Bogdanova, A., Gjorgjevikj, D. (eds) ICT Innovations 2014. ICT Innovations 2014. Advances in Intelligent Systems and Computing, vol 311. Springer, Cham. https://doi.org/10.1007/978-3-319-09879-1_17

  • DOI: https://doi.org/10.1007/978-3-319-09879-1_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09878-4

  • Online ISBN: 978-3-319-09879-1

  • eBook Packages: Engineering, Engineering (R0)
