OnSim: A Similarity Measure for Determining Relatedness Between Ontology Terms

  • Conference paper
Data Integration in the Life Sciences (DILS 2015)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 9162))

Abstract

Accurately measuring relatedness between ontology terms is a building block for determining the similarity of ontology-based annotated entities, e.g., genes annotated with the Gene Ontology. However, existing measures that determine similarity between ontology terms rely mainly on taxonomic hierarchies of classes, and may not fully exploit the semantics encoded in the ontology, i.e., object properties and their axioms. This limitation may lead to ignoring the stated or inferred facts in which an ontology term participates, i.e., the term neighborhood. Thus, high similarity values can be erroneously assigned to terms that are taxonomically similar but whose neighborhoods differ. We present OnSim, a measure in which the semantics encoded in the ontology is treated as a first-class citizen and exploited to determine the relatedness of ontology terms. OnSim considers the neighborhoods of two terms, as well as the object properties present in the neighborhood facts and the justifications that support the entailment of these facts. We have extended an existing annotation-based similarity measure with OnSim and empirically studied the impact of producing accurate values of ontology term relatedness. Experiments were run on benchmarks published by the Collaborative Evaluation of Semantic Similarity Measures (CESSM) tool. The observed results suggest that OnSim increases the Pearson’s correlation coefficient of the annotation-based similarity measure with respect to gold standard similarity measures, and that it improves the measure's effectiveness with respect to state-of-the-art semantic similarity measures.
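The evaluation criterion named in the abstract, Pearson's correlation between a similarity measure and a gold standard, can be illustrated with a minimal sketch. The function name `pearson` and the score lists below are hypothetical and chosen only for illustration; they are not taken from the paper or from the CESSM benchmarks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up scores for four entity pairs (illustrative only, not real data):
measure_scores = [0.10, 0.40, 0.35, 0.80]  # hypothetical similarity measure
gold_scores = [0.20, 0.50, 0.30, 0.90]     # hypothetical gold standard
r = pearson(measure_scores, gold_scores)
```

A value of `r` closer to 1 indicates that the measure ranks and scales entity pairs more like the gold standard does, which is the sense in which the abstract reports an increase.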

Notes

  1. http://chem2bio2rdf.org/.

  2. http://bio2rdf.org/.

  3. http://openphacts.org.

  4. http://linkedlifedata.com.

  5. http://geneontology.org/.

  6. http://www.ebi.ac.uk/GOA.

  7. http://www.pseudomonas.com/go_annotation_project_2014.jsp.

  8. http://xldb.di.fc.ul.pt/tools/cessm/about.php.

  9. http://www.uniprot.org/.

  10. http://www.human-phenotype-ontology.org/.

  11. https://www.fi-star.eu.

  12. http://www.ebi.ac.uk/GOA.

  13. According to OWL2 semantics, the inferred fact is \(a_i\) subClassOf \(r_{ij}\) some \(a_j\).

  14. For this ontology we used the Product TN for Sim and \(Sim_D\).

  15. http://xldb.di.fc.ul.pt/tools/cessm/.

  16. http://xldb.di.fc.ul.pt/biotools/cessm2014/.

  17. http://blast.ncbi.nlm.nih.gov/.

  18. http://www.ebi.ac.uk/Tools/sss/fasta/.

Acknowledgments

This work was supported by the German Ministry of Economy and Energy within the TIGRESS project (Ref. KF2076928MS3) and the EU’s 7th Framework Programme FI.ICT-2011.1.8 (FI-STAR, Grant 604691).

Author information

Corresponding author

Correspondence to Ignacio Traverso-Ribón.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Traverso-Ribón, I., Vidal, ME., Palma, G. (2015). OnSim: A Similarity Measure for Determining Relatedness Between Ontology Terms. In: Ashish, N., Ambite, JL. (eds) Data Integration in the Life Sciences. DILS 2015. Lecture Notes in Computer Science, vol 9162. Springer, Cham. https://doi.org/10.1007/978-3-319-21843-4_6

  • DOI: https://doi.org/10.1007/978-3-319-21843-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21842-7

  • Online ISBN: 978-3-319-21843-4

  • eBook Packages: Computer Science (R0)
