Abstract
Accurately measuring relatedness between ontology terms is a building block for determining the similarity of ontology-based annotated entities, e.g., genes annotated with the Gene Ontology. However, existing measures that determine similarity between ontology terms rely mainly on taxonomic hierarchies of classes, and may not fully exploit the semantics encoded in the ontology, i.e., object properties and their axioms. This limitation may lead them to ignore the stated or inferred facts in which an ontology term participates, i.e., the term's neighborhood. Thus, high similarity values can be erroneously assigned to terms that are taxonomically similar but whose neighborhoods differ. We present OnSim, a measure in which the semantics encoded in the ontology is treated as a first-class citizen and exploited to determine the relatedness of ontology terms. OnSim considers the neighborhoods of two terms, as well as the object properties present in the neighborhood facts and the justifications that support the entailment of these facts. We have extended an existing annotation-based similarity measure with OnSim, and empirically studied the impact of producing accurate values of ontology term relatedness. Experiments were run on benchmarks published by the Collaborative Evaluation of Semantic Similarity Measures (CESSM) tool. The observed results suggest that OnSim increases the Pearson’s correlation coefficient of the annotation-based similarity measure with respect to gold standard similarity measures, and that its effectiveness improves on that of state-of-the-art semantic similarity measures.
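The evaluation criterion named in the abstract, Pearson's correlation between a similarity measure and a gold standard, can be illustrated with a minimal sketch. The function name `pearson` and the score lists below are assumptions made for illustration only; the values are toy numbers, not results from the paper or from CESSM.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy, made-up scores for four entity pairs (illustrative only).
gold_standard = [0.10, 0.40, 0.55, 0.90]    # hypothetical gold standard similarity
measure_scores = [0.20, 0.35, 0.60, 0.80]   # hypothetical measure under evaluation

r = pearson(measure_scores, gold_standard)
print(f"Pearson correlation: {r:.3f}")
```

A coefficient closer to 1 means the measure ranks and spaces the entity pairs more like the gold standard does, which is the sense in which the abstract reports an improvement.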
Notes
- 13. According to OWL2 semantics, the inferred fact is \(a_i\) subClassOf \(r_{ij}\) some \(a_j\).
- 14. For this ontology we used the Product TN for Sim and \(Sim_D\).
Acknowledgments
This work was supported by the German Ministry of Economy and Energy within the TIGRESS project (Ref. KF2076928MS3) and the EU’s 7th Framework Programme FI.ICT-2011.1.8 (FI-STAR, Grant 604691).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Traverso-Ribón, I., Vidal, ME., Palma, G. (2015). OnSim: A Similarity Measure for Determining Relatedness Between Ontology Terms. In: Ashish, N., Ambite, JL. (eds) Data Integration in the Life Sciences. DILS 2015. Lecture Notes in Computer Science, vol 9162. Springer, Cham. https://doi.org/10.1007/978-3-319-21843-4_6
DOI: https://doi.org/10.1007/978-3-319-21843-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21842-7
Online ISBN: 978-3-319-21843-4
eBook Packages: Computer Science, Computer Science (R0)