Discovery of Probabilistic Mappings between Taxonomies: Principles and Experiments

  • Chapter
Journal on Data Semantics XV

Part of the book series: Lecture Notes in Computer Science (JODS, volume 6720)

Abstract

In this paper, we investigate a principled approach to defining and discovering probabilistic mappings between two taxonomies. First, we compare two ways of modeling probabilistic mappings that are compatible with the logical constraints declared in each taxonomy. Then we describe a generate-and-test algorithm that minimizes the number of calls to the probability estimator when determining the mappings whose probability exceeds a given threshold. Finally, we provide an experimental analysis of this approach.
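To make the generate-and-test idea concrete, the following is a minimal sketch and not the chapter's actual algorithm. It assumes that a mapping "A is subsumed by B" is scored by the conditional probability P(B | A), estimated from instances shared by both taxonomies, and that each class extension is contained in the extensions of its superclasses. Under these assumptions the score cannot increase when B is replaced by one of its subclasses, so a candidate can be rejected without calling the estimator as soon as one of its more general candidates has failed. All names (`ancestors`, `make_estimator`, `discover`, the toy taxonomies) are hypothetical.

```python
def ancestors(taxonomy, c):
    """Strict superclasses of c; taxonomy maps each class to its direct parents."""
    seen, stack = set(), list(taxonomy.get(c, []))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(taxonomy.get(p, []))
    return seen


def make_estimator(ext1, ext2):
    """Assumed estimator: P(B | A) as a ratio of instance counts. Counts its calls."""
    calls = {"n": 0}

    def estimate(a, b):
        calls["n"] += 1
        ea = ext1[a]
        return len(ea & ext2[b]) / len(ea) if ea else 0.0

    return estimate, calls


def discover(tax1, tax2, estimate, threshold):
    """Return the mappings (a, b) whose estimated probability exceeds threshold."""
    # Visit right-hand classes from general to specific, so parents come first.
    order2 = sorted(tax2, key=lambda c: len(ancestors(tax2, c)))
    kept, rejected = {}, set()
    for a in tax1:
        for b in order2:
            # Prune: a failed parent of b implies (a, b) fails too (assumed monotony).
            if any((a, p) in rejected for p in tax2.get(b, [])):
                rejected.add((a, b))
                continue
            prob = estimate(a, b)
            if prob > threshold:
                kept[(a, b)] = prob
            else:
                rejected.add((a, b))
    return kept


# Toy data (hypothetical): classes map to direct parents and to instance sets.
tax1 = {"Music": [], "Rock": ["Music"]}
tax2 = {"Art": [], "Audio": ["Art"], "Guitar": ["Audio"]}
ext1 = {"Music": {1, 2, 3, 4}, "Rock": {1, 2}}
ext2 = {"Art": {1, 2, 3, 4, 5, 6}, "Audio": {1, 2, 3, 5}, "Guitar": {1, 5}}

estimate, calls = make_estimator(ext1, ext2)
kept = discover(tax1, tax2, estimate, threshold=0.8)
print(sorted(kept), calls["n"])
```

On this toy input, the pair (Music, Guitar) is rejected without an estimator call because (Music, Audio) has already failed, so five of the six candidate pairs are actually estimated. The pruning is only sound when the instance sets respect the subclass relations of the right-hand taxonomy.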

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Tournaire, R., Petit, JM., Rousset, MC., Termier, A. (2011). Discovery of Probabilistic Mappings between Taxonomies: Principles and Experiments. In: Spaccapietra, S. (eds) Journal on Data Semantics XV. Lecture Notes in Computer Science, vol 6720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22630-4_3

  • DOI: https://doi.org/10.1007/978-3-642-22630-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22629-8

  • Online ISBN: 978-3-642-22630-4

  • eBook Packages: Computer Science (R0)
