Matching Techniques for Data Integration and Exploration: From Databases to Big Data

  • Chapter
A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years

Part of the book series: Studies in Big Data ((SBD,volume 31))

Abstract

In the last two decades, data matching has been addressed for different purposes and in different application contexts, ranging from data integration, ontology evolution, and semantic data clouding to, more recently, exploratory data analysis over large/big datasets. This paper describes the evolution of research activity on matching techniques for data integration and exploration at the ISLab group of the Università degli Studi di Milano. We analyze the matching techniques according to the structure of the target data, the algorithmic pattern of the matching process, and the application focus, and we discuss the results of using our techniques for exploratory analysis of a real dataset composed of all the SEBD proceedings publications in the time frame 1993–2016.

Notes

  1. Data have been collected from the DBLP database (http://dblp.org), except for the year 2013, which is missing from DBLP. Data for 1993 have been collected from the Scopus DB (https://www.scopus.com).

Author information

Corresponding author

Correspondence to Stefano Montanelli.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Castano, S., Ferrara, A., Montanelli, S. (2018). Matching Techniques for Data Integration and Exploration: From Databases to Big Data. In: Flesca, S., Greco, S., Masciari, E., Saccà, D. (eds) A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. Studies in Big Data, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-61893-7_4

  • DOI: https://doi.org/10.1007/978-3-319-61893-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61892-0

  • Online ISBN: 978-3-319-61893-7

  • eBook Packages: Engineering, Engineering (R0)
