Incremental mining of the schema of semistructured data

Zhou, Aoying; Jin, Wen; Zhou, Shuigeng; Qian, Weining; Tian, Zenping

doi:10.1007/BF02948811

Published: May 2000

Volume 15, pages 241–248, (2000)

Journal of Computer Science and Technology

Abstract

Semistructured data lack any fixed, rigid schema, even though some implicit structure typically appears in the data. The huge number of on-line applications makes it imperative to mine the schema of semistructured data, both for users (e.g., to gather useful information and facilitate querying) and for systems (e.g., to optimize access). The critical problem is to discover the hidden structure in the semistructured data. Current methods for extracting the structure of Web data are either general and independent of the application background, or bound to a concrete environment such as HTML or XML. Both kinds are expensive and have difficulty keeping up with the frequent and complicated changes of Web data. In this paper, the problem of incrementally mining the schema of semistructured data after the raw data have been updated is discussed. An algorithm for incrementally mining the schema of semistructured data is provided, and experimental results are given which show that incremental mining of semistructured data is more efficient than non-incremental mining.
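The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of incremental schema maintenance, not the authors' method. It assumes that objects are nested dicts and lists, that the "schema" is the set of label paths whose support reaches a threshold, and that per-path counts are kept between updates; the names `label_paths`, `IncrementalSchemaMiner`, and `mine_from_scratch` are hypothetical.

```python
from collections import Counter


def label_paths(obj, prefix=()):
    """Return the set of distinct label paths found in a nested dict/list object."""
    paths = set()
    if isinstance(obj, dict):
        for label, child in obj.items():
            path = prefix + (label,)
            paths.add(path)
            paths |= label_paths(child, path)
    elif isinstance(obj, list):
        # List elements share the label path of the list itself.
        for child in obj:
            paths |= label_paths(child, prefix)
    return paths


class IncrementalSchemaMiner:
    """Keeps per-path support counts so an update only scans the new objects."""

    def __init__(self, min_support):
        self.min_support = min_support  # assumed: fraction of objects, in [0, 1]
        self.counts = Counter()
        self.num_objects = 0

    def add(self, objects):
        """Fold newly inserted objects into the stored counts."""
        for obj in objects:
            self.counts.update(label_paths(obj))  # each path counted once per object
            self.num_objects += 1

    def schema(self):
        """Return the label paths whose support meets the threshold."""
        threshold = self.min_support * self.num_objects
        return {path for path, count in self.counts.items() if count >= threshold}


def mine_from_scratch(objects, min_support):
    """Non-incremental baseline: rescan every object."""
    miner = IncrementalSchemaMiner(min_support)
    miner.add(objects)
    return miner.schema()


old = [{"name": "a", "addr": {"city": "x"}}, {"name": "b"}]
new = [{"name": "c", "addr": {"city": "y", "zip": "1"}}]

miner = IncrementalSchemaMiner(min_support=0.6)
miner.add(old)
schema_before = miner.schema()
miner.add(new)  # only the new objects are scanned here
schema_after = miner.schema()
```

In this sketch the update touches only the inserted objects, whereas `mine_from_scratch(old + new, 0.6)` rescans everything; both yield the same set of paths. Storing counts for every path is a simplification chosen for clarity, and deletions or modifications of existing objects are not handled.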

Author information

Authors and Affiliations

Department of Computer Science, Fudan University, 200433, Shanghai, P.R. China
Zhou Aoying, Jin Wen, Zhou Shuigeng, Qian Weining & Tian Zenping

Corresponding author

Correspondence to Zhou Aoying.

Additional information

This work was supported by the National Natural Science Foundation of China, the National 973 Fundamental Research Programme of China and the Doctoral Programme Foundation of Higher-Education.

ZHOU Aoying received his M.S. degree in computer science from Chengdu University of Science and Technology in 1988, and his Ph.D. degree in computer software from Fudan University in 1993. He is currently a Professor in the Department of Computer Science, Fudan University. His main research interests include object-oriented data models for multimedia information, CIMS data management, data mining and data warehousing, and novel database technologies and their application in digital libraries and electronic commerce.

JIN Wen received his M.S. degree in computer science from Southeast University in 1996. He is currently a Ph.D. candidate at the Computer School of Simon Fraser University, Canada. His current research interests include databases, data warehouses and data mining.

ZHOU Shuigeng is currently a Ph.D. candidate at the Computer Science Department, Fudan University. He received his B.E. and M.E. degrees, both in electronic engineering, from Huazhong University of Science and Technology and University of Electronic Science and Technology in 1988 and 1991, respectively. Before he started his Ph.D. program, he was a senior electronic engineer at the General Design Department of Shanghai Space-flight Academy. His current research areas cover databases, data warehousing, data mining and information retrieval.

QIAN Weining is a graduate student at the Computer Science Department, Fudan University. His current research interests are databases and data mining.

TIAN Zengping received his Ph.D. degree in computer science from Fudan University in 1997. He is currently an Associate Professor at the same university. Dr. Tian's research interests include databases, multimedia databases, workflow and Web mining.

About this article

Cite this article

Zhou, A., Jin, W., Zhou, S. et al. Incremental mining of the schema of semistructured data. J. Comput. Sci. & Technol. 15, 241–248 (2000). https://doi.org/10.1007/BF02948811

Received: 14 December 1998
Revised: 22 October 1999
Issue Date: May 2000
DOI: https://doi.org/10.1007/BF02948811
