In Praise of Laziness: A Lazy Strategy for Web Information Extraction

Ozcan, Rifat; Altingovde, Ismail Sengor; Ulusoy, Özgür

doi:10.1007/978-3-642-28997-2_65

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7224)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

A large number of Web information extraction algorithms are based on machine learning techniques. For such extraction algorithms, we propose employing a lazy learning strategy to build a specialized model for each test instance to improve the extraction accuracy and avoid the disadvantages of constructing a single general model.
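The lazy strategy described above can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the base learner, or the feature representation, so everything below is an assumption made for illustration: text fragments are represented as token lists, the neighbourhood of a test instance is chosen by Jaccard similarity, and the per-instance model is a small multinomial naive Bayes classifier. The names `lazy_predict`, `TRAINING`, and the parameter `k` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def jaccard(a, b):
    """Jaccard similarity between two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def train_naive_bayes(examples):
    """Fit a multinomial naive Bayes model on (tokens, label) pairs."""
    label_counts = Counter(label for _, label in examples)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, token_counts, vocab


def predict(model, tokens):
    """Return the most probable label (Laplace smoothing, log space)."""
    label_counts, token_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # tokens unseen in the neighbourhood are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def lazy_predict(training, test_tokens, k=3):
    """Build a model only from the k training instances most similar to
    the test instance, then classify that instance with it.
    (Assumed selection rule: Jaccard similarity on tokens.)"""
    neighbours = sorted(
        training,
        key=lambda ex: jaccard(ex[0], test_tokens),
        reverse=True,
    )[:k]
    return predict(train_naive_bayes(neighbours), test_tokens)


# Hypothetical toy data: tokenised page fragments with field labels.
TRAINING = [
    (["prof", "john", "smith"], "name"),
    (["dr", "jane", "doe"], "name"),
    (["email", "jsmith", "cs", "edu"], "email"),
    (["email", "jdoe", "cs", "edu"], "email"),
    (["office", "room", "210"], "office"),
    (["office", "room", "315"], "office"),
]

print(lazy_predict(TRAINING, ["office", "room", "101"], k=2))
print(lazy_predict(TRAINING, ["prof", "jane", "roe"], k=3))
```

In this sketch no model exists until a test instance arrives, which is the defining trait of lazy learning: training cost moves to prediction time, in exchange for a model fitted only to the region of the training data that resembles the instance at hand. An eager alternative would call `train_naive_bayes(TRAINING)` once and reuse that single general model for every instance.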

References

Aha, D.W. (ed.): Lazy learning. Kluwer Academic Publishers, Norwell (1997)
Chang, C.-H., Kayed, M., Girgis, M.R., Shaalan, K.F.: A survey of web information extraction systems. IEEE Trans. Knowl. Data Eng. 18(10), 1411–1428 (2006)
Freitag, D.: Information extraction from HTML: Application of a general machine learning approach. In: Proceedings of AAAI/IAAI, pp. 517–523 (1998)
Veloso, A., Meira Jr., W., Zaki, M.J.: Lazy associative classification. In: Proceedings of IEEE International Conference on Data Mining, pp. 645–654 (2006)
Wachsmuth, H., Stein, B., Engels, G.: Constructing efficient information extraction pipelines. In: Proceedings of CIKM 2011, pp. 2237–2240 (2011)
WebKB: CMU, world wide knowledge base (WebKB) project (2011) http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data

Author information

Authors and Affiliations

Computer Engineering Department, Bilkent University, Ankara, Turkey
Rifat Ozcan & Özgür Ulusoy
L3S Research Center, Hannover, Germany
Ismail Sengor Altingovde

Editor information

Editors and Affiliations

Yahoo! Research, Diagonal 177, 08018, Barcelona, Spain
Ricardo Baeza-Yates & B. Barla Cambazoglu
Centrum Wiskunde & Informatica, Science Park 123, Amsterdam, The Netherlands
Arjen P. de Vries
Websays, Nàpols 294 7-4, 08025, Barcelona, Spain
Hugo Zaragoza
Yahoo! Research, Diagonal 177, 08018, Barcelona, Spain
Vanessa Murdock
Yahoo! Labs, Tower 3, Matam Park, 31905, Haifa, Israel
Ronny Lempel
ISTI-CNR, via G. Moruzzi, 1, 56124, Pisa, Italy
Fabrizio Silvestri

Cite this paper

Ozcan, R., Altingovde, I.S., Ulusoy, Ö. (2012). In Praise of Laziness: A Lazy Strategy for Web Information Extraction. In: Baeza-Yates, R., et al. Advances in Information Retrieval. ECIR 2012. Lecture Notes in Computer Science, vol 7224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28997-2_65

DOI: https://doi.org/10.1007/978-3-642-28997-2_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28996-5
Online ISBN: 978-3-642-28997-2
eBook Packages: Computer Science, Computer Science (R0)
