Abstract
In this work, we introduce a new method for information extraction from the semantic web. The fundamental idea is to model the semantic information contained in the microformats of a set of web pages by using a data structure called a semantic network. We then introduce a novel technique for information extraction from semantic networks. In particular, the technique allows us to extract a portion (a slice) of the semantic network with respect to some criterion of interest. The slice obtained represents relevant information retrieved from the semantic network and thus from the semantic web. Our approach can be used to design novel tools for information retrieval and presentation, and for filtering information that is distributed across the semantic web.
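The idea of slicing a semantic network can be illustrated with a minimal sketch. The abstract does not define the slicing algorithm, so the code below assumes one simple interpretation: the network is a set of labelled edges, the criterion is a starting node, and the slice is every edge reachable from that node. The names `EDGES`, `slice_network`, and `allowed_relations`, as well as the sample data, are hypothetical and are not taken from the paper.

```python
from collections import deque

# Hypothetical semantic network built from microformat data.
# Each edge is (source node, relation label, target node).
EDGES = [
    ("page1", "contains", "hcard:Alice"),
    ("hcard:Alice", "org", "ACME"),
    ("hcard:Alice", "url", "page2"),
    ("page2", "contains", "hcard:Bob"),
    ("page3", "contains", "hcard:Carol"),
]


def slice_network(edges, criterion, allowed_relations=None):
    """Return the edges reachable from `criterion` (assumed slice definition).

    If `allowed_relations` is given, only edges whose relation label is in
    that set are followed and kept.
    """
    # Index outgoing edges by source node.
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))

    visited = {criterion}
    queue = deque([criterion])
    kept = []
    # Breadth-first traversal starting at the criterion node.
    while queue:
        node = queue.popleft()
        for relation, target in adjacency.get(node, []):
            if allowed_relations is not None and relation not in allowed_relations:
                continue
            kept.append((node, relation, target))
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return kept


if __name__ == "__main__":
    for edge in slice_network(EDGES, "hcard:Alice"):
        print(edge)
```

Under this assumption, slicing from `hcard:Alice` keeps the edges to `ACME`, `page2`, and `hcard:Bob`, and discards the unrelated `page1` and `page3` edges. Restricting `allowed_relations` narrows the slice further, which is one way a "criterion of interest" could be expressed.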
This work has been partially supported by the Spanish Ministerio de Ciencia e Innovación under grant TIN2008-06622-C03-02, by the Generalitat Valenciana under grant ACOMP/2009/017, by the Universidad Politécnica de Valencia (Programs PAID-05-08 and PAID-06-08) and by the Mexican Dirección General de Educación Superior Tecnológica (Programs CICT 2008 and CICT 2009).
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ramos, J.G., Silva, J., Arroyo, G., Solorio, J.C. (2010). A Technique for Information Retrieval from Microformatted Websites. In: Pnueli, A., Virbitskaite, I., Voronkov, A. (eds) Perspectives of Systems Informatics. PSI 2009. Lecture Notes in Computer Science, vol 5947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11486-1_29
DOI: https://doi.org/10.1007/978-3-642-11486-1_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11485-4
Online ISBN: 978-3-642-11486-1
eBook Packages: Computer Science, Computer Science (R0)