An Ensemble Architecture for Linked Data Lexicalization

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10761)

Abstract

Linked Data has revamped knowledge representation by introducing the triple data structure, which encodes knowledge together with its semantics and context by interlinking with external resources across documents. Although Linked Data is an attractive and effective mechanism for representing knowledge that humans create and consume in natural language, it remains separated from natural language itself. Hence, there has recently been increased interest in transforming Linked Data into natural language so that applications interacting with natural language can harness its benefits. This paper presents a framework that lexicalizes Linked Data triples into natural language using an ensemble architecture. The proposed architecture comprises four pattern-based modules that lexicalize triples by analysing triple features. The four pattern mining modules are based on occupational metonyms, Context Free Grammar (CFG), relation extraction using Open Information Extraction (OpenIE), and triple properties. The framework was evaluated using a two-fold process consisting of linguistic accuracy analysis and human evaluation of a test sample. The linguistic accuracy evaluation showed that the framework can produce 283 accurate lexicalization patterns for a set of 25 ontology classes, a 70.75% accuracy, which is an approximately 91% increase over the existing state-of-the-art model.
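
To make the ensemble idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the four modules are tried in the order the abstract lists them and that the first module to return a pattern wins; the abstract does not state how the modules are actually combined. The names Triple, metonym_module, cfg_module, openie_module, property_module and lexicalize, the ?s/?o placeholder syntax, and the small metonym table are all hypothetical. The CFG and OpenIE modules are stubs because they would need grammars and text corpora.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


# A module inspects a triple and returns a pattern containing the
# placeholders ?s and ?o, or None when it cannot handle the triple.
Module = Callable[[Triple], Optional[str]]


def metonym_module(t: Triple) -> Optional[str]:
    # Hypothetical table: an occupational noun used as a predicate is
    # mapped to the verb describing what that occupation does.
    verbs = {"writer": "wrote", "director": "directed"}
    verb = verbs.get(t.predicate)
    return f"?o {verb} ?s" if verb else None


def cfg_module(t: Triple) -> Optional[str]:
    # Stub: a real module would parse the predicate with a grammar.
    return None


def openie_module(t: Triple) -> Optional[str]:
    # Stub: a real module would mine relation patterns from text.
    return None


def property_module(t: Triple) -> Optional[str]:
    # Assumed fallback: split the camelCase property name into words.
    words = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", t.predicate).lower()
    return f"?s's {words} is ?o"


DEFAULT_MODULES: Sequence[Module] = (
    metonym_module,
    cfg_module,
    openie_module,
    property_module,
)


def lexicalize(t: Triple, modules: Sequence[Module] = DEFAULT_MODULES) -> Optional[str]:
    # Try each module in order and realise the first pattern found.
    for module in modules:
        pattern = module(t)
        if pattern is not None:
            return pattern.replace("?s", t.subject).replace("?o", t.obj)
    return None


if __name__ == "__main__":
    print(lexicalize(Triple("Hamlet", "writer", "William Shakespeare")))
    print(lexicalize(Triple("Albert Einstein", "birthPlace", "Ulm")))
```

In this sketch the property-based module acts as a catch-all, so every triple receives some lexicalization even when the more specific modules decline. Whether the published framework orders or ranks its modules this way is an assumption.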


Notes

  1. http://www.rivinduperera.com/information/realtextlex.html


Author information

Corresponding author

Correspondence to Rivindu Perera.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Perera, R., Nand, P. (2018). An Ensemble Architecture for Linked Data Lexicalization. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol 10761. Springer, Cham. https://doi.org/10.1007/978-3-319-77113-7_34

  • DOI: https://doi.org/10.1007/978-3-319-77113-7_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77112-0

  • Online ISBN: 978-3-319-77113-7

  • eBook Packages: Computer Science, Computer Science (R0)
