NotaryPedia: A Knowledge Graph of Historical Notarial Manuscripts

  • Conference paper
On the Move to Meaningful Internet Systems: OTM 2019 Conferences (OTM 2019)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11877)

Abstract

The Notarial Archives in Valletta, the capital city of Malta, house a rich and valuable collection of around twenty thousand notarial manuscripts dating back to the 15th century. The Archives want to make the contents of this collection easily accessible and searchable to researchers and the general public. Knowledge Graphs have been successfully used to represent similar historical content. Nevertheless, building a Knowledge Graph for the Archives is challenging because these documents are written in medieval Latin, and information extraction tools that support this language are currently lacking. The problem is compounded by a lack of medieval Latin corpora for training and evaluating machine learning algorithms, as well as the lack of an ontological representation for the contents of notarial manuscripts. In this paper, we present NotaryPedia, a Knowledge Graph for the Notarial Archives. We extend our previous work on entity and keyphrase extraction with relation extraction to populate the Knowledge Graph using an ontological vocabulary for notarial deeds. Furthermore, we perform Knowledge Graph completion using link prediction and inference. Our work was evaluated using different translational distance and semantic matching models to predict relations amongst literals by promoting them to entities and to infer new knowledge from existing entities. TransE achieved a relation prediction accuracy of 49%.
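As an illustration of the translational distance idea behind TransE [3], the following minimal sketch scores a triple (head, relation, tail) by how close head + relation lies to tail, and ranks candidate relations between two entities. The entity names, relation names, and two-dimensional vectors are hypothetical toy values chosen for readability; they are not taken from NotaryPedia, and the paper's own experiments rely on trained embeddings rather than hand-set ones.

```python
import math

def transe_score(head, relation, tail):
    """Return the negative L2 distance ||head + relation - tail||.

    A score closer to zero means the triple is more plausible
    under the TransE assumption head + relation ≈ tail.
    """
    squared = sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    return -math.sqrt(squared)

def rank_relations(head, tail, relations):
    """Sort relation names from most to least plausible for (head, ?, tail)."""
    return sorted(
        relations,
        key=lambda name: transe_score(head, relations[name], tail),
        reverse=True,
    )

# Hypothetical toy embeddings (assumptions for illustration only).
entities = {"notary": [0.0, 0.0], "deed": [1.0, 0.0]}
relations = {"drewUp": [1.0, 0.0], "witnessed": [0.0, 1.0]}

ranking = rank_relations(entities["notary"], entities["deed"], relations)
```

In this toy setting the vector for `drewUp` translates `notary` exactly onto `deed`, so it ranks first. Relation prediction accuracy can then be read as the fraction of test triples for which the true relation is ranked first, although the exact evaluation protocol used in the paper is not stated in the abstract.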

This work is partially funded by project E-18LO28-01 as part of the collaboration between the Notarial Archives in Valletta and the University of Malta.

Notes

  1. http://www.w3.org/RDF/.

  2. http://www.dublincore.org/specifications/dublin-core/dcmi-terms/.

  3. http://data.archiveshub.ac.uk/def/.

  4. http://iiif.io.

  5. http://www.ancientwisdoms.ac.uk/.

  6. http://data.archiveshub.ac.uk/.

  7. https://www.jisc.ac.uk/archives-hub.

  8. The prototype can be accessed from: https://notarypedia.opendatamalta.com/.

  9. https://jena.apache.org/.

  10. https://jena.apache.org/documentation/fuseki2/.

  11. The current Knowledge Graph is found here: https://notarypedia.opendatamalta.com/graph/notarypedia.ttl.

  12. The smallest unit of a description in an archival collection, for example a report [1].

  13. An organized unit of documents grouped together either for current use by the creator or in the process of archival arrangement. In our case this is a register [1].

  14. http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/.

  15. https://github.com/thunlp/OpenKE/tree/master/benchmarks/FB15K.

  16. https://github.com/thunlp/OpenKE/tree/master/benchmarks/WN18.

  17. http://docs.cltk.org/en/latest/index.html.

  18. https://github.com/luozhouyang/python-string-similarity#jaro-winkler.

  19. https://github.com/thunlp/OpenKE.

References

  1. ISAD(G): General international standard archival description 2000, 2 edn. (2000)

  2. Ahonen, E., Hyvönen, E.: Publishing Historical Texts on the Semantic Web - A Case Study, pp. 167–173. IEEE (2009)

  3. Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating Embeddings for Modeling Multi-relational Data, pp. 2787–2795 (2013)

  4. Debruyne, C., Beyan, O.D., Grant, R., Collins, S., Decker, S., Harrower, N.: A semantic architecture for preserving and interpreting the information contained in Irish historical vital records. Int. J. Digit. Libr. 17(3), 159–174 (2016)

  5. Efremova, J., Montes García, A., Calders, T.: Classification of historical notary acts with noisy labels. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 49–54. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16354-3_6

  6. Efremova, J., García, A.M., Iriondo, A.B., Calders, T.: Who are my ancestors? Retrieving family relationships from historical texts. In: Braslavski, P., et al. (eds.) RuSSIR 2015. CCIS, vol. 573, pp. 121–129. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-41718-9_6

  7. Efremova, J., Montes Garcia, A., Calders, T., Zhang, J.: Towards population reconstruction: extraction of family relationships from historical documents (2015)

  8. Efremova, J., et al.: Multi-source entity resolution for genealogical data. In: Bloothooft, G., Christen, P., Mandemakers, K., Schraagen, M. (eds.) Population Reconstruction, pp. 129–154. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-19884-2_7

  9. Ehrlinger, L., Wöß, W.: Towards a Definition of Knowledge Graphs (2016)

  10. Ellul, C., Abela, C., Azzopardi, J.: Extracting Information from Medieval Notarial Deeds, pp. 25–28. EKAW (2018)

  11. Erdmann, A., et al.: Challenges and solutions for Latin named entity recognition. In: The COLING 2016 Organizing Committee, pp. 85–93 (2016)

  12. Feeney, K.C., O’Sullivan, D., Tai, W., Brennan, R.: Improving curated web-data quality with structured harvesting and assessment. Int. J. Semant. Web Inf. Syst. 10(2), 35–62 (2014)

  13. Fiorini, S.: Documentary Sources of Maltese History Part I Notarial Documents No 1 Notary Giacomo Zabbara. University of Malta, 1 edn. (1996)

  14. Gonzalez, E.: Unsupervised Relation Extraction by Massive Clustering (2009)

  15. Han, X., et al.: OpenKE: an open toolkit for knowledge embedding. In: Proceedings of EMNLP (2018)

  16. Monti, M., et al.: Construction of enterprise knowledge graphs. In: Pan, J.Z., Vetere, G., Gomez-Perez, J.M., Wu, H. (eds.) Exploiting Linked Data and Knowledge Graphs in Large Organisations. Springer, Cham (2017). chap 8

  17. Paulheim, H.: Knowledge graph refinement: a survey of approaches and evaluation methods. Semant. Web 8(3), 489–508 (2016)

  18. Pawar, S., Palshikar, G., Bhattacharyya, P.: Relation Extraction: A Survey (2017)

  19. Ruddock, B.: Linked data and the LOCAH project. Bus. Inf. Rev. 28(2), 105–111 (2011)

  20. Siddiqui, T., Aalam, P.: Short text clustering; challenges & solutions: a literature review. Int. J. Math. Comput. Res. 3(6), 1025–1031 (2015)

  21. Srinivas, V.: Link Prediction in Social Networks, 1st edn. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-28922-9

  22. Villazon-Terrazas, B., Garcia-Santa, N., Ren, Y., Srinivas, K., Rodriguez-Muro, M., Alexopoulos, P., Pan, J.Z.: Construction of enterprise knowledge graphs (I). Exploiting Linked Data and Knowledge Graphs in Large Organisations, pp. 87–116. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-45654-6_4

  23. Wang, Q., Mao, Z., Wang, B., Guo, L.: Knowledge graph embedding: a survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 29(12), 2724–2743 (2017)

  24. Winkler, W.: String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. In: Proceedings of the Section on Survey Research Methods (1990)

  25. Yang, Y., Lichtenwalter, R.N., Chawla, N.V.: Evaluating link prediction methods. Knowl. Inf. Syst. 45(3), 751–782 (2014)

Author information

Corresponding authors

Correspondence to Charlene Ellul, Joel Azzopardi or Charlie Abela.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ellul, C., Azzopardi, J., Abela, C. (2019). NotaryPedia: A Knowledge Graph of Historical Notarial Manuscripts. In: Panetto, H., Debruyne, C., Hepp, M., Lewis, D., Ardagna, C., Meersman, R. (eds) On the Move to Meaningful Internet Systems: OTM 2019 Conferences. OTM 2019. Lecture Notes in Computer Science, vol 11877. Springer, Cham. https://doi.org/10.1007/978-3-030-33246-4_39

  • DOI: https://doi.org/10.1007/978-3-030-33246-4_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33245-7

  • Online ISBN: 978-3-030-33246-4

  • eBook Packages: Computer Science, Computer Science (R0)
