Open Source Handwritten Text Recognition on Medieval Manuscripts Using Mixed Models and Document-Specific Finetuning

  • Conference paper
Document Analysis Systems (DAS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13237)

Abstract

This paper deals with the task of practical and open source Handwritten Text Recognition (HTR) on German medieval manuscripts. We report on our efforts to construct mixed recognition models which can be applied out-of-the-box without any further document-specific training but also serve as a starting point for finetuning by training a new model on a few pages of transcribed text (ground truth). To train the mixed models we collected a corpus of 35 manuscripts and ca. 12.5k text lines for two widely used handwriting styles, Gothic and Bastarda cursives. Evaluating the mixed models out-of-the-box on four unseen manuscripts resulted in an average Character Error Rate (CER) of 6.22%. After training on 2, 4 and eventually 32 pages the CER dropped to 3.27%, 2.58%, and 1.65%, respectively. While the in-domain recognition and training of models (Bastarda model to Bastarda material, Gothic to Gothic) unsurprisingly yielded the best results, finetuning out-of-domain models to unseen scripts was still shown to be superior to training from scratch. Our new mixed models have been made openly available to the community.
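The CER figures quoted above are conventionally computed as the Levenshtein (edit) distance between the recognized text and the ground truth, divided by the number of ground truth characters. The following Python sketch illustrates that common definition. The abstract does not state the exact evaluation procedure (for example, any text normalization), so the function names `levenshtein` and `cer` and the corpus-level aggregation are assumptions made for illustration, not the authors' evaluation code.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hyp` into `ref` (classic dynamic programming)."""
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # character of ref missing in hyp
                curr[j - 1] + 1,     # extra character in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: summed edit distance over summed reference length.

    Assumption: errors are pooled over all lines rather than averaged per line.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_errors = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_errors / total_chars
```

Under this definition, one substituted character in a five-character reference line gives a CER of 0.2, i.e. 20%.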

Notes

  1. https://github.com/Calamari-OCR/calamari_models_experimental

  2. https://readcoop.eu/transkribus/public-models

  3. https://github.com/jpuigcerver/PyLaia

  4. https://github.com/ocropus/ocropy

  5. https://zenodo.org/record/5167263

  6. https://zenodo.org/record/4746342

  7. https://en.wikipedia.org/wiki/Diplomatics#Diplomatic_editions_and_transcription

  8. https://www.parzival.unibe.ch/englishpresentation.html

  9. https://lab.sbb.berlin/events/faithful-transcriptions-2/?lang=en

  10. https://www.adfontes.uzh.ch/tutorium/schriften-lesen/schriftgeschichte/bastarda-und-gotische-kursive

  11. https://github.com/ocr4all

  12. https://digi.ub.uni-heidelberg.de/wgd

  13. https://github.com/Calamari-OCR/calamari

  14. In Calamari short notation: conv=40:3×3, pool=2×2, conv=60:3×3, pool=2×2, lstm=200, dropout=0.5.

  15. In Calamari short notation: conv=40:3×3, pool=2×2, conv=60:3×3, pool=2×2, conv=120:3×3, pool=2×2, lstm=200, lstm=200, lstm=200, dropout=0.5.

  16. https://github.com/OCR-D/ocrd_olena

  17. https://github.com/qurator-spk/sbb_binarization
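The two layer strings in notes 14 and 15 can be read mechanically. The Python sketch below parses such a string into a list of layer descriptions and derives the total spatial downsampling from the pooling layers. It is a hypothetical helper written for this page, not part of Calamari's API; the names `parse_short_notation` and `downsampling` are invented here, and the downsampling figure assumes each pooling layer uses a stride equal to its window size.

```python
from typing import Dict, List, Tuple, Union

Layer = Dict[str, Union[str, int, float]]


def parse_short_notation(spec: str) -> List[Layer]:
    """Parse e.g. 'conv=40:3x3, pool=2x2, lstm=200, dropout=0.5' into layers."""
    layers: List[Layer] = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        kind, _, value = token.partition("=")
        kind = kind.strip()
        value = value.strip().replace("×", "x")  # accept both 'x' and '×'
        if kind == "conv":
            filters, _, kernel = value.partition(":")
            kh, kw = (int(v) for v in kernel.split("x"))
            layers.append({"type": "conv", "filters": int(filters),
                           "kernel_h": kh, "kernel_w": kw})
        elif kind == "pool":
            ph, pw = (int(v) for v in value.split("x"))
            layers.append({"type": "pool", "pool_h": ph, "pool_w": pw})
        elif kind == "lstm":
            layers.append({"type": "lstm", "units": int(value)})
        elif kind == "dropout":
            layers.append({"type": "dropout", "rate": float(value)})
        else:
            raise ValueError(f"unknown layer type: {kind!r}")
    return layers


def downsampling(layers: List[Layer]) -> Tuple[int, int]:
    """Total (height, width) reduction factor, assuming stride == pool size."""
    h = w = 1
    for layer in layers:
        if layer["type"] == "pool":
            h *= int(layer["pool_h"])
            w *= int(layer["pool_w"])
    return h, w


NOTE_14 = "conv=40:3x3, pool=2x2, conv=60:3x3, pool=2x2, lstm=200, dropout=0.5"
NOTE_15 = ("conv=40:3x3, pool=2x2, conv=60:3x3, pool=2x2, conv=120:3x3, "
           "pool=2x2,lstm=200, lstm=200, lstm=200, dropout=0.5")
```

Under the stated stride assumption, the string in note 14 (two pooling layers) reduces a line image by a factor of 4 in each dimension and the string in note 15 (three pooling layers) by a factor of 8.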

Acknowledgement

The authors would like to thank our student research assistants Lisa Gugel, Kiara Hart, Ursula Heß, Annika Müller, and Anne Schmid for their extensive segmentation and transcription work as well as Maximilian Nöth and Maximilian Wehner for supporting the data preparation.

This work was partially funded by the German Research Foundation (DFG) under project no. 460665940.

Corresponding author

Correspondence to Christian Reul.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Reul, C., Tomasek, S., Langhanki, F., Springmann, U. (2022). Open Source Handwritten Text Recognition on Medieval Manuscripts Using Mixed Models and Document-Specific Finetuning. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham. https://doi.org/10.1007/978-3-031-06555-2_28

  • DOI: https://doi.org/10.1007/978-3-031-06555-2_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06554-5

  • Online ISBN: 978-3-031-06555-2

  • eBook Packages: Computer Science, Computer Science (R0)
