A Database for Handwritten Yoruba Characters

  • Conference paper
Data Science and Analytics (REDSET 2017)

Abstract

This paper describes a novel, publicly available dataset for research on offline Yoruba handwritten character recognition. It contains 6954 characters in several categories, collected from 183 writers, which makes it the largest available dataset for Yoruba handwriting research. It can be used to design and evaluate handwritten character recognition systems for the Yoruba language, and it can provide valuable insights through writer identification. The dataset is partitioned into training and test sets comprising 70% and 30% of the data, respectively.
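As an illustration of the 70%/30% partition mentioned in the abstract, the following is a minimal sketch of one way such a split could be produced. It is not the authors' procedure: the abstract does not say whether the split was random, stratified by character category, or grouped by writer. The function name `split_dataset`, the fixed seed, and the use of integer IDs in place of character images are all assumptions made for the example.

```python
import random


def split_dataset(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and return (train, test) lists.

    Hypothetical helper: a plain random split, assumed for illustration.
    """
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in for the 6954 character images: integer IDs only (assumption).
sample_ids = range(6954)
train, test = split_dataset(sample_ids)
print(len(train), len(test))
```

For writer identification experiments, a split grouped by writer (so that no writer appears in both sets) may be more appropriate than the per-sample split sketched here.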

Acknowledgements

This database was generated with help from Learnd Technologies, which assisted from the design phase through the scanning phase. The authors thank all members of the Learnd team for their collaboration in the creation of the dataset. The financial support of the Covenant University Centre for Research Innovation and Discovery (CUCRID) is also acknowledged.

Corresponding author

Correspondence to Sanjay Misra.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

Cite this paper

Ojumah, S., Misra, S., Adewumi, A. (2018). A Database for Handwritten Yoruba Characters. In: Panda, B., Sharma, S., Roy, N. (eds) Data Science and Analytics. REDSET 2017. Communications in Computer and Information Science, vol 799. Springer, Singapore. https://doi.org/10.1007/978-981-10-8527-7_10

  • DOI: https://doi.org/10.1007/978-981-10-8527-7_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8526-0

  • Online ISBN: 978-981-10-8527-7

  • eBook Packages: Computer Science, Computer Science (R0)
