Experiments with Automatic and Semi-automatic Detection of Sparse Word Forms in Old Braj

  • Conference paper
Human Language Technology. Challenges for Computer Science and Linguistics (LTC 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12598)

Abstract

This paper presents work on automatic converb detection in Old Braj poetry from the 15th–17th centuries. It is part of research on non-finite verbal forms in early New Indo-Aryan (NIA) language corpora comprising data from Old Rajasthani, Awadhi, Braj, Dakkhini and Pahari [8]. The goal of the detection mechanism is to classify a plaintext word as a converb or a non-converb. Such a mechanism facilitates further description and analysis of converbs, which is of great importance for research on the historical syntax of NIA. To develop the automatic detector, a selection of state-of-the-art statistical classification mechanisms was used.
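As a rough illustration of what classifying a plaintext word as a converb or non-converb can look like, the sketch below trains a tiny perceptron on word-final character suffixes. It is not the method used in the paper, which is not specified in this abstract. The feature design, the function names, and the toy word list are all assumptions made for illustration; the toy words are not taken from the authors' corpus.

```python
from collections import defaultdict

def suffix_features(word, max_len=3):
    """Return the word's final 1..max_len characters as binary features."""
    return [f"suf={word[-n:]}" for n in range(1, max_len + 1) if len(word) >= n]

def train_perceptron(examples, epochs=10):
    """Train a binary perceptron; labels are +1 (converb) or -1 (non-converb)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for word, label in examples:
            feats = suffix_features(word)
            score = sum(weights[f] for f in feats)
            predicted = 1 if score > 0 else -1
            if predicted != label:
                # Standard perceptron update: move weights toward the true label.
                for f in feats:
                    weights[f] += label
    return weights

def is_converb(weights, word):
    """Classify a word using the learned suffix weights."""
    return sum(weights.get(f, 0.0) for f in suffix_features(word)) > 0

# Hypothetical toy training set (illustrative only, not corpus data).
TOY_DATA = [
    ("kari", 1), ("dekhi", 1), ("suni", 1),
    ("karata", -1), ("dekhata", -1), ("rama", -1),
]

weights = train_perceptron(TOY_DATA)
print(is_converb(weights, "cali"), is_converb(weights, "calata"))
```

A suffix-only model like this would over-generate on real text, since other word classes can share the same endings; a practical detector would presumably need contextual features and a much larger annotated sample.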

Notes

  1. http://wp.unil.ch/eniat/2015/05/hymns-by-hita-harivam.sa/.

  2. We consistently use two of the three Dixonian primitive terms [2]: A, the subject of a transitive verb, and O, the object of a transitive verb.

References

  1. Bickel, B.: Capturing particulars and universals in clause linkage: a multivariate analysis. In: Bril, I. (ed.) Clause Linking and Clause Hierarchy: Syntax and Pragmatics, pp. 51–102. No. 121 in Studies in Language Companion Series, John Benjamins, Amsterdam (2010). http://dx.doi.org/10.5167/uzh-48989

  2. Dixon, R.M.: Ergativity. Cambridge Studies in Linguistics, Cambridge University Press (1994). https://books.google.pl/books?id=fKfSAu6v5LYC

  3. Dvivedī, L.: Viṣṇudās kavkiṛt Rāmāyana kathā. Sāhitya bhavan limited (1972)

  4. Dwarikesh, D.P.S.: Historical syntax of the conjunctive participle phrase in New Indo-Aryan dialects of Madhyadesa (Midland) of Northern India. University of Chicago Ph.D. dissertation (1971)

  5. Emeneau, M.: The Sanskrit gerund: a synchronic, diachronic and typological analysis. Language 32, 3–16 (1956)

  6. Haspelmath, M.: The converb as a cross-linguistically valid category. In: Haspelmath, M., König, E. (eds.) Converbs in Cross-Linguistic Perspective: Structure and Meaning of Adverbial Verb Forms - Adverbial Participles, Gerunds, pp. 1–55. No. 13 in Empirical approaches to language typology, Mouton de Gruyter, Berlin (1995)

  7. Jaworski, R., Jassem, K., Stroński, K.: Manual and Automatic Tagging of Indo-Aryan Languages. Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 550–554 (2015)

  8. Jaworski, R., Stroński, K.: New perspectives in annotating early New Indo-Aryan texts. In: Proceedings of the 32nd South Asian Languages Analysis Round Table SALA-32, Lisbon, Portugal, pp. 66–68 (2016)

  9. Jaworski, R., Stroński, K.: Recognition and multi-layered analysis of converbs in early NIA. In: Proceedings of the 33rd South Asian Languages Analysis Round Table SALA-33, Poznań, Poland, pp. 55–56 (2017)

  10. Langford, J., Li, L., Zhang, T.: Sparse online learning via truncated gradient. In: Advances in Neural Information Processing Systems, pp. 905–912 (2009)

  11. Masica, C.P.: Defining a Linguistic Area: South Asia. Chicago University Press, Chicago (1976)

  12. McGregor, R.: The Language of Indrajit of Orchā: A Study of Early Braj Bhāsā Prose. University of Cambridge Oriental Publications, Cambridge University Press (1968). https://books.google.pl/books?id=EjI3vgAACAAJ

  13. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. CoRR abs/1301.3781 (2013). http://arxiv.org/abs/1301.3781

  14. Misra, V.P.: Bhusana granthavali. Nai Dilli, Vani Prakasan (1994)

  15. Stroński, K., Tokaj, J., Verbeke, S.: A diachronic account of converbal constructions in Old Rajasthani. In: Cennamo, M., Fabrizio, C. (eds.) Historical Linguistics 2015, Selected papers from the 22nd International Conference on Historical Linguistics, Naples, 27–31 July 2015, pp. 424–441. No. 348 in Current Issues in Linguistic Theory, John Benjamins, Amsterdam/Philadelphia (2019)

  16. Subbārāo, K.: South Asian Languages: A Syntactic Typology. Cambridge University Press (2012). https://books.google.pl/books?id=ZCfiGYvpLOQC

  17. Tikkanen, B.: The Sanskrit gerund: a synchronic, diachronic, and typological analysis. Studia Orientalia, Finnish Oriental Society (1987). https://books.google.pl/books?id=XTkqAQAAIAAJ

  18. Wallace, W.D.: Object-marking in the history of Nepali: a case of syntactic diffusion. Stud. Linguist. Sci. 11(2), 107–128 (1981)

Acknowledgements

This research was supported by Polish National Science Centre grant 2013/10/M/HS2/00553.

Author information

Corresponding author

Correspondence to Rafał Jaworski.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Jaworski, R., Stroński, K. (2020). Experiments with Automatic and Semi-automatic Detection of Sparse Word Forms in Old Braj. In: Vetulani, Z., Paroubek, P., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2017. Lecture Notes in Computer Science, vol 12598. Springer, Cham. https://doi.org/10.1007/978-3-030-66527-2_9

  • DOI: https://doi.org/10.1007/978-3-030-66527-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66526-5

  • Online ISBN: 978-3-030-66527-2

  • eBook Packages: Computer Science, Computer Science (R0)
