
Modeling Persian Verb Morphology to Improve English-Persian Machine Translation

  • Conference paper
Advances in Artificial Intelligence and Its Applications (MICAI 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8265)

Abstract

Morphological analysis is essential when translating from a morphologically poor language such as English into a morphologically rich language such as Persian. In this paper, we first analyze the output of a rule-based machine translation (RBMT) system and categorize its errors. We then use a statistical approach that predicts rich morphology from a parallel corpus in order to improve the quality of the RBMT output. The error analysis shows that Persian morphology poses many challenges, especially in verb conjugation. In our approach, we define a set of linguistic features using both English and Persian linguistic information obtained from an English-Persian parallel corpus, and build our model on these features. As a baseline in our experiments, we generate the inflected verb form with the most common feature values. The results show an improvement of almost 2.6% absolute BLEU score on a test set containing 16K sentences.
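
The baseline mentioned in the abstract, inflecting each verb with the most common value of every feature, can be illustrated with a minimal sketch. The feature names (`tense`, `person`, `number`, `polarity`), their values, and the toy annotations below are assumptions made for illustration only; the abstract does not list the paper's actual feature set or data.

```python
from collections import Counter


def most_common_feature_values(training_verbs):
    """Return, for each feature, the value seen most often in the training data."""
    counts = {}
    for features in training_verbs:
        for name, value in features.items():
            counts.setdefault(name, Counter())[value] += 1
    return {name: counter.most_common(1)[0][0] for name, counter in counts.items()}


# Toy, made-up annotations; feature names and values are hypothetical.
training_verbs = [
    {"tense": "past", "person": "3", "number": "sg", "polarity": "pos"},
    {"tense": "present", "person": "3", "number": "sg", "polarity": "pos"},
    {"tense": "past", "person": "1", "number": "pl", "polarity": "neg"},
    {"tense": "past", "person": "3", "number": "sg", "polarity": "pos"},
]

# The baseline assigns these same values to every verb, ignoring context.
baseline = most_common_feature_values(training_verbs)
print(baseline)
```

A context-independent baseline of this kind gives a floor against which a model that conditions on English and Persian linguistic features can be compared.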

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mahmoudi, A., Faili, H., Arabsorkhi, M. (2013). Modeling Persian Verb Morphology to Improve English-Persian Machine Translation. In: Castro, F., Gelbukh, A., González, M. (eds) Advances in Artificial Intelligence and Its Applications. MICAI 2013. Lecture Notes in Computer Science, vol 8265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45114-0_33

  • DOI: https://doi.org/10.1007/978-3-642-45114-0_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45113-3

  • Online ISBN: 978-3-642-45114-0

  • eBook Packages: Computer Science, Computer Science (R0)
