A Novel Rule Refinement Method for SMT through Simulated Post-Editing

Yang, Sitong; Yu, Heng; Liu, Qun

doi:10.1007/978-3-662-45924-9_11

Part of the book series: Communications in Computer and Information Science (CCIS, volume 496)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Post-editing has been successfully applied to correct the output of MT systems and produce better translations, but as a downstream task its positive feedback to MT has not been well studied. In this paper, we present a novel rule refinement method that uses Simulated Post-Editing (SiPE) to capture the errors made by MT systems and to generate refined translation rules. Our method is system-independent and does not require any additional resources. Experimental results on large-scale data show a significant improvement over both phrase-based and syntax-based baselines.
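The abstract does not specify how SiPE captures errors. As a purely illustrative sketch, one way to simulate post-editing is to align an MT hypothesis against its reference translation with a word-level edit distance and read off the edit operations. The function name `simulated_post_edits`, the operation labels, and the example sentences below are assumptions made for illustration; they are not taken from the paper and should not be read as the authors' method.

```python
from typing import List, Tuple

# (operation, hypothesis word, reference word); hypothetical record layout
Edit = Tuple[str, str, str]


def simulated_post_edits(hyp: List[str], ref: List[str]) -> List[Edit]:
    """Align hyp to ref with word-level Levenshtein distance and
    return the sequence of operations that turns hyp into ref."""
    n, m = len(hyp), len(ref)
    # dist[i][j] = minimum number of edits turning hyp[:i] into ref[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # delete a hypothesis word
                dist[i][j - 1] + 1,         # insert a reference word
            )

    # Trace back from the bottom-right cell to recover one optimal edit path.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                op = "match" if cost == 0 else "substitute"
                edits.append((op, hyp[i - 1], ref[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            edits.append(("delete", hyp[i - 1], ""))
            i -= 1
        else:
            edits.append(("insert", "", ref[j - 1]))
            j -= 1
    edits.reverse()
    return edits


# Toy example (invented sentences): keep only the non-matching operations.
hypothesis = "he go to school".split()
reference = "he goes to the school".split()
errors = [e for e in simulated_post_edits(hypothesis, reference) if e[0] != "match"]
print(errors)
```

In a setting like the one the abstract describes, the non-matching operations would be the candidate errors from which refined rules could be derived; how that derivation is actually done is not stated in the text available here.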

Author information

Authors and Affiliations

Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China
Sitong Yang, Heng Yu & Qun Liu
University of Chinese Academy of Sciences, China
Sitong Yang
CNGL, School of Computing, Dublin City University, Republic of Ireland
Qun Liu

Editor information

Editors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 100190, Beijing, China
Chengqing Zong
Dept. of Computer Science and Operations Research, University of Montreal, Montreal, Quebec, Canada
Jian-Yun Nie
Peking University, Beijing, China
Dongyan Zhao
Institute of Computer Science & Technology, Peking University, 100871, Beijing, China
Yansong Feng

About this paper

Cite this paper

Yang, S., Yu, H., Liu, Q. (2014). A Novel Rule Refinement Method for SMT through Simulated Post-Editing. In: Zong, C., Nie, JY., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2014. Communications in Computer and Information Science, vol 496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45924-9_11

DOI: https://doi.org/10.1007/978-3-662-45924-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45923-2
Online ISBN: 978-3-662-45924-9
eBook Packages: Computer Science, Computer Science (R0)