Abstract
Massive open online courses and other online study opportunities are giving more and more people around the world easier access to education. However, the language barrier remains a major challenge: most courses are offered in English, but only 16% of the world’s population speaks English [1]. The barrier is especially evident in written exams, which are usually not provided in the student’s native language. To overcome this inequity, we analyze AI-driven cross-lingual automatic short answer grading. Our system is based on a Multilingual Bidirectional Encoder Representations from Transformers model [2] and can fairly score free-text answers in 26 languages fully automatically, with the potential to be extended to 104 languages. Augmenting the fine-tuning data with machine-translated task-specific data improves performance further. Our results are a first step toward allowing more international students to participate fairly in education.
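The augmentation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout (`GradedAnswer`), the function name `augment_with_translations`, and the translator signature are hypothetical, and the sketch assumes that a grade stays valid when the question and answers are translated. Any machine translation service could be wrapped to match the assumed `(text, target_language) -> text` callable.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class GradedAnswer:
    """One graded short answer (hypothetical record layout)."""
    question: str
    reference_answer: str
    student_answer: str
    score: float
    lang: str = "en"


# Assumed translator interface: (text, target language code) -> translated text.
Translator = Callable[[str, str], str]


def augment_with_translations(
    examples: Sequence[GradedAnswer],
    target_langs: Sequence[str],
    translate: Translator,
) -> List[GradedAnswer]:
    """Return the original examples plus one translated copy per target language.

    Assumption: the human-assigned score carries over unchanged to the
    translated copy, so only the text fields and the language tag change.
    """
    augmented = list(examples)
    for ex in examples:
        for lang in target_langs:
            if lang == ex.lang:
                continue  # no need to translate into the source language
            augmented.append(
                replace(
                    ex,
                    question=translate(ex.question, lang),
                    reference_answer=translate(ex.reference_answer, lang),
                    student_answer=translate(ex.student_answer, lang),
                    lang=lang,
                )
            )
    return augmented
```

The resulting list mixes original and translated examples and could then be passed to whatever fine-tuning routine is used for the multilingual model; passing the translator in as a parameter keeps the sketch independent of any particular translation service.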
References
Statista: The Most Spoken Languages Worldwide in 2019 (2020). https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, USA, pp 4171–4186
United Nations: Sustainable Development Goals: 17 Goals to Transform our World (2021). https://www.un.org/sustainabledevelopment/sustainable-development-goals
Correia AP, Liu C, Xu F (2020) Evaluating Videoconferencing Systems for the Quality of the Educational Experience. Distance Educ 41(4):429–452
Koravuna S, Surepally UK (2020) Educational Gamification and Artificial Intelligence for Promoting Digital Literacy. Association for Computing Machinery, New York, NY, USA
Libbrecht P, Declerck T, Schlippe T, Mandl T, Schiffner D (2020) NLP for Student and Teacher: Concept for an AI Based Information Literacy Tutoring System. In: The 29th ACM International Conference on Information and Knowledge Management (CIKM 2020). Galway, Ireland, 19–23 Oct 2020
Pires T, Schlinger E, Garrette D (2019) How Multilingual is Multilingual BERT? In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, pp 4996–5001
Burrows S, Gurevych I, Stein B (2014) The Eras and Trends of Automatic Short Answer Grading. Int J Artif Intell Educ 25:60–117
Süzen N, Gorban A, Levesley J, Mirkes E (2020) Automatic Short Answer Grading and Feedback Using Text Mining Methods. Procedia Comput Sci 169:726–743
Zehner F (2016) Automatic Processing of Text Responses in Large-Scale Assessments. Ph.D. thesis, TU München
Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient Estimation of Word Representations in Vector Space. In: Bengio Y, LeCun Y (eds), Workshop Track Proceedings of the 1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA
Gomaa WH, Fahmy AA (2019) Ans2vec: A Scoring System for Short Answers. In: Hassanien AE, Azar AT, Gaber T, Bhatnagar RF, Tolba M (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019). Springer International Publishing, Cham, pp 586–595
Mohler M, Bunescu R, Mihalcea R (2011) Learning to Grade Short Answer Questions Using Semantic Similarity Measures and Dependency Graph Alignments. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Portland, Oregon, USA, pp 752–762
Dzikovska M, Nielsen R, Brew C, Leacock C, Giampiccolo D, Bentivogli L, Clark P, Dagan I, Dang HT (2013) SemEval-2013 Task 7: The Joint Student Response Analysis and 8th Recognizing Textual Entailment Challenge. In: Second Joint Conference on Lexical and Computational Semantics (*SEM), Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), vol 2. Association for Computational Linguistics. Atlanta, Georgia, USA
Sawatzki J, Schlippe T, Benner-Wickner M (2021) Deep Learning Techniques for Automatic Short Answer Grading: Predicting Scores for English and German Answers. In: The 2nd International Conference on Artificial Intelligence in Education Technology (AIET 2021), Wuhan, China
Krishnamurthy S, Gayakwad E, Kailasanathan N (2019) Deep Learning for Short Answer Scoring. Int J Recent Technol Eng 7:1712–1715
Sung C, Dhamecha T, Mukhi N (2019) Improving Short Answer Grading Using Transformer-Based Pre-Training. In: Artificial Intelligence in Education, pp 469–481
Camus L, Filighera A (2020) Investigating Transformers for Automatic Short Answer Grading. Artif Intell Educ 12164:43–48
Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) RoBERTa: A Robustly Optimized BERT Pretraining Approach
Devlin J (2019) BERT-Base, Multilingual Cased. https://github.com/google-research/bert/blob/master/multilingual.md
Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser L, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J (2016) Google’s Neural Machine Translation System: Bridging the Gap Between Human and Machine Translation. CoRR. 1609.08144
Budur E, Özçelik R, Gungor T, Potts C (2020) Data and Representation for Turkish Natural Language Inference. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, pp 8253–8267
Stapleton P, Leung Ka Kin B (2019) Assessing the Accuracy and Teachers’ Impressions of Google Translate: A Study of Primary L2 Writers in Hong Kong. In: English for Specific Purposes, vol 56, pp 18–34
Aiken M (2012) An Analysis of Google Translate Accuracy. Stud Linguist Lit 3:253
Aiken M (2019) An Updated Evaluation of Google Translate Accuracy. Stud Linguist Lit 3:253
Wikimedia: List of Wikipedias (2021). https://meta.wikimedia.org/wiki/List_of_Wikipedias#All_Wikipedias_ordered_by_number_of_articles
Rajapakse TC (2019) Simple Transformers. https://github.com/ThilinaRajapakse/simpletransformers
Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, Davison J, Shleifer S, von Platen P, Ma C, Jernite Y, Plu J, Xu C, Scao TL, Gugger S, Drame M, Lhoest Q, Rush AM (2020) Transformers: State-of-the-Art Natural Language Processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, pp 38–45
Kingma DP, Ba J (2015) Adam: A Method for Stochastic Optimization. In: Bengio Y, LeCun Y (eds) 3rd International Conference on Learning Representations, ICLR 2015, Conference Track Proceedings, San Diego, CA, USA
Schlippe T, Sawatzki J (2021) AI-Based Multilingual Interactive Exam Preparation. In: The Learning Ideas Conference 2021 (14th Annual Conference). ALICE—Special Conference Track on Adaptive Learning via Interactive, Collaborative and Emotional Approaches. New York, New York, USA
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Schlippe, T., Sawatzki, J. (2022). Cross-Lingual Automatic Short Answer Grading. In: Cheng, E.C.K., Koul, R.B., Wang, T., Yu, X. (eds) Artificial Intelligence in Education: Emerging Technologies, Models and Applications. AIET 2021. Lecture Notes on Data Engineering and Communications Technologies, vol 104. Springer, Singapore. https://doi.org/10.1007/978-981-16-7527-0_9
DOI: https://doi.org/10.1007/978-981-16-7527-0_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-7526-3
Online ISBN: 978-981-16-7527-0
eBook Packages: Engineering, Engineering (R0)