Abstract
Automated Writing Evaluation (AWE) provides automated feedback and scoring to support student writing and revising. The purpose of the present study was to analyze a statewide implementation of an AWE system (n = 114,582 students) in grades 4-11. The goals of the study were to evaluate (a) the extent to which AWE features were used, (b) whether equity and access issues influenced AWE usage, and (c) whether AWE usage was associated with writing performance on a large-scale state writing assessment. Descriptive statistics and hierarchical linear modeling were used to answer the research questions. Results indicated that the main feature of AWE (i.e., writing and revising essays) was used, but other features (peer review and independent lessons) were underutilized. School- and student-level demographic variables explained little variance in AWE usage. AWE usage was positively and statistically significantly associated with performance on a large-scale state writing assessment when controlling for prior performance and demographics. The study presents evidence that AWE can positively influence writing on a distal measure when implemented at scale. Implications for large-scale AWE implementation are discussed.
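The hierarchical linear modeling approach named in the abstract can be illustrated with a minimal sketch: a two-level model with students nested in schools, predicting writing scores from AWE usage while controlling for prior performance via a school-level random intercept. All variable names and the simulated data here are illustrative assumptions, not the study's actual data or model specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate students nested in schools (hypothetical data for illustration only)
rng = np.random.default_rng(42)
n_schools, n_per_school = 30, 40
school = np.repeat(np.arange(n_schools), n_per_school)
school_effect = rng.normal(0, 2, n_schools)[school]       # random intercept per school
usage = rng.poisson(5, n_schools * n_per_school)          # e.g., essays submitted in AWE
prior = rng.normal(300, 25, n_schools * n_per_school)     # prior test performance
score = (100 + 0.8 * prior + 1.5 * usage
         + school_effect + rng.normal(0, 10, n_schools * n_per_school))

df = pd.DataFrame({"score": score, "usage": usage, "prior": prior, "school": school})

# Two-level model: student-level predictors (usage, prior),
# random intercept for school (the grouping factor)
model = smf.mixedlm("score ~ usage + prior", df, groups=df["school"])
result = model.fit()
print(result.params["usage"])  # estimated association of AWE usage with score
```

The fixed-effect coefficient on `usage` is the quantity of interest: the association between AWE usage and writing performance after accounting for prior performance and between-school variation.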
Acknowledgements
This research was supported in part by Delegated Authority contract EDUC432914160001 from Measurement Incorporated® and by Grant R305H170046 from the Institute of Education Sciences, U.S. Department of Education, to the University of Delaware. The opinions expressed are those of the authors and do not represent the views of Measurement Incorporated, the Institute, or the U.S. Department of Education, and no official endorsement by these agencies should be inferred. Thank you to Drs. Christina Barbieri and Henry May for feedback on prior drafts.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Potter, A., Wilson, J. Statewide implementation of automated writing evaluation: analyzing usage and associations with state test performance in grades 4-11. Education Tech Research Dev 69, 1557–1578 (2021). https://doi.org/10.1007/s11423-021-10004-9