Designing Assessments and Assessing Designs in Virtual Educational Environments

Journal of Science Education and Technology

Abstract

This study used innovative assessment practices to obtain and document broad learning outcomes for a 15-hour game-based curriculum in Quest Atlantis, a multi-user virtual environment that supports school-based participation in socioscientific inquiry in the ecological sciences. Design-based methods were used to refine and align the enactment of the virtual narrative and scientific investigations to a challenging problem-solving assessment and, indirectly, to achievement test items that were independent of the curriculum. In study one, one sixth-grade teacher used the curriculum in two of his classes and obtained larger gains in understanding and achievement than in his two other classes, which used an expository text to learn the same concepts and skills. Further treatment refinements were carried out, and two forms of virtual formative feedback were introduced. In study two, the same teacher used the curriculum in all four of his classes; the revised curriculum resulted in even larger gains in understanding and achievement. Gains averaged 1.1 SD and 0.4 SD, respectively, with greater gains shown for students who engaged more with the formative feedback. Principles for assessing designs and designing assessments in virtual environments are presented.
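The abstract reports gains in standard-deviation units, i.e., effect sizes in the sense of Cohen (1992). As a minimal sketch of how such a standardized gain might be computed, the snippet below uses hypothetical pre/post score arrays (not data from the study); the paper's exact formula is not reproduced here, and the pretest standard deviation is assumed as the scaling unit purely for illustration.

    import numpy as np

    def standardized_gain(pre, post):
        """Mean pre-to-post gain expressed in pretest standard-deviation units.

        Illustrative only: the scaling unit (pretest SD) is an assumption,
        not necessarily the formula used in the study.
        """
        pre = np.asarray(pre, dtype=float)
        post = np.asarray(post, dtype=float)
        return (post.mean() - pre.mean()) / pre.std(ddof=1)

    # Hypothetical pre/post scores for one class (not data from the study).
    pre_scores = [12, 15, 9, 14, 11, 13, 10, 16]
    post_scores = [17, 19, 13, 18, 15, 18, 14, 20]
    print(f"gain = {standardized_gain(pre_scores, post_scores):.2f} SD")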



Notes

  1. This broader discourse is what Gee (2007) calls “the Game” to distinguish from the specifically designed discourses that make up “the game.”

  2. Of course, feedback can be provided that tells students which of the 4–5 associations on each item was most correct. But students are very unlikely to ever see that particular item, or even those particular associations, again. Nonetheless, most commercial tests cannot provide even that feedback because doing so compromises the test.

  3. Because these implementations of the Taiga curriculum were relatively isolated from the broader QA narrative, these incentives played a relatively minor role in this study and will not be discussed further.

  4. These standards were as follows: Scientific inquiry: Begin to evaluate the validity of claims based on the amount and quality of the evidence cited. Technology and science: Explain how the solution to one problem, such as the use of pesticides in agriculture or the use of dumps for waste disposal, may create other problems. Systems: Recognize and describe that systems contain objects as well as processes that interact with each other. Models and scale: Demonstrate how geometric figures, number sequences, graphs, diagrams, sketches, number lines, maps, and stories can be used to represent objects, events, and processes in the real world, although such representation can never be exact in every detail.

  5. This is important to test because statistically significant between-class/within-group differences will inflate the significance of between-group comparisons if not accounted for.
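Note 5 describes a nested design: classes are nested within treatment conditions, so between-class variation within each condition must be examined before students are pooled for the between-group comparison. The sketch below illustrates that reasoning with a simple one-way ANOVA across classes within each condition, followed by a pooled between-group test; the per-class gain scores are hypothetical and this is not a reproduction of the study's actual analyses.

    import numpy as np
    from scipy.stats import f_oneway, ttest_ind

    # Hypothetical gain scores for each class (illustration only, not study data).
    treatment_classes = [np.array([5.1, 6.0, 4.8, 5.5, 6.2]),
                         np.array([5.4, 5.9, 6.1, 4.9, 5.7])]
    comparison_classes = [np.array([2.1, 2.8, 1.9, 2.5, 3.0]),
                          np.array([2.4, 2.2, 2.9, 1.8, 2.6])]

    # Step 1: check for between-class differences within each condition.
    # A significant F here argues against pooling students across classes
    # (or calls for a multilevel model with class as a nesting factor).
    for label, classes in [("treatment", treatment_classes),
                           ("comparison", comparison_classes)]:
        f_stat, p = f_oneway(*classes)
        print(f"{label}: between-class F = {f_stat:.2f}, p = {p:.3f}")

    # Step 2: if classes within each condition are comparable, pool them
    # for the between-group comparison.
    t_stat, p = ttest_ind(np.concatenate(treatment_classes),
                          np.concatenate(comparison_classes))
    print(f"between-group t = {t_stat:.2f}, p = {p:.3f}")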

References

  • Almond RG, Mislevy RJ (1999) Graphical models and computerized adaptive testing. Appl Psychol Meas 23(3):223. doi:10.1177/01466219922031347
  • Arici AB (2008) Unpublished dissertation. Indiana University, Departments of Learning and Development Science and Cognitive Science
  • Barab S (2006) Design-based research: a methodological toolkit for the learning scientist. In: Sawyer K (ed) The handbook of the learning sciences. Cambridge University Press, Cambridge, pp 153–170
  • Barab S, Dede C (2007) Games and immersive participatory simulations for science education: an emerging type of curricula. J Sci Educ Technol 16(1):1–3. doi:10.1007/s10956-007-9043-9
  • Barab S, Sadler T, Heiselt C, Hickey D, Zuiker S (2007a) Relating narrative, inquiry, and inscriptions: a framework for socioscientific inquiry. J Sci Educ Technol 16:59–82. doi:10.1007/s10956-006-9033-3
  • Barab SA, Zuiker S, Warren S, Hickey D, Ingram-Goble A, Kwon E-J, Kouper I, Herring SC (2007b) Embodied curriculum: relating formalisms to contexts. Sci Educ 91(5):750–782. doi:10.1002/sce.20217
  • Bass KM, Glaser R (2004) Developing assessments to inform teaching and learning. National Center for Research on Evaluation, Standards, and Student Testing, Los Angeles
  • Bell P, Hoadley CM, Linn MC (2004) Design-based research in education. In: Linn MC, Davis EA, Bell P (eds) Internet environments for science education. Erlbaum, Mahwah, pp 73–85
  • Bereiter C, Scardamalia M (1989) Intentional learning as a goal of instruction. In: Resnick L (ed) Cognition and instruction: issues and agendas. Lawrence Erlbaum Associates, Hillsdale, pp 361–379
  • Black P, Wiliam D (1998) Assessment and classroom learning. Assess Educ 5(1):7–74. doi:10.1080/0969595980050102
  • Burkhardt H, Schoenfeld AH (2003) Improving educational research: toward a more useful, more influential, and better-funded enterprise. Educ Res 32(9):3–14. doi:10.3102/0013189X032009003
  • Burroughs S, Groce E, Webeck ML (2005) Social studies education in the age of testing and accountability. Educ Meas 24:13–20. doi:10.1111/j.1745-3992.2005.00015.x
  • Cobb P, Confrey J, diSessa A, Lehrer R, Schauble L (2003) Design experiments in educational research. Educ Res 32(1):9. doi:10.3102/0013189X032001009
  • Cohen J (1992) Quantitative methods in psychology: a power primer. Psychol Bull 112(1):155. doi:10.1037/0033-2909.112.1.155
  • Cronbach LJ, Linn RL, Brennan RL, Haertel EH (1997) Generalizability analysis for performance assessments of student achievement or school effectiveness. Educ Psychol Meas 57(3):373. doi:10.1177/0013164497057003001
  • Duschl RA, Gitomer DH (1997) Strategies and challenges to changing the focus of assessment and instruction in science classrooms. Educ Assess 4:37–73. doi:10.1207/s15326977ea0401_2
  • Federation of American Scientists (2006) Harnessing the power of video games for learning. Federation of American Scientists, Washington
  • Feng M, Heffernan NT, Koedinger KR (2006) Addressing the testing challenge with a web-based E-assessment system that tutors as it assesses. In: Proceedings of the 15th international conference on the World Wide Web, Edinburgh, pp 307–316
  • Feuer MJ, Towne L, Shavelson RJ (2002) Scientific culture and educational research. Educ Res 31(8):4. doi:10.3102/0013189X031008004
  • Frederiksen JR, Collins A (1989) A systems approach to educational testing. Educ Res 18(9):27–32
  • Gee JP (2003a) What video games have to teach us about learning and literacy, 1st edn. Palgrave Macmillan, New York
  • Gee JP (2003b) Opportunity to learn: a language-based perspective on assessment. Assess Educ 10(1):27–47. doi:10.1080/09695940301696
  • Gee JP (2004) Situated language and learning: a critique of traditional schooling. Routledge, New York
  • Gee JP (2007) Learning and games. In: Salen K (ed) The ecology of games: connecting youth, games, and learning. MIT Press, Cambridge, MA, pp 21–40
  • Goldstone RL, Son JY (2005) The transfer of scientific principles using concrete and idealized simulations. J Learn Sci 14:69–110. doi:10.1207/s15327809jls1401_4
  • Greeno JG (1997) On claims that answer the wrong questions. Educ Res 26(1):5
  • Greeno JG, Middle School Mathematics through Applications Project Group (1998) The situativity of knowing, learning, and research. Am Psychol 53:5–26. doi:10.1037/0003-066X.53.1.5
  • Greeno JG, Collins AM, Resnick LB (1996) Cognition and learning. In: Berliner DC, Calfee RC (eds) Handbook of educational psychology. Macmillan, New York, pp 15–46
  • Habib L, Wittek L (2007) The portfolio as artifact and actor. Mind Cult Act 14(4):266–282. doi:10.1080/10749030701623763
  • Haertel EH (1999) Performance assessment and educational reform. Phi Delta Kappan 80(9):662–667
  • Haertel EH, Greeno JG (2003) A situative perspective: broadening the foundations of assessment. Measurement 1:154–161
  • Hannafin RD, Foshay WR (2008) Computer-based instruction’s (CBI) rediscovered role in K-12: an evaluation case study of one high school’s use of CBI to improve pass rates on high-stakes tests. Educ Technol Res Dev 56:147–160. doi:10.1007/s11423-006-9007-4
  • Harlen W (2007) On the relationship between assessment for formative and summative purposes. In: Gardner J (ed) Assessment and learning. Sage Publications, London, pp 103–118
  • Hattie J, Timperley H (2007) The power of feedback. Rev Educ Res 77:81–112. doi:10.3102/003465430298487
  • Hickey DT, Anderson K (2007) Situative approaches to assessment for resolving problems in educational testing and transforming communities of educational practice. In: Moss P (ed) Evidence and decision making. The 106th NSSE Yearbook. National Society for the Study of Education/University of Chicago Press, Chicago, pp 269–293
  • Hickey DT, Pellegrino JW (2005) Theory, level, and function: three dimensions for understanding the connections between transfer and student assessment. In: Mestre JP (ed) Transfer of learning from a modern multidisciplinary perspective. Information Age Publishers, Greenwich, pp 251–253
  • Hickey DT, Zuiker SJ (2005) Engaged participation: a sociocultural model of motivation with implications for assessment. Educ Assess 10:277–305. doi:10.1207/s15326977ea1003_7
  • Hickey DT, Kindfield ACH, Horwitz P, Christie MA (2003) Integrating instruction, assessment, and evaluation in a technology-supported genetics environment. Am Educ Res J 40(2):495–538. doi:10.3102/00028312040002495
  • Hickey DT, Zuiker SJ, Taasoobshirazi G, Schafer NJ, Michael MA (2006) Three is the magic number: a design-based framework for balancing formative and summative functions of assessment. Stud Educ Eval 32:180–201. doi:10.1016/j.stueduc.2006.08.006
  • Ketelhut D (2007) The impact of student self-efficacy on scientific inquiry skills: an exploratory investigation in River City, a multi-user virtual environment. J Sci Educ Technol 16(1):99–111. doi:10.1007/s10956-006-9038-y
  • Koedinger KR, Anderson JR, Hadley WH, Mark MA (1997) Intelligent tutoring goes to school in the big city. Int J Artif Intell Educ 8:30–43
  • Lemke J (2000) Across the scales of time. Mind Cult Act 7(4):273–290. doi:10.1207/S15327884MCA0704_03
  • Martindale T, Pearson C, Curda L, Pilcher J (2005) Effects of an online instructional application on reading and mathematics standardized test scores. J Res Technol Educ 37:349–360
  • Messick S (1994) The interplay of evidence and consequences in the validation of performance assessments. Educ Res 23(2):13–18
  • Messick S (1995) Validity of psychological assessment: validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. Am Psychol 50(9):741–749. doi:10.1037/0003-066X.50.9.741
  • Moss PA (1992) Shifting conceptions of validity in educational measurement: implications for performance assessment. Rev Educ Res 62:229–258
  • National Research Council (2001a) In: Atkin JM, Black P, Coffey J (eds) Classroom assessment and the national science education standards. National Academies Press, Washington
  • National Research Council (2001b) In: Pellegrino JW, Chudowsky N, Glaser R (eds) Knowing what students know: the science and design of educational assessment. National Academy Press, Washington
  • National Research Council (2006) Systems for state science assessment. The National Academies Press, Washington
  • Nelson B, Ketelhut DJ, Clarke J, Bowman C, Dede C (2005) Design-based research strategies for developing a scientific inquiry curriculum in a multi-user virtual environment. Educ Technol 45(1):21–27
  • Phelps RP (ed) (2005) Defending standardized testing. Erlbaum, Mahwah
  • Sadler DR (1989) Formative assessment and the design of instructional systems. Instr Sci 18(2):119–144. doi:10.1007/BF00117714
  • Sadler TD (2004) Informal reasoning regarding socioscientific issues: a critical review of research. J Res Sci Teach 41(5):513–536. doi:10.1002/tea.20009
  • Shavelson RJ, Baxter GP, Pine J (1991) Performance assessment in science. Appl Meas Educ 4(4):347–362. doi:10.1207/s15324818ame0404_7
  • Shepard LA (1993) Evaluating test validity. Rev Res Educ 19:405–450
  • Shute VJ (2008) Focus on formative feedback. Rev Educ Res 78:153–189. doi:10.3102/0034654307313795
  • Shute VJ, Hansen EG, Almond RG (2008) You can’t fatten a hog by weighing it–or can you? Evaluating an assessment system for learning called ACED. Int J Artif Intell Educ 18:289–316
  • Squire K (2006) From content to context: videogames as designed experience. Educ Res 35(8):19. doi:10.3102/0013189X035008019
  • Steinkuehler CA (2006) Massively multiplayer online video gaming as participation in a discourse. Mind Cult Act 13(1):38–52. doi:10.1207/s15327884mca1301_4
  • Sternberg RJ (2006) Real improvement for real students: test smarter, serve better. Harv Educ Rev 76(4):557–563
  • Taasoobshirazi G, Anderson KA, Zuiker SJ, Hickey DT (2006) Enhancing inquiry, understanding, and achievement in an astronomy multimedia learning environment. J Sci Educ Technol 15:383–395. doi:10.1007/s10956-006-9028-0
  • United States House of Representatives (2002) House Resolution 3801, Education Sciences Reform Act of 2002
  • Von Secker CE, Lissitz RW (1999) Estimating the impact of instructional practices on student achievement in science. J Res Sci Teach 36(10):1110–1126. doi:10.1002/(SICI)1098-2736(199912)36:10<1110::AID-TEA4>3.0.CO;2-T
  • Wenger E (1998) Communities of practice: learning, meaning, and identity. Cambridge University Press, Cambridge
  • Wiggins G (2001) Assessing student performance: exploring the purpose and limits of testing. Jossey-Bass, San Francisco
  • Wilson M (ed) (2004) Towards coherence in classroom assessment and accountability. The 103rd Yearbook of the National Society for the Study of Education. University of Chicago Press, Chicago
  • Winerip M (2005) Are schools passing or failing? Now there’s a third choice…both. New York Times, November 2, p 1
  • Ysseldyke JE, Tardrew S (2007) Use of a progress monitoring system to enable teachers to differentiate mathematics instruction. J Appl Sch Psychol 24:1–28. doi:10.1300/J370v24n01_01
  • Zuiker SJ (2007) Transforming practice: designing for liminal transitions along trajectories of participation. Unpublished doctoral dissertation, Indiana University


Acknowledgments

Special thanks to Sasha Barab for his leadership as director of the Quest Atlantis project and lead developer of the Taiga world and curriculum; thanks to Sasha Barab, Chris Dede, and Doug Clark for crucial feedback on earlier versions of this article. Thanks also to Jacob Summers for his continuing participation in the implementation and refinement of Taiga, and to the students in his classroom for participating in these studies. Anna Arici, Jo Gilbertson, and Bronwyn Stuckey also contributed to the initial curricular design. Steven Zuiker, Eun Ju Kwon, and Anna Arici were instrumental in the assessment design, curricular revision, implementation support, and analysis in this study and the broader program of inquiry. This research was supported by National Science Foundation Grant REC-0092831 to Indiana University. The views expressed here do not necessarily represent those of the National Science Foundation or Indiana University.

Author information


Corresponding author

Correspondence to Daniel T. Hickey.


About this article

Cite this article

Hickey, D.T., Ingram-Goble, A.A. & Jameson, E.M. Designing Assessments and Assessing Designs in Virtual Educational Environments. J Sci Educ Technol 18, 187–208 (2009). https://doi.org/10.1007/s10956-008-9143-1
