Abstract
Achieving a fair and rigorous assessment of participants in simulation games is a major challenge. The difficulty applies not only to the negotiation itself but also to the written assignments that typically accompany a simulation. For one thing, if different raters are involved, it is important to ensure that differences in severity do not affect the grades. Recently, comparative judgement (CJ) has been introduced as a method that allows for team-based grading. This chapter discusses the potential of comparative judgement for assessing briefing papers from 84 students. Four assessors completed 622 comparisons in the Digital Platform for the Assessment of Competences (D-PAC) tool. Results indicate a reliability level of 0.71 for the final rank order, which required a time investment of roughly 10.5 h from the team of assessors. Moreover, there was no evidence of bias towards the most important roles in the simulation game. The study also details how the obtained rank orders were translated into grades, ranging from 11 to 17 out of 20. These elements showcase CJ’s advantage in reaching adequate reliability levels for briefing papers in an efficient manner.
Pierpaolo Settembri writes in a personal capacity, and the views he expresses in this publication may not in any circumstances be regarded as stating an official position of the European Commission.
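To make the reported figures concrete, the sketch below illustrates, under stated assumptions, the kind of computation involved: fitting a latent quality scale to pairwise comparisons with the Bradley-Terry-Luce model commonly used in CJ tools such as D-PAC, computing the scale separation reliability that the 0.71 figure refers to, and rescaling the estimates onto a grade band. The function names (fit_btl, ssr, to_grades) and the linear grade mapping are illustrative assumptions, not the chapter’s actual procedure or the D-PAC implementation.

```python
import numpy as np

def fit_btl(n_items, comparisons, n_iter=500, lr=0.05):
    """Fit Bradley-Terry-Luce quality logits by gradient ascent.

    comparisons: iterable of (winner, loser) item-index pairs,
    e.g. 622 judgements over 84 briefing papers.
    """
    theta = np.zeros(n_items)
    for _ in range(n_iter):
        grad = np.zeros(n_items)
        for w, l in comparisons:
            # P(winner beats loser) under the current estimates
            p = 1.0 / (1.0 + np.exp(theta[l] - theta[w]))
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        theta += lr * grad
        theta -= theta.mean()  # centre the scale for identifiability
    return theta

def ssr(theta, comparisons):
    """Scale separation reliability: the share of observed variance
    in the estimates that is not measurement error."""
    info = np.zeros(len(theta))  # Fisher information per item
    for w, l in comparisons:
        p = 1.0 / (1.0 + np.exp(theta[l] - theta[w]))
        info[w] += p * (1.0 - p)
        info[l] += p * (1.0 - p)
    se2 = 1.0 / info  # squared standard errors (assumes every item
                      # appears in at least one comparison)
    obs_var = theta.var(ddof=1)
    return (obs_var - se2.mean()) / obs_var

def to_grades(theta, lo=11, hi=17):
    """Hypothetical linear mapping of the logits onto a grade band."""
    unit = (theta - theta.min()) / (theta.max() - theta.min())
    return np.round(lo + unit * (hi - lo)).astype(int)
```

The chapter details its own procedure for translating the rank order into grades; the linear rescaling above is only one plausible formalisation, not a description of that procedure.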
Notes
1. Raymond and Usherwood (2013, p. 4) put it extremely clearly: “University faculty must ask themselves what a simulation adds to a student’s knowledge base that cannot be learned more efficiently in a traditional classroom setting, and how this can be measured”. Baranowski and Weir (2015) offer an in-depth review of the literature evaluating the effects of simulations, concluding that “a small but growing body of evidence lends support to the contention that students who participate in simulations do in fact learn more than students not taking part in this exercise”. For a different outcome, see Raymond (2010).
2. Alternatively, the data for the assessment can be collected by videotaping the meetings. Although this is a highly intrusive method, it yields material useful for subsequent analyses.
3. Perchoc (2016) mentions the example of the International Relations Department of the College of Europe in Bruges.
4. Note that the success or failure of a simulation does not necessarily mean that participants managed, or failed, to reach an agreement. Success is a subjective notion that the instructor defines on the basis of prior criteria and learning objectives.
5. In their Model United Nations simulation programme, a member of the teaching team chairs the final conference “to maintain equity of opportunity in assessment … and to ensure adherence to the rules of procedure” (Obendorf and Randerson 2013, p. 357). They make a similar exception for the activities of the Secretariat. While those assigned these roles might enjoy an undue advantage (hence the need to mitigate or compensate for it in various ways, as explained in this chapter), such exceptions could be detrimental to the realism of the simulation itself, as they create an artificial subordination between different categories of players that has no equivalent in reality, as the authors themselves admit.
6. The details of this simulation game have been provided in the chapter on verisimilitude. The official page of the course is accessible here: https://www.coleurope.eu/course/settembri-p-hermanin-c-worth-j-negotiation-and-decision-making-eu-simulation-game-50h.
7. The combination of participation and a written contribution is common to other modules as well. For example, Obendorf and Randerson (2013) describe a formal assessment based on four components, with a similar articulation: a written country position paper (25%), participation in the simulation (35%), a binder of research sources (25%) and reflective essays (15%).
8. This assignment has been described in greater detail in the chapter concerning verisimilitude.
9. For a more detailed description of this tool, please refer to the chapter on verisimilitude.
10. In fact, the total pool consisted of 96 papers, but 12 of these were of a different nature: they were assignments for non-institutional actors (journalists, lobbyists, NGOs and other stakeholders), for whom a briefing was not a suitable assignment. These 12 assignments were assessed separately, following the same rationale as in the D-PAC tool. The analysis here focuses exclusively on the larger pool.
11. In fact, it is standard practice that a course is assessed before, and irrespective of, how students have been graded.
References
Andrich, D. (1982). An index of person separation in latent trait theory, the traditional KR-20 index, and the Guttman scale response pattern. Education Research and Perspectives, 9(1), 95–104.
Anshel, M. H., Kang, M., & Jubenville, C. (2013). Sources of acute sport stress scale for sports officials: Rasch calibration. Psychology of Sport and Exercise, 14(3), 362–370. https://doi.org/10.1016/j.psychsport.2012.12.003
Baranowski, M., & Weir, K. (2015). Political simulations: What we know, what we think we know, and what we still need to know. Journal of Political Science Education, 11(4), 391–403. https://doi.org/10.1080/15512169.2015.1065748
Bloxham, S., den-Outer, B., Hudson, J., & Price, M. (2016). Let’s stop the pretence of consistent marking: Exploring the multiple limitations of assessment criteria. Assessment & Evaluation in Higher Education, 41(3), 466–481. https://doi.org/10.1080/02602938.2015.1024607
Bramley, T. (2007). Paired comparison methods. In P. Newton, J.-A. Baird, H. Goldstein, H. Patrick, & P. Tymms (Eds.), Techniques for monitoring the comparability of examination standards (pp. 246–294). London: QCA.
Bramley, T. (2015). Investigating the reliability of adaptive comparative judgment (Cambridge Assessment Research Report). Cambridge: Cambridge Assessment. http://www.cambridgeassessment.org.uk/Images/232694-investigating-the-reliability-of-adaptive-comparative-judgment.pdf. Accessed 01 Dec 2016.
Chin, J., Dukes, R., & Gamson, W. (2009). Assessment in simulation and gaming: A review of the last 40 years. Simulation & Gaming, 40(4), 553–568. https://doi.org/10.1177/1046878109332955
Heldsinger, S., & Humphry, S. (2010). Using the method of pairwise comparison to obtain reliable teacher assessments. The Australian Educational Researcher, 37(2), 1–19. https://doi.org/10.1007/BF03216919
Heldsinger, S., & Humphry, S. (2013). Using calibrated exemplars in the teacher-assessment of writing: An empirical study. Educational Research, 55(3), 219–235. https://doi.org/10.1080/00131881.2013.825159
Jones, I., & Alcock, L. (2014). Peer assessment without assessment criteria. Studies in Higher Education, 39(10), 1774–1787. https://doi.org/10.1080/03075079.2013.821974
Jones, I., Inglis, M., Gilmore, C. K., & Hodgen, J. (2013). Measuring conceptual understanding: The case of fractions. https://dspace.lboro.ac.uk/dspace-jspui/handle/2134/12828. Accessed 01 Dec 2016.
Jones, I., Swan, M., & Pollitt, A. (2015). Assessing mathematical problem solving using comparative judgement. International Journal of Science and Mathematics Education, 13(1), 151–177. https://doi.org/10.1007/s10763-013-9497-6
Laming, D. (2003). Human judgment: The eye of the beholder. Andover: Cengage Learning EMEA.
McMahon, S., & Jones, I. (2015). A comparative judgement approach to teacher assessment. Assessment in Education: Principles, Policy & Practice, 22(3), 368–389. https://doi.org/10.1080/0969594X.2014.978839
Obendorf, S., & Randerson, C. (2013). Evaluating the Model United Nations: Diplomatic simulation as assessed undergraduate coursework. European Political Science, 12(3), 350–364. https://doi.org/10.1057/eps.2013.13
Perchoc, P. (2016). Les simulations européennes. Généalogie d’une adaptation au Collège d’Europe. Politique Européenne, 2016(2), 58–82.
Pollitt, A. (2012). The method of adaptive comparative judgement. Assessment in Education: Principles, Policy & Practice, 19(3), 281–300. https://doi.org/10.1080/0969594X.2012.665354
Raiser, S., Schneider, A., & Warkalla, B. (2015). Simulating Europe: Choosing the right learning objectives for simulation games. European Political Science, 14(3), 228–240. https://doi.org/10.1057/eps.2015.20
Raymond, C. (2010). Do role-playing simulations generate measurable and meaningful outcomes? A simulation’s effect on exam scores and teaching evaluations. International Studies Perspectives, 11(1), 51–60. https://doi.org/10.1111/j.1528-3585.2009.00392.x
Raymond, C., & Usherwood, S. (2013). Assessment in simulations. Journal of Political Science Education, 9(2), 157–167. https://doi.org/10.1080/15512169.2013.770984
Sadler, D. R. (2009). Indeterminacy in the use of preset criteria for assessment and grading. Assessment & Evaluation in Higher Education, 34(2), 159–179. https://doi.org/10.1080/02602930801956059
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34(4), 273–286.
Verhavert, S., De Maeyer, S., Donche, V., & Coertjens, L. (2016, November 3–5). Comparative judgement and scale separation reliability: Yes, but what does it mean? Paper presented at the 17th annual conference of the Association for Educational Assessment Europe, Limassol, Cyprus.
Whitehouse, C. (2012). Testing the validity of judgements about geography essays using the adaptive comparative judgement method. Manchester: AQA Centre for Education Research and Policy. https://cerp.aqa.org.uk/research-library/testing-validity-judgements-using-adaptive-comparative-judgement-method. Accessed 01 Dec 2016.
Whitehouse, C., & Pollitt, A. (2012). Using adaptive comparative judgement to obtain a highly reliable rank order in summative assessment. Manchester: AQA Centre for Education Research and Policy. https://cerp.aqa.org.uk/sites/default/files/pdf_upload/CERP_RP_CW_20062012_2.pdf. Accessed 01 Dec 2016.
Copyright information
© 2018 Springer International Publishing AG
Cite this chapter
Settembri, P., Van Gasse, R., Coertjens, L., De Maeyer, S. (2018). Oranges and Apples? Using Comparative Judgement for Reliable Briefing Paper Assessment in Simulation Games. In: Bursens, P., Donche, V., Gijbels, D., Spooren, P. (eds) Simulations of Decision-Making as Active Learning Tools. Professional and Practice-based Learning, vol 22. Springer, Cham. https://doi.org/10.1007/978-3-319-74147-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74146-8
Online ISBN: 978-3-319-74147-5
eBook Packages: Education (R0)