Abstract
Achieving a fair and rigorous assessment of participants in simulation games is a major challenge. The difficulty applies not only to the negotiation itself but also to the written assignments that typically accompany a simulation. For one thing, if different raters are involved, it is important to ensure that differences in severity do not affect the grades. Recently, comparative judgement (CJ) has been introduced as a method that allows for team-based grading. This chapter discusses the potential of comparative judgement for assessing briefing papers from 84 students. Four assessors completed 622 comparisons in the Digital Platform for the Assessment of Competences (D-PAC) tool. Results indicate a reliability level of 0.71 for the final rank order, which required a time investment of roughly 10.5 h from the team of assessors. Moreover, there was no evidence of bias towards the most important roles in the simulation game. The study also details how the obtained rank orders were translated into grades, ranging from 11 to 17 out of 20. These elements showcase CJ’s advantage in reaching adequate reliability levels for briefing papers in an efficient manner.
Pierpaolo Settembri writes in a personal capacity, and the views he expresses in this publication may not in any circumstances be regarded as stating an official position of the European Commission.
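To make the reported figures concrete, the sketch below illustrates, under stated assumptions, the kind of computation involved: fitting a latent quality scale to pairwise comparisons with the Bradley-Terry-Luce model commonly used in CJ tools such as D-PAC, computing the scale separation reliability that the 0.71 figure refers to, and rescaling the estimates onto a grade band. The function names (fit_btl, ssr, to_grades) and the linear grade mapping are illustrative assumptions, not the chapter’s actual procedure or the D-PAC implementation.

```python
import numpy as np

def fit_btl(n_items, comparisons, n_iter=500, lr=0.05):
    """Fit Bradley-Terry-Luce quality logits by gradient ascent.

    comparisons: iterable of (winner, loser) item-index pairs,
    e.g. 622 judgements over 84 briefing papers.
    """
    theta = np.zeros(n_items)
    for _ in range(n_iter):
        grad = np.zeros(n_items)
        for w, l in comparisons:
            # P(winner beats loser) under the current estimates
            p = 1.0 / (1.0 + np.exp(theta[l] - theta[w]))
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        theta += lr * grad
        theta -= theta.mean()  # centre the scale for identifiability
    return theta

def ssr(theta, comparisons):
    """Scale separation reliability: the share of observed variance
    in the estimates that is not measurement error."""
    info = np.zeros(len(theta))  # Fisher information per item
    for w, l in comparisons:
        p = 1.0 / (1.0 + np.exp(theta[l] - theta[w]))
        info[w] += p * (1.0 - p)
        info[l] += p * (1.0 - p)
    se2 = 1.0 / info  # squared standard errors (assumes every item
                      # appears in at least one comparison)
    obs_var = theta.var(ddof=1)
    return (obs_var - se2.mean()) / obs_var

def to_grades(theta, lo=11, hi=17):
    """Hypothetical linear mapping of the logits onto a grade band."""
    unit = (theta - theta.min()) / (theta.max() - theta.min())
    return np.round(lo + unit * (hi - lo)).astype(int)
```

The chapter details its own procedure for translating the rank order into grades; the linear rescaling above is only one plausible formalisation, not a description of that procedure.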
Notes
1. Raymond and Usherwood (2013, p. 4) put it extremely clearly: “University faculty must ask themselves what a simulation adds to a student’s knowledge base that cannot be learned more efficiently in a traditional classroom setting, and how this can be measured”. Baranowski and Weir (2015) offer an in-depth review of the literature evaluating the effects of simulations, concluding that “a small but growing body of evidence lends support to the contention that students who participate in simulations do in fact learn more than students not taking part in this exercise”. For a different outcome, see Raymond (2010).
2. Alternatively, the data for the assessment can be collected by videotaping the meetings. Although this is a highly intrusive method, it yields material useful for subsequent analyses.
3. Perchoc (2016) mentions the example of the International Relations Department of the College of Europe in Bruges.
4. Note that the success or failure of a simulation does not necessarily mean that participants managed, or failed, to reach an agreement. Success is a subjective notion that the instructor defines on the basis of prior criteria and learning objectives.
5. In their Model United Nations simulation programme, a member of the teaching team chairs the final conference “to maintain equity of opportunity in assessment … and to ensure adherence to the rules of procedure” (Obendorf and Randerson 2013, p. 357). They make a similar exception for the activities of the Secretariat. While those assigned these roles might enjoy an undue advantage (hence the need to mitigate or compensate for it in various ways, as explained in this chapter), such exceptions could be detrimental to the realism of the simulation itself, as they create an artificial subordination between different categories of players that has no equivalent in reality, as the authors themselves admit.
6. The details of this simulation game have been provided in the chapter on verisimilitude. The official page of the course is accessible here: https://www.coleurope.eu/course/settembri-p-hermanin-c-worth-j-negotiation-and-decision-making-eu-simulation-game-50h.
7. The combination of participation and a written contribution is common to other modules as well. For example, Obendorf and Randerson (2013) describe a formal assessment based on four components, with a similar articulation: a written country position paper (25%), participation in the simulation (35%), a binder of research sources (25%) and reflective essays (15%).
8. This assignment has been described in greater detail in the chapter concerning verisimilitude.
9. For a more detailed description of this tool, please refer to the chapter on verisimilitude.
10. In fact, the total pool consisted of 96 papers, but 12 of these were of a different nature: they were assignments for non-institutional actors (journalists, lobbyists, NGOs and other stakeholders), for whom a briefing was not a suitable assignment. These 12 assignments were assessed separately, following the same rationale as in the D-PAC tool. The analysis here focuses exclusively on the larger pool.
11. In fact, it is standard practice that a course is assessed before, and irrespective of, how students have been graded.
References
Andrich, D. (1982). An index of person separation in latent trait theory, the traditional KR-20 index, and the Guttman scale response pattern. Education Research and Perspectives, 9(1), 95–104.
Anshel, M. H., Kang, M., & Jubenville, C. (2013). Sources of acute sport stress scale for sports officials: Rasch calibration. Psychology of Sport and Exercise, 14(3), 362–370. https://doi.org/10.1016/j.psychsport.2012.12.003
Baranowski, M., & Weir, K. (2015). Political simulations: What we know, what we think we know, and what we still need to know. Journal of Political Science Education, 11(4), 391–403. https://doi.org/10.1080/15512169.2015.1065748
Bloxham, S., den-Outer, B., Hudson, J., & Price, M. (2016). Let’s stop the pretence of consistent marking: Exploring the multiple limitations of assessment criteria. Assessment & Evaluation in Higher Education, 41(3), 466–481. https://doi.org/10.1080/02602938.2015.1024607
Bramley, T. (2007). Paired comparison methods. In P. Newton, J.-A. Baird, H. Goldstein, H. Patrick, & P. Tymms (Eds.), Techniques for monitoring the comparability of examination standards (pp. 246–294). London: QCA.
Bramley, T. (2015). Investigating the reliability of adaptive comparative judgment (Cambridge Assessment Research Report). Cambridge: Cambridge Assessment. http://www.cambridgeassessment.org.uk/Images/232694-investigating-the-reliability-of-adaptive-comparative-judgment.pdf. Accessed 01 Dec 2016.
Chin, J., Dukes, R., & Gamson, W. (2009). Assessment in simulation and gaming: A review of the last 40 years. Simulation & Gaming, 40(4), 553–568. https://doi.org/10.1177/1046878109332955
Heldsinger, S., & Humphry, S. (2010). Using the method of pairwise comparison to obtain reliable teacher assessments. The Australian Educational Researcher, 37(2), 1–19. https://doi.org/10.1007/BF03216919
Heldsinger, S., & Humphry, S. (2013). Using calibrated exemplars in the teacher-assessment of writing: An empirical study. Educational Research, 55(3), 219–235. https://doi.org/10.1080/00131881.2013.825159
Jones, I., & Alcock, L. (2014). Peer assessment without assessment criteria. Studies in Higher Education, 39(10), 1774–1787. https://doi.org/10.1080/03075079.2013.821974
Jones, I., Inglis, M., Gilmore, C. K., & Hodgen, J. (2013). Measuring conceptual understanding: The case of fractions. https://dspace.lboro.ac.uk/dspace-jspui/handle/2134/12828. Accessed 01 Dec 2016.
Jones, I., Swan, M., & Pollitt, A. (2015). Assessing mathematical problem solving using comparative judgement. International Journal of Science and Mathematics Education, 13(1), 151–177. https://doi.org/10.1007/s10763-013-9497-6
Laming, D. (2003). Human judgment: The eye of the beholder. Andover: Cengage Learning EMEA.
McMahon, S., & Jones, I. (2015). A comparative judgement approach to teacher assessment. Assessment in Education: Principles, Policy & Practice, 22(3), 368–389. https://doi.org/10.1080/0969594X.2014.978839
Obendorf, S., & Randerson, C. (2013). Evaluating the Model United Nations: Diplomatic simulation as assessed undergraduate coursework. European Political Science, 12(3), 350–364. https://doi.org/10.1057/eps.2013.13
Perchoc, P. (2016). Les simulations européennes. Généalogie d’une adaptation au Collège d’Europe. Politique Européenne, 2016(2), 58–82.
Pollitt, A. (2012). The method of adaptive comparative judgement. Assessment in Education: Principles, Policy & Practice, 19(3), 281–300. https://doi.org/10.1080/0969594X.2012.665354
Raiser, S., Schneider, A., & Warkalla, B. (2015). Simulating Europe: Choosing the right learning objectives for simulation games. European Political Science, 14(3), 228–240. https://doi.org/10.1057/eps.2015.20
Raymond, C. (2010). Do role-playing simulations generate measurable and meaningful outcomes? A simulation’s effect on exam scores and teaching evaluations. International Studies Perspectives, 11(1), 51–60. https://doi.org/10.1111/j.1528-3585.2009.00392.x
Raymond, C., & Usherwood, S. (2013). Assessment in simulations. Journal of Political Science Education, 9(2), 157–167. https://doi.org/10.1080/15512169.2013.770984
Sadler, D. R. (2009). Indeterminacy in the use of preset criteria for assessment and grading. Assessment & Evaluation in Higher Education, 34(2), 159–179. https://doi.org/10.1080/02602930801956059
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34(4), 273–286.
Verhavert, S., De Maeyer, S., Donche, V., & Coertjens, L. (2016, November 3–5). Comparative judgement and scale separation reliability: Yes, but what does it mean? Paper presented at the 17th annual conference of the Association for Educational Assessment Europe, Limassol, Cyprus.
Whitehouse, C. (2012). Testing the validity of judgements about geography essays using the adaptive comparative judgement method. Manchester: AQA Centre for Education Research and Policy. https://cerp.aqa.org.uk/research-library/testing-validity-judgements-using-adaptive-comparative-judgement-method. Accessed 01 Dec 2016.
Whitehouse, C., & Pollitt, A. (2012). Using adaptive comparative judgement to obtain a highly reliable rank order in summative assessment. Manchester: AQA Centre for Education Research and Policy. https://cerp.aqa.org.uk/sites/default/files/pdf_upload/CERP_RP_CW_20062012_2.pdf. Accessed 01 Dec 2016.
Copyright information
© 2018 Springer International Publishing AG
Cite this chapter
Settembri, P., Van Gasse, R., Coertjens, L., De Maeyer, S. (2018). Oranges and Apples? Using Comparative Judgement for Reliable Briefing Paper Assessment in Simulation Games. In: Bursens, P., Donche, V., Gijbels, D., Spooren, P. (eds) Simulations of Decision-Making as Active Learning Tools. Professional and Practice-based Learning, vol 22. Springer, Cham. https://doi.org/10.1007/978-3-319-74147-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74146-8
Online ISBN: 978-3-319-74147-5
eBook Packages: Education (R0)