Abstract
Educators, institutions, and certification agencies often want to know if students are being evaluated appropriately and completely with regard to a standard. To help educators understand if examinations are well-balanced or topically correct, we explore the challenge of classifying exam questions into a concept hierarchy.
While the general problems of text classification and retrieval are widely studied, our domain is unusual in that the concept hierarchy is expert-built yet lacks the benefit of being a well-used knowledge base.
We propose a variety of approaches to this “small-scale” Information Retrieval challenge. We use an external corpus of Q&A data to expand concept descriptions, and propose a model that uses the hierarchy information effectively in conjunction with existing retrieval models. This new approach is more effective than typical unsupervised approaches and more robust to limited training data than commonly used text-classification or machine-learning methods.
In keeping with our goal of providing a service that helps educators better understand their exams, we also explore interactive methods, focusing on low-cost relevance feedback signals within the concept hierarchy to provide further gains in accuracy.
Notes
- 1. Even most standardized tests require test-takers to sign agreements not to distribute or mention the questions, even after the exam is taken.
- 2. User Fiire; http://chemistry.stackexchange.com/questions/4250. This example is displayed in lieu of the proprietary ACS data.
- 3.
- 4. The beta version of chemistry.stackexchange.com.
Acknowledgments
The authors thank Prof. Thomas Holme of Iowa State University’s Department of Chemistry for making the data used in this study available and Stephen Battisti of UMass’ Center for Educational Software Development for help accessing and formatting the data.
This work was supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant numbers IIS-0910884 and DUE-1323469. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Foley, J., Allan, J. (2016). Retrieving Hierarchical Syllabus Items for Exam Question Analysis. In: Ferro, N., et al. Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science(), vol 9626. Springer, Cham. https://doi.org/10.1007/978-3-319-30671-1_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30670-4
Online ISBN: 978-3-319-30671-1