
Retrieving Hierarchical Syllabus Items for Exam Question Analysis

  • Conference paper
Advances in Information Retrieval (ECIR 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9626)


Abstract

Educators, institutions, and certification agencies often want to know whether students are being evaluated appropriately and completely with regard to a standard. To help educators understand whether examinations are well balanced and topically correct, we explore the challenge of classifying exam questions into a concept hierarchy.

While the general problems of text classification and retrieval are widely studied, our domain is particularly unusual: the concept hierarchy is expert-built but lacks the benefit of being a well-used knowledge base.

We propose a variety of approaches to this “small-scale” information retrieval challenge. We use an external corpus of Q&A data to expand concepts, and propose a model that uses the hierarchy information effectively in conjunction with existing retrieval models. This new approach is more effective than typical unsupervised approaches and more robust to limited training data than commonly used text classification or machine learning methods.
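As an illustrative sketch only (not the paper's exact model), hierarchy information can be combined with a standard query-likelihood retrieval model by smoothing each syllabus node's unigram language model with its parent's. All node texts, function names, and the smoothing weight `lam` below are hypothetical:

```python
import math
from collections import Counter

def mle(text):
    """Maximum-likelihood unigram model from whitespace tokens."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}

def smoothed_lm(node_text, parent_lm, lam=0.5):
    """Interpolate a node's MLE model with its parent's model
    (Jelinek-Mercer smoothing down the hierarchy)."""
    own = mle(node_text)
    vocab = set(own) | set(parent_lm)
    return {w: lam * own.get(w, 0.0) + (1 - lam) * parent_lm.get(w, 0.0)
            for w in vocab}

def query_likelihood(query, lm, eps=1e-9):
    """Log query likelihood; unseen terms get a tiny floor probability."""
    return sum(math.log(lm.get(w, eps)) for w in query.lower().split())

# Toy two-level syllabus: a root concept with two leaf concepts.
root_lm = mle("general chemistry concepts")
leaves = {
    "atomic structure": smoothed_lm("atoms electrons orbitals structure", root_lm),
    "acid-base": smoothed_lm("acids bases equilibrium titration", root_lm),
}
question = "acids bases equilibrium"
best = max(leaves, key=lambda k: query_likelihood(question, leaves[k]))
```

Here `best` is the syllabus leaf whose hierarchy-smoothed model best explains the exam question; in this framing, document expansion from an external Q&A corpus would amount to concatenating related Q&A text onto `node_text` before estimating each model.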

In keeping with the goal of providing a service to educators for better understanding their exams, we also explore interactive methods, focusing on low-cost relevance feedback signals within the concept hierarchy to provide further gains in accuracy.
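One low-cost feedback signal of the kind described above can be sketched as follows: if an educator confirms a coarse ancestor concept, candidates outside that subtree can be pruned before re-ranking. The hierarchy, node names, and function names here are hypothetical illustrations, not the paper's interface:

```python
def subtree(parent_of, confirmed):
    """All concept ids at or below a confirmed ancestor node.
    `parent_of` maps each node id to its parent id (root maps to None)."""
    keep = {confirmed}
    grew = True
    while grew:
        grew = False
        for node, parent in parent_of.items():
            if parent in keep and node not in keep:
                keep.add(node)
                grew = True
    return keep

def filter_ranking(ranked, parent_of, confirmed):
    """Drop ranked candidates outside the confirmed subtree, preserving order."""
    allowed = subtree(parent_of, confirmed)
    return [node for node in ranked if node in allowed]

# Toy hierarchy: root -> {stoichiometry, thermodynamics}, each with one leaf.
parent_of = {
    "root": None,
    "stoichiometry": "root",
    "thermodynamics": "root",
    "limiting reagents": "stoichiometry",
    "enthalpy": "thermodynamics",
}
ranked = ["enthalpy", "limiting reagents", "thermodynamics"]
refined = filter_ranking(ranked, parent_of, "stoichiometry")
```

A single click on an ancestor node thus narrows the candidate set without requiring the educator to label individual leaf concepts.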


Notes

  1. Most standardized tests require test-takers to sign agreements not to distribute or even mention the questions after the exam is taken.

  2. User Fiire; http://chemistry.stackexchange.com/questions/4250. This example is shown in lieu of the proprietary ACS data.

  3. http://lemurproject.org/galago.php.

  4. The beta version of chemistry.stackexchange.com.


Acknowledgments

The authors thank Prof. Thomas Holme of Iowa State University’s Department of Chemistry for making the data used in this study available and Stephen Battisti of UMass’ Center for Educational Software Development for help accessing and formatting the data.

This work was supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant numbers IIS-0910884 and DUE-1323469. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors.

Author information

Correspondence to John Foley.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Foley, J., Allan, J. (2016). Retrieving Hierarchical Syllabus Items for Exam Question Analysis. In: Ferro, N., et al. (eds.) Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol 9626. Springer, Cham. https://doi.org/10.1007/978-3-319-30671-1_42

  • DOI: https://doi.org/10.1007/978-3-319-30671-1_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30670-4

  • Online ISBN: 978-3-319-30671-1

  • eBook Packages: Computer Science; Computer Science (R0)
