Analyzing collaborative learning processes automatically: Exploiting the advances of computational linguistics in computer-supported collaborative learning

International Journal of Computer-Supported Collaborative Learning

Abstract

In this article we describe the emerging area of text classification research focused on the problem of collaborative learning process analysis, both from a broad perspective and more specifically in terms of a publicly available tool set called TagHelper tools. Analyzing the variety of pedagogically valuable facets of learners’ interactions is a time-consuming and effortful process. Improving automated analyses of such highly valued processes of collaborative learning by adapting and applying recent text classification technologies would make it a less arduous task to obtain insights from corpus data. This endeavor also holds the potential for enabling substantially improved on-line instruction, both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context-sensitive collaborative learning support on an as-needed basis. In this article, we report on an interdisciplinary research project that has been investigating the effectiveness of applying text classification technology to a large CSCL corpus that has been analyzed by human coders using a theory-based multi-dimensional coding scheme. We report promising results and include an in-depth discussion of important issues such as reliability, validity, and efficiency that should be considered when deciding on the appropriateness of adopting a new technology such as TagHelper tools. One major technical contribution of this work is a demonstration that an important part of making text classification technology effective for this purpose is designing and building linguistic pattern detectors, otherwise known as features, that can be extracted reliably from texts and that have high predictive power for the categories of discourse actions that the CSCL community is interested in.
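As a rough illustration of two ideas named in the abstract, the sketch below shows what a linguistic pattern detector (feature) might look like and how agreement between human and automatic codes can be measured with Cohen's kappa. It is not the TagHelper implementation: the regular expressions, the feature names, and the example labels are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern detectors: each feature name maps to a regular expression.
# These are illustrative only and are not taken from TagHelper tools.
PATTERNS = {
    "has_question": re.compile(r"\?"),
    "has_negation": re.compile(r"\b(?:not|no|never)\b|n't\b", re.IGNORECASE),
    "has_causal_marker": re.compile(r"\b(?:because|therefore|since)\b", re.IGNORECASE),
}


def extract_features(segment):
    """Return a binary feature vector (as a dict) for one text segment."""
    return {name: int(bool(rx.search(segment))) for name, rx in PATTERNS.items()}


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two codings."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of segments given the same code.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each coder's marginals.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category
    return (observed - expected) / (1.0 - expected)


# Made-up example: codes from a human coder and from an automatic classifier.
human_codes = ["claim", "claim", "question", "claim"]
auto_codes = ["claim", "question", "question", "claim"]
kappa = cohens_kappa(human_codes, auto_codes)
features = extract_features("Why is that not fair?")
```

In a full pipeline, feature vectors like these would be passed to a trained classifier, and kappa would be computed between the classifier's output and the human codes on held-out data.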

Notes

  1. TagHelper tools can be downloaded from http://www.cs.cmu.edu/~cprose/TagHelper.html.

Acknowledgement

This work has grown out of an initiative jointly organized by the American National Science Foundation and the Deutsche Forschungsgemeinschaft to bring together educational psychologists and technology experts from Germany and from the USA to build a new research network for technology-supported education. This work was supported by the National Science Foundation grant number SBE0354420 to the Pittsburgh Science of Learning Center, Office of Naval Research, Cognitive and Neural Sciences Division Grant N00014-05-1-0043, and the Deutsche Forschungsgemeinschaft. We would also like to thank Jaime Carbonell, William Cohen, Pinar Dönmez, Gahgene Gweon, Mahesh Joshi, Emil Albright, Edmund Huber, Rohit Kumar, Hao-Chuan Wang, Gerry Stahl, Hans Spada, Nikol Rummel, Kenneth Koedinger, Erin Walker, Bruce McLaren, Alexander Renkl, Matthias Nueckles, Rainer Bromme, Regina Jucks, Robert Kraut, and our very helpful anonymous reviewers for their contributions to this work.

Author information

Corresponding author

Correspondence to Carolyn Rosé.

About this article

Cite this article

Rosé, C., Wang, YC., Cui, Y. et al. Analyzing collaborative learning processes automatically: Exploiting the advances of computational linguistics in computer-supported collaborative learning. Computer Supported Learning 3, 237–271 (2008). https://doi.org/10.1007/s11412-007-9034-0

  • DOI: https://doi.org/10.1007/s11412-007-9034-0
