Exploring N-gram Features in Clickstream Data for MOOC Learning Achievement Prediction

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10179)

Abstract

Massive open online courses (MOOCs) have emerged in recent years as a new online educational model. With the development of big data technology, MOOC platforms can now mine huge amounts of learning behavior data. Using machine learning to predict learners’ future achievement from their past clickstream data has recently become an active research topic. Previous methods consider only static counting-based features and ignore the correlative, temporal and fragmented nature of MOOC learning behavior, which limits both their interpretability and their prediction accuracy. In this paper, we explore the effectiveness of N-gram features in clickstream data and model MOOC learning achievement prediction as a multiclass classification task that assigns learners to one of four achievement levels. Through extensive experiments on four real-world MOOC datasets, we empirically demonstrate that our methods significantly outperform state-of-the-art methods.
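
To make the idea of N-gram features in clickstream data concrete, the following is a minimal sketch and not the authors' implementation. It assumes each learner's clickstream has already been reduced to an ordered list of event labels; the labels (`video`, `quiz`, `forum`) and the function names `ngram_counts`, `build_vocabulary` and `featurize` are hypothetical. Unlike plain per-event counts, the n-gram counts retain some of the order in which actions occurred.

```python
from collections import Counter
from typing import List, Sequence, Tuple

NGram = Tuple[str, ...]


def ngram_counts(events: Sequence[str], n: int) -> Counter:
    """Count contiguous n-grams in one learner's ordered event sequence."""
    if n < 1:
        raise ValueError("n must be >= 1")
    # A sequence shorter than n yields an empty range and an empty Counter.
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))


def build_vocabulary(sequences: Sequence[Sequence[str]], max_n: int) -> List[NGram]:
    """Collect every 1..max_n-gram seen in any sequence, in a fixed sorted order."""
    vocab = set()
    for events in sequences:
        for n in range(1, max_n + 1):
            vocab.update(ngram_counts(events, n))
    return sorted(vocab)


def featurize(events: Sequence[str], vocab: Sequence[NGram], max_n: int) -> List[int]:
    """Map one event sequence to a count vector aligned with the vocabulary."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngram_counts(events, n))
    return [counts[gram] for gram in vocab]


if __name__ == "__main__":
    # Hypothetical event labels; real platforms log far richer click types.
    learners = [
        ["video", "quiz", "video", "quiz", "forum"],
        ["forum", "video"],
    ]
    vocab = build_vocabulary(learners, max_n=2)
    for events in learners:
        print(featurize(events, vocab, max_n=2))
```

Each resulting vector could then be passed, together with an achievement-level label, to any multiclass classifier; which classifier the paper uses is not stated in the abstract.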

Notes

  1. https://www.coursera.org.

  2. https://www.edx.org.

  3. https://www.udacity.com.

  4. The minimal progress marks for attending final examinations are usually set in the range from 20% to 30% in Mengke. They can be changed by instructors before courses are opened.

Acknowledgments

We would like to thank the anonymous reviewers, whose helpful comments improved this paper. We also acknowledge the support of the Key Program of the National Natural Science Foundation of China (61432020, 61532001) and the Research Fund of the Research Center for Online Education of the Ministry of Education of China (2016YB150, 2016YB151).

Author information

Corresponding author

Correspondence to Xiao Li.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Li, X., Wang, T., Wang, H. (2017). Exploring N-gram Features in Clickstream Data for MOOC Learning Achievement Prediction. In: Bao, Z., Trajcevski, G., Chang, L., Hua, W. (eds) Database Systems for Advanced Applications. DASFAA 2017. Lecture Notes in Computer Science, vol 10179. Springer, Cham. https://doi.org/10.1007/978-3-319-55705-2_26

  • DOI: https://doi.org/10.1007/978-3-319-55705-2_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55704-5

  • Online ISBN: 978-3-319-55705-2

  • eBook Packages: Computer Science, Computer Science (R0)
