A New Lifelong Topic Modeling Method and Its Application to Vietnamese Text Multi-label Classification

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2018)

Abstract

Lifelong machine learning has emerged in recent years thanks to its ability to reuse knowledge from past tasks when solving a current problem. Lifelong topic modeling algorithms such as LTM and AMC have been proposed and have proven useful. However, these algorithms learn bias at the topic level rather than at the domain level. This paper proposes a lifelong topic modeling method that learns bias at the domain level, based on a proposed domain closeness measure, together with an application framework for multi-label classification of Vietnamese texts. Experiments on three previously solved Vietnamese text datasets and five current Vietnamese text datasets, combined with different topic set sizes, showed that the proposed method outperformed the AMC method in all cases.

Notes

  1. http://lifelongml.org/

Author information

Corresponding author

Correspondence to Tri-Thanh Nguyen.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Ha, QT. et al. (2018). A New Lifelong Topic Modeling Method and Its Application to Vietnamese Text Multi-label Classification. In: Nguyen, N., Hoang, D., Hong, TP., Pham, H., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2018. Lecture Notes in Computer Science, vol 10751. Springer, Cham. https://doi.org/10.1007/978-3-319-75417-8_19

  • DOI: https://doi.org/10.1007/978-3-319-75417-8_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75416-1

  • Online ISBN: 978-3-319-75417-8

  • eBook Packages: Computer Science, Computer Science (R0)
