Topic Detection with Locally Weighted Semi-supervised Collective Learning

  • Conference paper
Web Information Systems Engineering – WISE 2017 (WISE 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10570)

Abstract

Topic detection and tracking (TDT) in modern media has changed substantially with the evolution of social networks and the often inconspicuous connections among participants in online communities. Rather than considering only the word features of the material under analysis, prevalent topic models increasingly detect and track topics in multi-relational data with auxiliary information, such as link structures and time series. In this paper, we use user groups extracted from Twitter as the social context accompanying the corresponding news articles, and we exploit the internal links among data points to develop non-negative factorization methods with semi-supervised information. A locally weighted scheme is applied to the original data points to differentiate nearby points according to their proximity, which yields a better approximation. We evaluate the proposed method on a synthetic data set as well as a real news data set combined with social information extracted from Twitter. The experimental results show that our method improves performance compared with the baseline methods.
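
The abstract does not state the update rules, so the following is only a minimal sketch of generic element-wise weighted non-negative matrix factorization with multiplicative updates. It is not the authors' locally weighted semi-supervised collective model. The function name `weighted_nmf`, the weight matrix `M`, and the per-row weights in the usage lines are illustrative assumptions.

```python
import numpy as np


def weighted_nmf(X, M, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate X by U @ V with U, V >= 0, minimizing sum(M * (X - U @ V)**2).

    X : (n, m) non-negative data matrix (e.g. documents x terms)
    M : (n, m) non-negative weights; larger values make an entry matter more
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, rank)) + eps   # strictly positive start
    V = rng.random((rank, m)) + eps
    MX = M * X                        # weighted data, reused in every update
    losses = []
    for _ in range(n_iter):
        # Multiplicative updates keep U and V non-negative.
        U *= (MX @ V.T) / ((M * (U @ V)) @ V.T + eps)
        V *= (U.T @ MX) / (U.T @ (M * (U @ V)) + eps)
        losses.append(float(np.sum(M * (X - U @ V) ** 2)))
    return U, V, losses


# Usage with hypothetical per-document weights (assumed, not from the paper):
# every entry in a row shares that row's weight.
rng = np.random.default_rng(1)
X = rng.random((6, 5))
row_weights = np.array([1.0, 1.0, 0.5, 0.5, 0.1, 0.1])
M = np.repeat(row_weights[:, None], X.shape[1], axis=1)
U, V, losses = weighted_nmf(X, M, rank=2)
```

In this sketch a low weight lets the factorization tolerate a larger reconstruction error on that row, which is one simple way to make some data points count less than others in the approximation.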

Acknowledgement

This work is supported in part by the National Key Fundamental Research and Development Program of China (No. 2013CB329601, No. 2013CB329604, No. 2013CB329606), the National Natural Science Foundation of China (No. 61502517, No. 61372191, No. 61572492), and an Australian Research Council project (DP140100841). This work is also funded by the major pre-research project of the National University of Defense Technology.

Author information

Corresponding author

Correspondence to Ye Wang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wang, Y., Quan, Y., Zhou, B., Zhang, Y., Peng, M. (2017). Topic Detection with Locally Weighted Semi-supervised Collective Learning. In: Bouguettaya, A., et al. Web Information Systems Engineering – WISE 2017. WISE 2017. Lecture Notes in Computer Science, vol 10570. Springer, Cham. https://doi.org/10.1007/978-3-319-68786-5_45

  • DOI: https://doi.org/10.1007/978-3-319-68786-5_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68785-8

  • Online ISBN: 978-3-319-68786-5

  • eBook Packages: Computer Science, Computer Science (R0)
