Abstract
Topic detection and tracking (TDT) in modern media has changed considerably with the growth of social networks and the often inconspicuous connections among participants in online communities. Rather than relying only on the word features of the material being analysed, recent topic models increasingly detect and track topics in multi-relational data with auxiliary information, such as link structures and time series. In this paper, we use user groups extracted from Twitter as the social context accompanying the corresponding news articles, and we exploit the internal links among data points to develop non-negative factorization methods with semi-supervised information. A locally weighted scheme is applied to the original data points to differentiate the proximity of nearby points and thereby obtain a better approximation. We evaluate the proposed method on a synthetic data set as well as a real news data set combined with social information extracted from Twitter. The experimental results show that our method improves performance compared with the baseline methods.
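To make the idea of a weighted non-negative factorization concrete, the following is a minimal sketch of generic weighted NMF with multiplicative updates. It is not the semi-supervised collective method proposed in the paper: the weight matrix `W`, the toy data `X`, and all function names are assumptions introduced here for illustration only. Each entry of `W` states how much the corresponding entry of `X` should count in the approximation `X ≈ U V`.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def hadamard(a, b):
    """Elementwise product of two equally shaped matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def weighted_error(X, W, U, V):
    """Weighted squared error: sum over i, j of W_ij * (X_ij - (UV)_ij)^2."""
    UV = matmul(U, V)
    return sum(w * (x - p) ** 2
               for rx, rw, rp in zip(X, W, UV)
               for x, w, p in zip(rx, rw, rp))

def scale(M, num, den, eps):
    """Multiplicative step: M * num / (den + eps), elementwise."""
    return [[m * a / (b + eps) for m, a, b in zip(rm, ra, rb)]
            for rm, ra, rb in zip(M, num, den)]

def weighted_nmf(X, W, rank, iters=200, seed=0, eps=1e-9):
    """Factorize X (n x m) into non-negative U (n x rank) and V (rank x m)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    U = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    V = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    WX = hadamard(W, X)
    for _ in range(iters):
        # U <- U * ((W*X) V^T) / ((W*(UV)) V^T), with * elementwise
        Vt = transpose(V)
        U = scale(U, matmul(WX, Vt), matmul(hadamard(W, matmul(U, V)), Vt), eps)
        # V <- V * (U^T (W*X)) / (U^T (W*(UV)))
        Ut = transpose(U)
        V = scale(V, matmul(Ut, WX), matmul(Ut, hadamard(W, matmul(U, V))), eps)
    return U, V

# Toy data (hypothetical): two blocks of co-occurring features.
X = [[5.0, 4.0, 0.0, 0.0],
     [4.0, 5.0, 0.0, 1.0],
     [0.0, 0.0, 3.0, 4.0],
     [0.0, 1.0, 4.0, 3.0]]
# Hypothetical weights: trust most entries fully, down-weight two of them.
W = [[1.0, 1.0, 1.0, 1.0],
     [1.0, 1.0, 1.0, 0.2],
     [1.0, 1.0, 1.0, 1.0],
     [1.0, 0.2, 1.0, 1.0]]

U0, V0 = weighted_nmf(X, W, rank=2, iters=0)   # random initialization only
U, V = weighted_nmf(X, W, rank=2, iters=200)   # same seed, after the updates
err_before = weighted_error(X, W, U0, V0)
err_after = weighted_error(X, W, U, V)
```

In a locally weighted setting, the entries of `W` would be derived from the proximity between data points rather than set by hand as in this toy example; how the paper computes them is not stated in the abstract.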
Acknowledgement
This work is supported in part by the National Key Fundamental Research and Development Program of China (No. 2013CB329601, No. 2013CB329604, No. 2013CB329606), the National Natural Science Foundation of China (No. 61502517, No. 61372191, No. 61572492) and an Australian Research Council Project (DP140100841). This work is also funded by the major pre-research project of the National University of Defense Technology.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wang, Y., Quan, Y., Zhou, B., Zhang, Y., Peng, M. (2017). Topic Detection with Locally Weighted Semi-supervised Collective Learning. In: Bouguettaya, A., et al. Web Information Systems Engineering – WISE 2017. WISE 2017. Lecture Notes in Computer Science, vol 10570. Springer, Cham. https://doi.org/10.1007/978-3-319-68786-5_45
DOI: https://doi.org/10.1007/978-3-319-68786-5_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68785-8
Online ISBN: 978-3-319-68786-5
eBook Packages: Computer Science, Computer Science (R0)