Journal of Information Science and Engineering, Vol. 35 No. 4, pp. 749-767

SRTM: A Sparse RNN-Topic Model for Discovering Bursty Topics in Big Data of Social Networks

Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia
School of Computer Science
Beijing University of Posts and Telecommunications
Beijing, 100876 P.R. China
E-mail: leikyshi@qq.com; {junpingdu; meiyu-1210; koufeifei000}@126.com

Social networks such as Twitter, Facebook, and Sina microblog have become major sources of big data and bursty topics. Because bursty topic discovery helps to guide public opinion and control network rumors, an effective method is needed to detect rapidly changing bursty topics. However, bursty topic discovery is challenging, mainly because big data in social networks is both high-dimensional and sparse. In this study, we propose a Sparse RNN-Topic Model (SRTM) to deal with this task. First, we leverage an RNN to learn the internal relationships between words, and IDF to measure high-frequency words. Second, the model treats bursty topics and common topics separately in order to detect how words vary over time. Third, we introduce a “Spike and Slab” prior to decouple the sparsity and smoothness of the topic distribution. The burstiness of word pairs is leveraged to achieve automatic bursty topic discovery. Finally, to verify the effectiveness of the proposed SRTM method, we collect a Sina microblog dataset and conduct various experiments. Both qualitative and quantitative evaluations demonstrate that the proposed SRTM method performs favorably against several state-of-the-art methods.
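
As a rough illustration of the IDF step mentioned in the abstract, the sketch below computes inverse document frequency over a handful of tokenized posts. The abstract does not state which IDF variant SRTM uses, so the common form idf(w) = log(N / df(w)) is assumed here; the function name `inverse_document_frequency` and the sample posts are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def inverse_document_frequency(docs):
    """Return {word: idf} for a list of tokenized documents.

    Assumes the textbook form idf(w) = log(N / df(w)), where N is the
    number of documents and df(w) is the number of documents containing w.
    This is an illustrative choice, not necessarily the variant used in SRTM.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        # Count each word at most once per document.
        doc_freq.update(set(tokens))
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


# Hypothetical toy posts, already tokenized.
posts = [
    ["earthquake", "city", "news"],
    ["earthquake", "rescue"],
    ["news", "weather"],
    ["news", "music"],
]
idf = inverse_document_frequency(posts)
```

Under this assumption, a word that appears in most posts (here "news") receives a low score, while a rare word (here "rescue") receives a high one, which is what lets a model down-weight high-frequency words.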

Keywords: social networks, bursty topic discovery, topic model, RNN, big data
