

Journal of Information Science and Engineering, Vol. 32 No. 6, pp. 1613-1634


Multi-document Summarization using Probabilistic Topic-based Network Models


CHENG-ZEN YANG, JHIH-SHANG FAN AND YU-FAN LIU
Department of Computer Science and Engineering
Yuan Ze University
Chungli, 32003 Taiwan
E-mail: czyang@syslab.cse.yzu.edu.tw; {s1003305, s1001447}@mail.yzu.edu.tw


    Multi-document summarization has attracted considerable attention in text summarization research. Probabilistic topic models and network models have both been used to generate summaries, but previous studies have not investigated how different topic models and network models perform in combination. This paper describes an integrated approach that considers both probabilistic topic models and network models. Two probabilistic topic models and four network models are investigated. We conducted experiments on the DUC 2004-2007 datasets to evaluate the effectiveness of the proposed approach and made a systematic comparison between two representative topic models, PLSA and LDA. The results show that the PLSA-based network approach outperforms the TF-IDF baseline on all datasets. Moreover, PLSA achieves better ROUGE performance than LDA for multi-document summarization.


Keywords: multi-document summarization, probabilistic topic models, network models, extraction-based summarization, performance evaluation