

Journal of Information Science and Engineering, Vol. 30, No. 5, pp. 1585-1600


Bayesian Bridging Topic Models for Classification


MENG-SUNG WU
Computational Intelligence Technology Center
Industrial Technology Research Institute
Hsinchu, 310 Taiwan

 


    We study the problem of constructing topic-based models across different domains for text classification. In real-world applications, unlabeled documents are abundant while labeled documents are sparse, and it is challenging to construct a reliable and adaptive model that classifies a large number of documents drawn from different domains. Classifiers trained on a source domain often perform poorly on test data from a target domain, and the trained model also suffers from weak discrimination among ambiguous classes. In this study, we tackle the issues of domain mismatch and confusing classes and conduct discriminative transfer learning for text classification. We propose Bayesian bridging topic models (BTM) learned from a variety of labeled and unlabeled documents and perform transfer learning for cross-domain text classification. A structural model is built, and its parameters are estimated by maximizing the joint marginal likelihood of labeled and unlabeled data via a variational inference procedure. We further apply discriminative learning to the proposed model, adjusting its parameters with the minimum classification error criterion. Experiments show that the proposed model achieves better cross-domain text classification performance than competing models.
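    To make the general workflow concrete, the following is a minimal, hypothetical sketch of topic-model-based cross-domain classification in the spirit described above. It is not the authors' BTM: it uses scikit-learn's LatentDirichletAllocation (fitted by variational Bayes) on the union of labeled source documents and unlabeled target documents, then trains a classifier on the inferred topic proportions. All corpora, labels, and parameter settings below are illustrative assumptions.

```python
# Hypothetical sketch: topic features shared across domains as a "bridge"
# for cross-domain text classification. Not the BTM model from the paper.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

# Toy corpora standing in for a labeled source domain and an
# unlabeled target domain (illustrative only).
source_docs = ["stock market prices rise", "team wins the football match",
               "shares fall on weak earnings", "coach praises the players"]
source_labels = [0, 1, 0, 1]          # 0 = finance, 1 = sports
target_docs = ["bond yields climb sharply", "striker scores twice in final"]

# A shared vocabulary over both domains.
vectorizer = CountVectorizer()
X_all = vectorizer.fit_transform(source_docs + target_docs)

# Topic model fitted jointly on labeled + unlabeled data, so the topics
# are estimated from the combined corpus via variational inference.
lda = LatentDirichletAllocation(n_components=2, learning_method="batch",
                                random_state=0)
theta_all = lda.fit_transform(X_all)   # per-document topic proportions

# Train a classifier on topic proportions of the labeled source documents,
# then predict labels for the unlabeled target documents.
n_src = len(source_docs)
clf = LogisticRegression().fit(theta_all[:n_src], source_labels)
print(clf.predict(theta_all[n_src:]))
```

    The sketch omits the discriminative adjustment step; in the paper the topic-model parameters are further tuned under the minimum classification error criterion rather than being fixed after the generative fit.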


Keywords: transfer learning, topic model, cross-domain classification, latent Dirichlet allocation, Bayesian
