
Journal of Information Science and Engineering, Vol. 34 No. 2, pp. 535-550

A Game Theory based Feature Word Selection Model for Chinese Texts

1School of Computer Science and Technology
Xi'an University of Posts and Telecommunications
Xi'an, 710121 P.R. China

2School of Computer and Communication
Lanzhou University of Technology
Lanzhou, 730050 P.R. China
E-mail: sunjingtao@xupt.edu.cn; zhangqylz@163.com

Feature word selection plays an important role in the classification of Chinese texts, and its result directly influences classification precision. Existing methods are generally deficient in processing the fuzzy and uncertain information contained in natural language. To overcome this deficiency, this paper proposes a novel game theory based feature word selection model for Chinese texts. The model applies game theory to the selection of feature words: using the combined contribution functions of the feature subsets to text classification and the fuzzy membership functions of samples defined by compatibility measurement, it constructs compatibility feature payoff functions in order to select the optimal feature subset at Nash equilibrium. Comparative experiments on datasets from the CDSCE corpus validate that the proposed model performs effective feature word selection for spam email, and that its generalization performance is better than that of other commonly used feature word selection methods.

Keywords: game theory, Chinese feature word selection, compatibility measurement, fuzzy membership function, combined contribution function
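
To make the idea of selecting a feature subset at a Nash equilibrium concrete, the following is a minimal toy sketch and not the model defined in the paper. It assumes that each candidate word is a player choosing "in" or "out", that a word's payoff for being "in" is a membership-weighted relevance score minus a redundancy penalty and a fixed cost, and that staying "out" pays zero. The corpus, the functions relevance, overlap and payoff_in, and the parameters lam and cost are all hypothetical stand-ins for the paper's combined contribution and compatibility feature payoff functions, which the abstract does not specify.

```python
# Toy corpus (hypothetical): (words in the message, label 1=spam / 0=ham,
# fuzzy membership of the sample in its labelled class).
DOCS = [
    ({"free", "prize", "click"}, 1, 0.9),
    ({"free", "prize", "offer"}, 1, 0.8),
    ({"click", "offer", "meeting"}, 1, 0.4),
    ({"meeting", "report"}, 0, 0.9),
    ({"report", "schedule", "meeting"}, 0, 0.7),
    ({"schedule", "free"}, 0, 0.5),
]


def relevance(word, docs):
    """Assumed relevance: gap between membership-weighted occurrence rates per class."""
    rate = {}
    for label in (0, 1):
        total = sum(mu for _, y, mu in docs if y == label)
        hits = sum(mu for ws, y, mu in docs if y == label and word in ws)
        rate[label] = hits / total
    return abs(rate[1] - rate[0])


def overlap(a, b, docs):
    """Assumed redundancy: Jaccard overlap of the document sets containing a and b."""
    da = {i for i, (ws, _, _) in enumerate(docs) if a in ws}
    db = {i for i, (ws, _, _) in enumerate(docs) if b in ws}
    union = da | db
    return len(da & db) / len(union) if union else 0.0


def payoff_in(word, selected, docs, lam, cost):
    """Payoff of playing 'in' given the other players' choices ('out' pays 0)."""
    others = selected - {word}
    redundancy = sum(overlap(word, v, docs) for v in others)
    return relevance(word, docs) - lam * redundancy - cost


def best_response_selection(docs, lam=0.5, cost=0.3, max_rounds=100):
    """Iterate best responses until no word wants to switch (a pure Nash equilibrium)."""
    vocab = sorted(set().union(*(ws for ws, _, _ in docs)))
    selected = set()
    for _ in range(max_rounds):
        changed = False
        for w in vocab:
            want_in = payoff_in(w, selected, docs, lam, cost) > 0
            if want_in != (w in selected):
                selected ^= {w}  # flip this player's strategy
                changed = True
        if not changed:
            return selected
    raise RuntimeError("best-response dynamics did not settle")


def is_nash(selected, docs, lam, cost):
    """Check that no single word gains by unilaterally switching in or out."""
    vocab = set().union(*(ws for ws, _, _ in docs))
    return all(
        (payoff_in(w, selected, docs, lam, cost) > 0) == (w in selected)
        for w in vocab
    )


if __name__ == "__main__":
    chosen = best_response_selection(DOCS)
    print(sorted(chosen), is_nash(chosen, DOCS, 0.5, 0.3))
```

Because each word's gain from switching equals the change in one shared score (total relevance, minus lam times the pairwise overlaps, minus cost per selected word), this toy game is a potential game, so the best-response loop settles on a subset from which no word benefits by deviating alone. That is the sense of "Nash equilibrium" used in this sketch; the paper's own payoff construction may differ.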
