

Journal of Information Science and Engineering, Vol. 26, No. 1, pp. 183-196


Variant Methods of Reduced Set Selection for Reduced Support Vector Machines


LI-JEN CHIEN, CHIEN-CHUNG CHANG AND YUH-JYE LEE
Department of Computer Science and Information Engineering 
National Taiwan University of Science and Technology 
Taipei, 106 Taiwan 
E-mail: {D8815002; D9115009; yuh-jye}@mail.ntust.edu.tw


    In dealing with large datasets, the reduced support vector machine (RSVM) was proposed to overcome computational difficulties as well as to reduce model complexity. In this paper, we propose two new approaches to generating a representative reduced set for RSVM. First, we introduce the Clustering Reduced Support Vector Machine (CRSVM), which builds the RSVM model via Gaussian (RBF) kernel construction. By applying a clustering algorithm to each class, we generate the cluster centroids of each class and use them to form the reduced set. We also estimate the approximate density of each cluster to determine the width parameter of the Gaussian kernel, which saves a great deal of tuning time. Second, we present the Systematic Sampling RSVM (SSRSVM), which incrementally selects informative data points to form the reduced set, whereas the original RSVM uses a random selection scheme. SSRSVM starts with an extremely small initial reduced set and, based on the current classifier, iteratively adds a portion of the misclassified points until the validation set accuracy is high enough. We also show that our methods, CRSVM and SSRSVM, achieve superior performance with a smaller reduced set than the original random selection scheme.
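    The following is a minimal Python sketch, not the authors' code, of the two reduced-set selection schemes outlined above. The function names, the per-cluster width heuristic, and the use of a linear SVM on rectangular RBF-kernel features as a stand-in for the RSVM solver are all assumptions made for illustration only.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC


def rbf_features(X, reduced_set, gamma):
    # Rectangular kernel K(X, reduced_set) with Gaussian (RBF) entries;
    # gamma may be a scalar or one width value per reduced-set point.
    d2 = ((X[:, None, :] - reduced_set[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)


def crsvm_reduced_set(X, y, clusters_per_class=5):
    # CRSVM-style selection (sketch): k-means centroids of each class form
    # the reduced set; a per-cluster kernel width is estimated from the
    # cluster spread as a crude density proxy (assumed heuristic).
    centroids, gammas = [], []
    for label in np.unique(y):
        Xc = X[y == label]
        km = KMeans(n_clusters=clusters_per_class, n_init=10).fit(Xc)
        for j, c in enumerate(km.cluster_centers_):
            members = Xc[km.labels_ == j]
            spread = np.mean(((members - c) ** 2).sum(axis=1)) + 1e-12
            centroids.append(c)
            gammas.append(1.0 / spread)
    return np.array(centroids), np.array(gammas)


def ssrsvm_reduced_set(X, y, X_val, y_val, init_size=5, add_per_round=5,
                       target_acc=0.95, gamma=1.0, max_rounds=20):
    # SSRSVM-style selection (sketch): start from a tiny random reduced set
    # and iteratively add some currently misclassified points until the
    # validation accuracy reaches the target.
    rng = np.random.default_rng(0)
    idx = list(rng.choice(len(X), size=init_size, replace=False))
    for _ in range(max_rounds):
        clf = LinearSVC(max_iter=5000).fit(rbf_features(X, X[idx], gamma), y)
        if clf.score(rbf_features(X_val, X[idx], gamma), y_val) >= target_acc:
            break
        wrong = np.flatnonzero(clf.predict(rbf_features(X, X[idx], gamma)) != y)
        wrong = np.setdiff1d(wrong, idx)
        if len(wrong) == 0:
            break
        idx.extend(rng.choice(wrong, size=min(add_per_round, len(wrong)),
                              replace=False))
    return X[np.array(idx)]

    In both sketches the reduced set only defines the columns of the rectangular kernel matrix; the classifier itself is trained on all data points, which is the key idea that keeps RSVM tractable on large datasets.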


Keywords: kernel methods, kernel width estimation, Nyström approximation, reduced set, sampling methods, support vector machines
