

Journal of Information Science and Engineering, Vol. 20 No. 5, pp. 903-922

Classification of Chinese Characters Using Pseudo Skeleton Features

Ming-Gang Wen*⁺, Kuo-Chin Fan* and Chin-Chuan Han
*Institute of Computer Science and Information Engineering
National Central University
Chungli, 320 Taiwan
E-mail: kcfan@csie.ncu.edu.tw
⁺Department of Information Management
Department of Computer Science and Information Engineering
National United University
Miaoli, 360 Taiwan
E-mail: mgwen@nuu.edu.tw

In this paper, we present a novel method for classifying machine-printed Chinese characters by matching code strings generated from pseudo skeleton features. In our approach, pseudo skeletons of Chinese characters are extracted instead of the skeletons produced by traditional thinning algorithms. The pseudo skeleton features of both the input and template characters are then encoded into two code strings. Finally, the edit-distance algorithm is employed to compute the similarity between the two characters from their encoded strings. The main contribution of this paper is the effective classification of multi-font Chinese characters using a single-font reference database. Experiments were conducted on 5401 commonly used Chinese characters in various fonts and sizes. The experimental results demonstrate the validity and efficiency of the proposed method for classifying Chinese characters.
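
The matching step described above can be illustrated with a minimal sketch. It assumes the standard Levenshtein edit distance with unit costs for insertion, deletion, and substitution; the abstract does not state the cost scheme or the code alphabet actually used, so the function names, the parameter k, and the sample code strings below are hypothetical and serve only to show how code strings might be compared and ranked during coarse classification.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (an assumed cost scheme)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def coarse_classify(input_code: str, templates: dict[str, str], k: int = 3) -> list[str]:
    """Return the k template labels whose code strings are closest to input_code."""
    ranked = sorted(templates,
                    key=lambda label: edit_distance(input_code, templates[label]))
    return ranked[:k]


# Hypothetical code strings: each letter stands for one encoded skeleton feature.
templates = {"A": "HVH", "B": "HVVH", "C": "DDDD"}
candidates = coarse_classify("HVH", templates, k=2)
```

A smaller distance indicates a closer match, so the candidate list is ordered from most to least similar template. Weighted costs for particular feature substitutions would be a natural variation, but nothing in the abstract confirms that choice.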

Keywords: optical character recognition (OCR), coarse classification, pseudo skeleton, projection histogram, edit distance algorithm
