JISE




Journal of Information Science and Engineering, Vol. 6 No. 2, pp. 101-116


On Segmentation and Recognition of Connected Spoken Digits Based on a Neural Network Model


Chung-Hsien Wu, Jhing-Fa Wang, Ruey-Ching Shyu and Jau-Yien Lee

Department of Electrical Engineering 
National Cheng Kung University 
Tainan, Taiwan, Republic of China


    In this paper, an automatic connected-digit segmentation and recognition system based on a neural network model is proposed. A backpropagation learning algorithm is employed to train the networks. The main new idea for segmentation is to classify the energy, spectral and pitch-period transitions within a data window to determine the boundaries between syllables. These feature transitions are used as the input patterns for training the segmentation network. The segmented syllables are then used as the basic units in the training and recognition process. In speaker-independent segmentation experiments, ten digits (0-9) and syllables spoken in Mandarin are used as test patterns, while only the ten digits are used in speaker-dependent recognition experiments. At an average speaking rate of 160 digits per minute, a coincidence rate of 95.7% and a recognition rate of 97.2% can be achieved.
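
    As an illustration only, the idea of feeding feature transitions within a data window to a segmentation network can be sketched as follows. The abstract does not specify the frame length, window size, or exact feature definitions, so the function names (frame_log_energies, transition_pattern), the non-overlapping framing, and the use of simple frame-to-frame differences are all assumptions rather than the authors' method. Only the energy track is shown; spectral and pitch-period tracks would presumably be handled in a similar way.

```python
import math


def frame_log_energies(samples, frame_len):
    """Split samples into non-overlapping frames and return the log energy of each.

    Hypothetical helper: the frame length and log scaling are assumptions.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame)
        energies.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return energies


def transition_pattern(features, center, half_width):
    """Return frame-to-frame differences of one feature track inside a
    window around `center`. Such a vector could serve as one input
    pattern for a boundary/non-boundary classifier (assumed design).
    """
    lo = max(center - half_width, 0)
    hi = min(center + half_width, len(features) - 1)
    return [features[i + 1] - features[i] for i in range(lo, hi)]


# Toy signal: a quiet stretch followed by a louder one.
signal = [0.01] * 40 + [1.0] * 40
energies = frame_log_energies(signal, frame_len=10)
pattern = transition_pattern(energies, center=4, half_width=2)
```

    In this toy case the largest value in pattern falls at the frame where the quiet stretch ends, which is the kind of transition a trained segmentation network would be expected to associate with a syllable boundary.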


Keywords: segmentation, recognition, neural network, backpropagation
