JISE

In this paper, the harmonic plus noise model (HNM) is enhanced and used to design a scheme for synthesizing a Mandarin Chinese singing voice. The enhancements include Lagrange-interpolation-based estimation of the spectral envelope, piecewise linear mapping of time axes, fixed-pace placement of control points, and other modifications for analyzing HNM parameters and for efficient execution. Based on these enhancements and the signal-synthesis equations rewritten here, a Mandarin singing-voice synthesis system is built. In the system, each Mandarin syllable is recorded just once for HNM parameter analysis. The HNM parameters of a source syllable are then used to synthesize singing syllables of diverse pitches and durations. The system can parse a song score file and synthesize the signals of its lyric syllables in real time. The singing skill of portamento (pitch gliding) is also implemented. According to the perception tests, the signals synthesized by our system are consistent in timbre, free of reverberation, and much clearer than those synthesized by a PSOLA (pitch synchronous overlap add) based scheme.
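As an illustration of the piecewise linear mapping of time axes mentioned above, the following is a minimal sketch and not the paper's actual implementation. It assumes that a synthesized syllable and its source syllable are each divided by matching breakpoints, and that a time on the synthesized syllable is mapped back to the source by linear interpolation within each segment. The function name `map_time`, the breakpoint lists, and the millisecond values are hypothetical.

```python
from bisect import bisect_right


def map_time(t_target, target_knots, source_knots):
    """Map a time on the synthesized syllable to a time on the source syllable.

    target_knots and source_knots are hypothetical breakpoint lists of equal
    length; target_knots must be strictly increasing.
    """
    if len(target_knots) != len(source_knots) or len(target_knots) < 2:
        raise ValueError("need at least two matching breakpoints")
    # Clamp times that fall outside the covered range.
    if t_target <= target_knots[0]:
        return source_knots[0]
    if t_target >= target_knots[-1]:
        return source_knots[-1]
    # Find the segment [t0, t1) that contains t_target.
    i = bisect_right(target_knots, t_target) - 1
    t0, t1 = target_knots[i], target_knots[i + 1]
    s0, s1 = source_knots[i], source_knots[i + 1]
    # Linear interpolation inside the segment.
    return s0 + (t_target - t0) * (s1 - s0) / (t1 - t0)


# Assumed example: a 300 ms source syllable stretched to 600 ms, with the
# first and last 50 ms kept at their original length (times in ms).
source_ms = [0, 50, 250, 300]
target_ms = [0, 50, 550, 600]
mapped = [map_time(t, target_ms, source_ms) for t in (25, 300, 575)]
```

In this assumed example only the middle segment is stretched, so the onset and the ending of the syllable keep their original timing while the sustained part is lengthened to fit the note duration.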