Dissertation

Voice Conversion Algorithm Based on an Artificial Neural Network Model

Author: ChenZhi
Advisor: ZhangLingHua
School: Nanjing University of Posts and Telecommunications
Course: Signal and Information Processing
Keywords: Voice conversion; Spectral envelope conversion; Pitch frequency conversion; Artificial neural network model; Quantum genetic algorithm
CLC: TN912.34
Type: Master's thesis
Year: 2011


Voice conversion is a technology that transforms the voice personality characteristics of one speaker (the source speaker) into those of another speaker (the target speaker) while keeping the speech content and emotional features unchanged. After conversion, the speech sounds like the target speaker's voice, while the content and emotional characteristics of the source speaker's speech are preserved. The technology has important theoretical value and good application prospects. This thesis focuses on two key technologies in voice conversion: fundamental frequency (pitch) trajectory conversion and spectral envelope parameter conversion. The main work and innovations are as follows.
(1) An experimental comparison of existing fundamental frequency trajectory conversion algorithms shows that current algorithms use a simple linear transformation, whereas the relationship between the fundamental frequency trajectories of two speakers is in fact nonlinear. To address this shortcoming of the traditional (linear transform) algorithm, an RBF neural network algorithm for fundamental frequency trajectory conversion is proposed. The fundamental frequency trajectory data are divided into segments of equal length, which are then modeled separately to find the mapping rules between source and target and thereby convert the fundamental frequency trajectory. Subjective and objective tests show that the algorithm not only improves the accuracy of the converted parameters but also enhances the naturalness of the synthesized speech.
(2) Traditional voice conversion algorithms separate the segmental information parameters from the suprasegmental information parameters, convert them independently, and finally synthesize the converted speech from the results. However, a growing number of studies show that these parameters are strongly correlated and that one parameter can be extracted from another, so converting them independently is bound to break the links between them and degrade the conversion results. To address this problem, the fundamental frequency parameter and the spectral information parameters are combined into a joint short-term spectral parameter, which serves as the feature parameter for training and conversion. Experiments show that, under the same conditions, the improved algorithm converts better than the traditional algorithm.
(3) Experiments with the traditional neural network voice conversion algorithm show that the key factors affecting the network's conversion performance are the center values of the hidden layer and the weight matrix. Solving for these key factors more accurately is bound to improve the network's conversion performance. Based on this analysis, a quantum genetic algorithm is used to optimize the neural network for voice conversion. Subjective and objective tests show that the improved algorithm both increases the similarity between the converted speech and the target voice and enhances the intelligibility of the synthesized speech.
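The contrast drawn in contribution (1), between a linear fundamental frequency transform and a segment-wise RBF network mapping, can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes time-aligned log-F0 tracks from both speakers, takes a mean/variance transform as the linear baseline, treats each equal-length segment as one training vector (one possible reading of "modeled separately"), and uses randomly chosen centers with least-squares output weights. All function names, parameter values, and the synthetic data are hypothetical.

```python
import numpy as np


def linear_f0_convert(src_logf0, src_stats, tgt_stats):
    """Assumed linear baseline: mean/variance transform of log-F0."""
    mu_s, sigma_s = src_stats
    mu_t, sigma_t = tgt_stats
    return (src_logf0 - mu_s) * (sigma_t / sigma_s) + mu_t


def split_segments(track, seg_len):
    """Cut a 1-D F0 track into non-overlapping segments of equal length.

    Trailing frames that do not fill a whole segment are dropped.
    """
    n = len(track) // seg_len
    return np.asarray(track[: n * seg_len], dtype=float).reshape(n, seg_len)


class RBFNetwork:
    """Gaussian RBF network with fixed centers and least-squares weights."""

    def __init__(self, n_centers=8, width=1.0, seed=0):
        self.n_centers = n_centers
        self.width = width
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Squared distance from every input vector to every center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        H = np.exp(-d2 / (2.0 * self.width ** 2))
        # Append a constant column so the output layer has a bias term.
        return np.hstack([H, np.ones((len(X), 1))])

    def fit(self, X, Y):
        # Centers are picked at random from the training inputs (assumption).
        k = min(self.n_centers, len(X))
        idx = self.rng.choice(len(X), size=k, replace=False)
        self.centers = X[idx]
        # Output weights solve a linear least-squares problem.
        self.weights, *_ = np.linalg.lstsq(self._hidden(X), Y, rcond=None)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.weights


# Synthetic, already time-aligned log-F0 tracks (hypothetical data).
t = np.linspace(0.0, 1.0, 200)
src = 4.8 + 0.2 * np.sin(2 * np.pi * 3 * t)
tgt = 5.3 + 0.3 * np.tanh(4 * (src - 4.8))  # deliberately nonlinear in src

# Segment-wise RBF mapping: each 5-frame segment is one training vector.
X, Y = split_segments(src, 5), split_segments(tgt, 5)
model = RBFNetwork(n_centers=12, width=0.3).fit(X, Y)
rbf_rmse = np.sqrt(np.mean((model.predict(X) - Y) ** 2))

# Linear baseline applied frame by frame to the same track.
lin = linear_f0_convert(src, (src.mean(), src.std()), (tgt.mean(), tgt.std()))
lin_rmse = np.sqrt(np.mean((lin - tgt) ** 2))
print(f"training RMSE  linear={lin_rmse:.4f}  RBF={rbf_rmse:.4f}")
```

The printed errors are training errors on synthetic data and say nothing about the results reported in the thesis. In the setting of contribution (3), the randomly chosen centers and the least-squares weights are the quantities that an optimizer such as a quantum genetic algorithm would search over instead.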

