Research on DBN-Based Continuous Speech Recognition

Author: XueXiaoYan
Tutor: ZhangLianHai
School: PLA Information Engineering University
Course: Military Intelligence
Keywords: Continuous speech recognition; Dynamic Bayesian networks; Sub-phone DBN model; Sub-phone DBN model with a modified control layer; Triphone DBN model; Discrete noise variables; Hidden Markov model
CLC: TN912.34
Type: Master's thesis
Year: 2010
Downloads: 95
Citations: 0

Abstract


The hidden Markov model (HMM) is a simple and effective statistical model that has been used successfully in speech recognition. However, its modeling assumptions do not match the actual properties of speech, so it has difficulty describing the dynamic characteristics of speech. Dynamic Bayesian networks (DBNs) are easy to interpret, extend, and factorize, have strong inference and learning capabilities, and can model the temporal structure of speech, so they describe the dynamic nature of speech better. Building on a study of DBN inference and learning methods, this thesis proposes four improved DBN-based models for continuous speech recognition. The specific results are as follows.

(1) In the phone DBN model, the phone units are coarse-grained and poorly discriminated, which leads to many insertion errors in recognition. To address this, the thesis proposes a sub-phone DBN model. First, each phone unit is subdivided into sub-phones, and a sub-phone variable and a sub-phone transition variable are added to the phone DBN model; then the dependencies among the variables are determined by analysis. The sub-phone DBN model can therefore describe the multi-level structure of the speech chain in greater detail and model the dynamic variation within each unit more accurately. Experimental results show that, compared with the phone DBN model, the sub-phone model improves both the correct rate and the accuracy of the continuous speech recognition system.

(2) Building the decision tree for the word transition variable is complex, and the model adapts poorly to changes in vocabulary. To address this, the thesis presents a sub-phone DBN model with a modified control layer for large-vocabulary speech recognition. When the dictionary is built, an end marker is set for each word, and the parent nodes of the word transition variable in the model structure are changed. The end marker removes the differences caused by words containing different numbers of phones, which reduces the complexity of building the decision tree and correspondingly shortens the time needed to read parameters during training and recognition. Experimental results show that the model speeds up training and recognition to some extent without reducing recognition performance.

(3) Coarticulation is prevalent in continuous speech. To model it, the thesis proposes a novel triphone DBN model. Building on the sub-phone DBN model with a modified control layer, the model introduces variables for the preceding and following phones, so that contextual correlation in speech can be described well. Because the number of distinct triphones is large, triphones are clustered with a decision-tree method based on pronunciation features, so that each triphone obtains robust parameter estimates. Experimental results show that the model improves the performance of large-vocabulary continuous speech recognition.

(4) A mismatch between the training environment and the recognition environment degrades model performance. To address this, the thesis proposes a DBN model with a discrete noise variable. The model introduces a discrete noise variable into the DBN framework and uses this hidden variable to classify, during training, a mixed speech training set that contains different signal-to-noise ratios. Experimental results show that the model effectively improves the robustness and adaptability of the DBN model in environments with different signal-to-noise ratios, and improves recognition performance when trained on the mixed training set.
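
The role of the per-word end marker in result (2) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the marker string `END`, the fixed count of three sub-phones per phone, and the function names `build_lexicon`, `step`, and `frames_to_word_end` are all hypothetical choices made for illustration. The sketch only shows the general idea that, once every dictionary entry ends with the same marker, the word transition can be decided by checking for that marker instead of depending on how many phones each word contains.

```python
END = "<end>"  # hypothetical end marker appended to every dictionary entry


def build_lexicon(pronunciations):
    """Return a lexicon in which every phone sequence ends with END."""
    return {word: list(phones) + [END] for word, phones in pronunciations.items()}


def step(lexicon, word, phone_pos, sub_pos, sub_transition, subs_per_phone=3):
    """Advance the deterministic control variables by one frame.

    Returns (phone_pos, sub_pos, phone_transition, word_transition).
    subs_per_phone=3 is an assumption of this sketch.
    """
    if not sub_transition:
        # No sub-phone transition: every counter stays where it is.
        return phone_pos, sub_pos, False, False
    if sub_pos < subs_per_phone - 1:
        # Move to the next sub-phone inside the same phone.
        return phone_pos, sub_pos + 1, False, False
    # Leaving the last sub-phone triggers a phone transition.
    next_pos = phone_pos + 1
    # The word transition depends only on whether the next symbol is END,
    # not on the number of phones in the word.
    word_transition = lexicon[word][next_pos] == END
    return next_pos, 0, True, word_transition


def frames_to_word_end(lexicon, word, sub_transitions):
    """Return the index of the frame at which the word transition fires, or None."""
    phone_pos, sub_pos = 0, 0
    for t, sub_transition in enumerate(sub_transitions):
        phone_pos, sub_pos, _, word_transition = step(
            lexicon, word, phone_pos, sub_pos, sub_transition
        )
        if word_transition:
            return t
    return None


lexicon = build_lexicon({"a": ["ah"], "cat": ["k", "ae", "t"]})
short_word_end = frames_to_word_end(lexicon, "a", [True] * 3)
long_word_end = frames_to_word_end(lexicon, "cat", [True] * 9)
```

In this sketch the same end-of-word test serves a one-phone word and a three-phone word, which is the kind of uniformity the abstract credits with simplifying the decision tree for the word transition variable.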

Related Dissertations

  1. Packet Loss Recovering Technology for Speech Transmission over Network,TN912.3
  2. Research on Domain Entity Attribute and Event Extraction Technology,TP391.1
  3. Multi-threaded fusion soccer video semantic analysis and event detection,TP391.41
  4. Chinese Speech Synthesis System Improvement and Implementation,TN912.33
  5. Research on Characteristic Analysis and Recognition Algorithm of Heart Sound Signal,R318.04
  6. Extended Hidden Markov Models and Parameter Estimation Based on Genetic Algorithm,O211.62
  7. Research of Multi-Sensory Myoelectric Prosthetic Hand with Hardness and Thermal Conductivity,TP242
  8. A Research on Chinese Word Segmentation Based on Phonetic Annotation,TP391.1
  9. Research on the Key Technologies of Speech Recognition for Robot Communication,TN912.34
  10. The LVCSR system based on adaptive methods of semi-supervised learning,TN912.34
  11. Some Strong Laws for Markov Chain Fields Indexed by a Nonhomogeneous Tree of Module M,O211.62
  12. Research on Automatic Notation of Word for Tibetan Corpus Based on HMM,H214
  13. Research on Information Awareness Technology Oriented to Cognitive Networks,TP393.02
  14. Research and Implementation on Community Discovery from Network Based on Data Mining,TP393.094
  15. Statistical Image Modeling and Image Segmentation in Contourlet Domain,TP391.41
  16. Event Detection Modeling and Optimization in Intelligent Video Surveillance,TP391.41
  17. Research on Virtual Human Motion Synthesis Techniques and Engineering Application,TP391.41
  18. Research of Segmentation Based Chinese Continuous Speech Recognition Technology,TN912.34
  19. Prediction of Stock Price Based on Hidden Markov Model,F830.91
  20. The Research of Compliance Testing Technology of Traffic Terminology and Standards,TP391.1

© 2012 www.DissertationTopic.Net