
Research on the Use of Tone Information in Mandarin Automatic Speech Recognition Systems

Author: QiangZuo
Advisors: XuCongFu; QianYao; SongGePing
School: Zhejiang University
Major: Computer Applications
Keywords: Speech recognition; Mandarin; Tone modeling; MSD; Two-pass decoding
CLC: TP391.42
Type: Master's thesis
Year: 2007

Abstract


In Chinese, tone is a key piece of information for distinguishing word meanings and for determining words in context. A streamlined tone model incorporated into the acoustic model can significantly improve the performance of a modern speech recognition system. However, because of certain properties of the fundamental frequency (F0), the feature used to characterize tone, an ordinary HMM cannot be applied directly to F0 modeling. Two properties of F0 matter most for tone modeling: 1) F0 can be detected only in voiced segments; it does not exist in unvoiced or silent segments. Because the F0 contour is therefore discontinuous, with values present in voiced segments and absent in unvoiced and silent ones, an ordinary continuous-density hidden Markov model cannot model tone. 2) Tone is expressed by the trend of F0 over time. F0 parameters extracted over a short-time analysis window cannot capture this variation, so a model that can represent long-span F0 behavior is required. This thesis proposes a solution to each of the two problems: 1) Multi-space distribution (MSD) is applied to tone modeling. Traditional F0 interpolation methods must make distributional assumptions about F0 in unvoiced and silent segments and create artificial F0 values there. MSD needs no such approximation, which makes it well suited to real-time tone modeling. The MSD model used here assumes two possible spaces: a discrete symbol represents F0 in unvoiced and silent segments, a Gaussian probability density function (pdf) characterizes F0 in voiced segments, and a probability weight is assigned to each of the two spaces.
2) A two-pass search framework is used to rescore the recognition results and improve the recognition accuracy of tonal syllables. In the first pass, decoding with the MSD-HMM produces a lattice, a compact representation of the search space, and the information obtained in this pass is used to compute long-span fundamental frequency (F0) parameters. In the second pass, these parameters are fed into a long-span tone model to compute tone scores, which are then used to reorder the first-pass recognition results. Two baseline systems are used for comparison: one whose features are MFCC parameters only, and one that adds F0 parameters obtained with the IBM interpolation approach. Experimental results show that the recognition accuracy of the MSD method is significantly higher than that of both baselines. In addition, in tonal syllable recognition experiments, the two-pass decoding method reduces the error rate by an additional 8.7%.
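The two ideas in the abstract above can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes a single Gaussian for the voiced space, and it reorders a flat list of hypotheses rather than a lattice. The function names `msd_log_likelihood` and `rescore`, the linear score combination with `tone_weight`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Optional, Tuple


def msd_log_likelihood(f0: Optional[float], w_voiced: float,
                       mean: float, var: float) -> float:
    """Log output probability of one F0 observation in a two-space MSD state.

    f0 is None for an unvoiced or silent frame, otherwise the F0 value.
    w_voiced is the weight of the voiced space; the other space gets 1 - w_voiced.
    """
    if not 0.0 < w_voiced < 1.0 or var <= 0.0:
        raise ValueError("need 0 < w_voiced < 1 and var > 0")
    if f0 is None:
        # Zero-dimensional space: the observation is a discrete symbol,
        # so its probability is just the space weight.
        return math.log(1.0 - w_voiced)
    # One-dimensional space: space weight times a Gaussian density.
    log_gauss = -0.5 * (math.log(2.0 * math.pi * var) + (f0 - mean) ** 2 / var)
    return math.log(w_voiced) + log_gauss


def rescore(hyps: List[Tuple[str, float, float]],
            tone_weight: float) -> List[Tuple[str, float]]:
    """Reorder first-pass hypotheses by combined score, best first.

    Each hypothesis is (label, first_pass_log_score, tone_log_score).
    The weighted sum is an assumed combination rule.
    """
    combined = [(label, first + tone_weight * tone) for label, first, tone in hyps]
    return sorted(combined, key=lambda item: item[1], reverse=True)


# Hypothetical numbers, for illustration only.
frame_scores = [msd_log_likelihood(f, 0.8, 200.0, 400.0) for f in (None, 195.0, 210.0)]
ranking = rescore([("ma1", -10.0, -5.0), ("ma3", -10.5, -1.0)], tone_weight=0.5)
```

Here an unvoiced frame is scored without inventing an F0 value for it, which is the point of contrast with interpolation-based baselines. In the rescoring step, a hypothesis that is slightly worse after the first pass can move to the top once its tone score is added.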

Related Dissertations

  1. Study on the Cultivation of Core Competences in MSD Pharmaceutical Corporation,F426.72
  2. Appear in the Middle of the Northeast Dialect of Consonants "(?)" Research,H55
  3. The acoustic fusion part of speech information modeling and research,TN912.34
  4. A Comparative Study of Southwestern Mandarin Phonetics at the Turn of Sichuan and Shaan’xi Province,H17
  5. Baoying idiom words,H17
  6. \,H114
  7. Topic Classification of Speech Documents Based on the Word Fragment Network,TN912.3
  8. Research and Implementation of embedded speech recognition system,TN912.34
  9. Study on Mandarin Duck and Butterfly School in the View of Modern Media,I207.42
  10. The Horse Pass County Mie Factory Country Dialect Investigates and Study,H17
  11. Analysis of the Pronunciation Characteristics of Tibetan Putonghua (Weizang Dialect Area),H102
  12. Analysis for the Characteristics of Yi-Han Chinese Communicative Interlanguage Phonetics,H217
  13. Anxiang Phonetic Study,H17
  14. Study of Keyword Spotting System Based on Lattice,TN912.34
  15. Experimental Studies on Mandarin Speakers’ Knowledge of the Sonority Sequencing Principle--Evidence from Speech Perception and Segmentation,H0-05
  16. Comparisons on Acoustic Features of Chinese Initial Consonants Produced by Chinese and Japanese,H116.1
  17. "Standard Spoken Chinese Variant Pronunciation Word Table of Variant Pronunciations" Pronunciation Inspection,H102
  18. Research on Test Paper Design of the Chinese Mandarin Language Test System,H102
  19. Han Burman Tetrasyllabic Comparison of the Words,H4
  20. Tianjin students' English vowels pronunciation feature case studies,H319
  21. Extraction, analysis and purification of citrus peel oil,R284.1

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer applications > Information processing > Pattern recognition and devices > Voice recognition devices
© 2012 www.DissertationTopic.Net