
Research on Mandarin Speech Retrieval Technique Based on Confusion Network

Author: HuangXiangSong
Advisor: ZhaoChunZuo
School: Harbin Engineering University
Major: Signal and Information Processing
Keywords: speech retrieval; confusion network; lattice; tone recognition; prosody detection
CLC: TN912.3
Type: PhD thesis
Year: 2010

Abstract


With the rapid development of the Internet and multimedia technology, massive numbers of audio files appear every day, and how to retrieve, process, and classify the information in these speech documents efficiently has become a hot topic in the field. Current research on speech retrieval is based mainly on the theory of statistical pattern recognition and signal processing, and considers continuous speech at two levels: the acoustic level and the language level. The lattice, an emerging structure for information retrieval, describes exactly these two levels. It retains multiple recognition candidates produced during conversion to text, which makes it especially suitable for task-independent retrieval of speech files. The confusion network obtained by pruning a lattice is more compact in structure and improves recognition accuracy. Using lattices as the input to a speech retrieval system is therefore very promising, and lattice-based and confusion-network-based speech retrieval is attracting more and more attention. Lattice creation and indexing, together with the search strategy used in the query phase, are the two main components of such a retrieval technique.

This thesis first studies confusion network generation for speech retrieval, the search strategy used during retrieval, and the calculation of confidence. It then focuses on how to further enrich the acoustic-level and language-level information in the confusion network, proposing an additional tone model on the acoustic-model side and an additional prosody model on the language-model side. The work covers the following aspects.

First, because continuous speech segmentation performs poorly in low-SNR environments, a voting-based selection mechanism for continuous speech segmentation is proposed. The method votes among several different segmentation results in order to improve the accuracy of speech segmentation.
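The abstract does not say how the votes are cast or counted, so the following is only a minimal sketch of one plausible voting scheme, not the thesis's algorithm. It assumes each segmenter outputs a list of boundary times in seconds; the function name `vote_boundaries` and the parameters `tolerance` and `min_votes` are hypothetical.

```python
from typing import List


def vote_boundaries(candidates: List[List[float]],
                    tolerance: float = 0.02,
                    min_votes: int = 2) -> List[float]:
    """Merge boundary times (seconds) proposed by several segmenters.

    Assumed scheme: a boundary joins the current cluster when it lies
    within `tolerance` of the cluster's latest boundary. A cluster is
    kept when at least `min_votes` distinct segmenters contributed to
    it, and is reported as the mean of its boundary times.
    """
    # Tag every boundary with the index of the segmenter that proposed it.
    tagged = sorted((t, i) for i, times in enumerate(candidates) for t in times)

    clusters: List[List[tuple]] = []
    for t, i in tagged:
        if clusters and t - clusters[-1][-1][0] <= tolerance:
            clusters[-1].append((t, i))
        else:
            clusters.append([(t, i)])

    result = []
    for cluster in clusters:
        voters = {i for _, i in cluster}
        if len(voters) >= min_votes:
            result.append(sum(t for t, _ in cluster) / len(cluster))
    return result


# Three hypothetical segmenters; only boundaries with two or more votes survive.
boundaries = vote_boundaries([[0.50, 1.20], [0.51, 1.80], [0.49, 1.21]])
```

In this example the isolated boundary at 1.80 s is dropped because only one segmenter proposed it, which is the kind of outlier a vote is meant to suppress under noisy conditions.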
Experimental results show that, for low-SNR speech signals, the segmentation produced by the method is closer to the human-annotated segmentation.

Second, for the lattice-based structure used in speech retrieval, a hub-path-based confusion network generation method is proposed. Without lowering retrieval performance, it makes the lattice structure more compact, reduces the index size, and allows richer additional information to be attached. For the search strategy, an improved DMLS method is proposed that uses the minimum edit distance during retrieval to compensate for the insertion, deletion, and substitution errors of the syllable recognizer. In addition, for the calculation of confidence in speech retrieval, a mutual-information measure is proposed as a confidence score and combined with the posterior probability to obtain a new confidence measure. Simulation experiments verify the effectiveness of the proposed methods.

Third, to make the information in the confusion network more comprehensive and thereby improve the overall performance of the speech retrieval system, a tone model is integrated into the confusion network. The tone nucleus is used in place of the full syllable for tone feature extraction, and on that basis a tone-nucleus-based MSD-HMM tone model is built. This model is combined with the original acoustic model of the confusion network, and speech retrieval experiments are carried out with the language model unchanged. Simulation results confirm that tone features are effective as auxiliary information for speech retrieval.

Finally, prosodic feature information is attached to the confusion network to improve speech retrieval performance. Prosodic event detection is studied first, using acoustic, lexical, and syntactic features respectively. The resulting prosody model is then fused with the acoustic model and language model of the existing confusion network.
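The minimum edit distance mentioned above for the improved DMLS search can be illustrated with the standard Levenshtein recurrence over syllable sequences. This is a sketch under stated assumptions rather than the thesis's method: it uses unit costs for every insertion, deletion, and substitution, whereas the abstract does not give the cost scheme; the names `edit_distance`, `matches`, and `max_errors` are hypothetical.

```python
from typing import Sequence


def edit_distance(query: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two syllable sequences (unit costs)."""
    # prev[j] = distance between the query prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, q in enumerate(query, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if q == h else 1
            curr.append(min(
                prev[j] + 1,         # query syllable missing from hyp (recognizer deletion)
                curr[j - 1] + 1,     # extra syllable in hyp (recognizer insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def matches(query: Sequence[str], hyp: Sequence[str], max_errors: int = 1) -> bool:
    """Accept a hypothesis whose distance to the query is within max_errors."""
    return edit_distance(query, hyp) <= max_errors


# The recognizer dropped the middle syllable; one error is still tolerated.
hit = matches(["ha", "er", "bin"], ["ha", "bin"])
```

Allowing a small number of edits lets a query still match when the syllable recognizer has inserted, deleted, or substituted a syllable, at the price of more false alarms as `max_errors` grows.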
Speech retrieval simulation experiments show that the additional prosodic event features help to improve retrieval performance.

In summary, this thesis addresses confusion-network-based retrieval of continuous Chinese speech, improving mainly confusion network generation and the search strategy of the retrieval stage: it proposes a hub-path-based confusion network generation algorithm and a retrieval method based on improved DMLS. In addition, extra feature information is attached to the acoustic model and the language model of the confusion network to improve speech retrieval performance: tone information is incorporated on the acoustic-model side, and prosody information on the language-model side. The results show that the proposed methods improve the performance of speech document retrieval.
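For readers unfamiliar with the structure, a confusion network can be pictured as a sequence of slots, each holding competing syllable candidates with posterior probabilities. The sketch below is an illustrative assumption, not the thesis's index or search procedure: it ignores empty (epsilon) arcs, requires an exact match in consecutive slots, and scores a hit as the product of the posteriors. The names `Slot` and `search` and the example posteriors are hypothetical.

```python
from typing import Dict, List, Tuple

# One slot maps each candidate syllable to its posterior probability.
Slot = Dict[str, float]


def search(network: List[Slot], query: List[str]) -> List[Tuple[int, float]]:
    """Return (start_slot, score) for every exact occurrence of the query."""
    if not query:
        return []
    hits = []
    for start in range(len(network) - len(query) + 1):
        score = 1.0
        for offset, syllable in enumerate(query):
            # A syllable absent from the slot contributes probability 0.
            score *= network[start + offset].get(syllable, 0.0)
            if score == 0.0:
                break
        if score > 0.0:
            hits.append((start, score))
    return hits


# Made-up posteriors for a three-slot network.
network = [
    {"ha": 0.9, "hua": 0.1},
    {"er": 0.6, "e": 0.4},
    {"bin": 0.8, "bing": 0.2},
]
hits = search(network, ["er", "bin"])
```

Because every slot keeps its lower-ranked candidates, a query can still be found when the top-ranked recognition result is wrong, which is the property that makes this structure attractive as retrieval input.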
