Research on Word Alignment Based on Statistics and Linguistics and Correlation Fusion Strategy

Author: QuXiaoHang
Advisor: LiuTing
School: Harbin Institute of Technology
Major: Computer Science and Technology
Keywords: statistical method; word alignment; linguistic knowledge; multiple classifier combination
CLC: TP391.2
Type: Master's thesis
Year: 2008


The rapid growth of the Internet and of the information available worldwide has created great demand for understanding and disseminating content across languages. Against this backdrop, machine translation, a classic research topic, has gained new room for development. Word alignment, an intermediate result of statistical machine translation, plays an important role in machine translation. It is also widely applied in other natural language processing tasks, such as word sense disambiguation and translation lexicon construction.

Traditionally, statistical word alignment requires a large corpus. How to cope with data sparseness so as to improve alignment on small corpora is one of the active topics in word alignment research. This thesis proposes a method that combines statistical and linguistic knowledge to address this problem.

We adopt the classic IBM Model as the base model. By incorporating dictionaries, rules, and syntactic structures, with position information and part of speech as constraints, we improve the alignment by adding potentially correct links, deleting potentially erroneous links, and disambiguating uncertain links in which several occurrences of the same word are linked to a single word. Experiments show that the dictionary-based and syntactic-structure-based methods improve precision and recall, respectively. The rule-based method performs well on both measures and reaches the lowest alignment error rate (AER), 0.2503.

Additionally, we borrow the idea of multiple classifier combination. Treating the alignment models as independent classifiers, we combine them using simple voting and weighted voting strategies. Experiments show that every strategy increases precision compared with the individual alignment classifiers. The weighted voting strategy achieves the highest recall and the lowest AER, increasing recall by 17.22% and decreasing AER by 36.47%.
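
The voting idea and the AER metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes each aligner outputs a set of (source index, target index) links, that a link is kept when the weight share of the aligners proposing it exceeds a threshold, and that AER follows the commonly used definition with sure and possible gold links. The function names, the weights, the threshold, and the toy data are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(alignments, weights=None, threshold=0.5):
    """Combine link sets from several aligners by (weighted) voting.

    alignments: list of sets of (source_index, target_index) links.
    weights:    one weight per aligner; None means simple (equal) voting.
    A link is kept when its share of the total weight exceeds `threshold`.
    """
    if weights is None:
        weights = [1.0] * len(alignments)  # simple voting
    total = sum(weights)
    score = defaultdict(float)
    for links, weight in zip(alignments, weights):
        for link in links:
            score[link] += weight
    return {link for link, s in score.items() if s / total > threshold}


def aer(predicted, sure, possible=frozenset()):
    """Alignment error rate: 1 - (|A & S| + |A & P|) / (|A| + |S|)."""
    possible = set(possible) | set(sure)  # sure links also count as possible
    if not predicted and not sure:
        return 0.0
    hits = len(predicted & sure) + len(predicted & possible)
    return 1.0 - hits / (len(predicted) + len(sure))


# Toy data (invented for illustration): three aligners, one sentence pair.
a1 = {(0, 0), (1, 1), (2, 3)}
a2 = {(0, 0), (1, 2), (2, 3)}
a3 = {(0, 0), (1, 1), (2, 2)}
sure = {(0, 0), (1, 2), (2, 3)}
possible = {(2, 2)}

simple = weighted_vote([a1, a2, a3])
# Hypothetical weights, e.g. reflecting each aligner's held-out accuracy.
weighted = weighted_vote([a1, a2, a3], weights=[1, 3, 1])

print(sorted(simple), aer(simple, sure, possible))
print(sorted(weighted), aer(weighted, sure, possible))
```

With equal weights the majority link wins for every source word. With unequal weights a single trusted aligner can overrule the other two, which is why the choice of weights matters. The abstract does not state how the thesis derives its weights.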
