Research on Filter Algorithms for Approximate String Matching

Author: SunDeCai
Advisor: SunXingMing
School: Hunan University
Major: Computer Science and Technology
Keywords: Approximate string matching; Filtering algorithms; q-gram index; Index compression; Matching-region features; Filter criteria
CLC: TP391.1
Type: Master's thesis
Year: 2009

Abstract


Approximate string matching is a fundamental problem in computer science with a wide range of applications in fields such as information retrieval, computational biology, and pattern recognition. Research on fast, accurate, low-overhead approximate string matching algorithms helps advance these fields. The q-gram index is language-independent and highly fault-tolerant, which makes it suitable for Chinese text processing. Filtering algorithms use a filter criterion to quickly discard text fragments that cannot contain a match, which makes them suitable for searching large text collections. The two are often combined, and the resulting q-gram filtering algorithm is widely used because it is simple and fast.

This thesis studies approximate string matching over Chinese corpora and aims to speed up q-gram filtering, focusing on three aspects: the Chinese index structure, index optimization, and the mining of matching-region features. For approximate string matching over Chinese corpora, a two-level hash index structure for Chinese bigrams is proposed. The index uses a hash function to map every character in the Chinese GB2312 encoding table into a contiguous one-dimensional integer space and stores Chinese bigram entries in a two-level layout. Index optimization is applied to improve indexing speed and reduce index space. A linked-list memory management scheme manages memory allocation for the address lists, which improves memory utilization. Index compression reduces the memory occupied by the index; experiments comparing several compression algorithms yield an address-list triple compression method suited to the Chinese bigram index.

To speed up matching by improving filtration efficiency, a filtering algorithm based on matching-region features is proposed.
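
The character mapping and two-level bigram layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the hash function, so the sketch assumes the usual row/cell linearization of GB2312 (94 rows by 94 cells), and the names `gb2312_code`, `build_bigram_index`, and `lookup` are hypothetical. The second level is a plain dictionary here, whereas the thesis describes its own linked-list address-list management.

```python
GB2312_ROWS = 94
GB2312_CELLS = 94
TABLE_SIZE = GB2312_ROWS * GB2312_CELLS  # 8836 code points in the GB2312 grid


def gb2312_code(ch):
    """Map one GB2312 character to an integer in [0, TABLE_SIZE).

    Assumed mapping: both bytes of a GB2312 character lie in 0xA1..0xFE,
    so (row, cell) can be linearized as row * 94 + cell.
    """
    raw = ch.encode("gb2312")
    if len(raw) != 2:
        raise ValueError("not a two-byte GB2312 character: %r" % ch)
    return (raw[0] - 0xA1) * GB2312_CELLS + (raw[1] - 0xA1)


def build_bigram_index(text):
    """Build a two-level index: first character -> second character -> positions."""
    first_level = [None] * TABLE_SIZE  # directly addressed by the first character
    for pos in range(len(text) - 1):
        a = gb2312_code(text[pos])
        b = gb2312_code(text[pos + 1])
        if first_level[a] is None:
            first_level[a] = {}  # second level is created on demand
        first_level[a].setdefault(b, []).append(pos)
    return first_level


def lookup(index, bigram):
    """Return the list of text positions where the two-character bigram starts."""
    second_level = index[gb2312_code(bigram[0])]
    if second_level is None:
        return []
    return second_level.get(gb2312_code(bigram[1]), [])
```

For example, `lookup(build_bigram_index("中国人民中国"), "中国")` returns `[0, 4]`. Addressing the first level directly by character code avoids hash collisions for the first character; whether the thesis makes the same choice is not stated in the abstract.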
The proposed algorithm divides the pattern string and the text string into fixed-length logical blocks and extracts a new matching-region feature from each block. It uses this feature to tighten the basic filter criterion, which improves filtration efficiency, and it also improves the block-based scheme for determining the filtration region. Experimental results show that at low error rates the new algorithm performs significantly better than the previous algorithm, so it has good application prospects in approximate string matching systems that require a low error rate.
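
The abstract does not define the basic filter criterion or the new block feature. As background, the sketch below shows the classical q-gram count filter, which is an assumption about what "basic filter criterion" refers to and is not the thesis's block-feature refinement. It relies on the well-known q-gram lemma: a pattern of length m that matches a text substring with at most k edit errors shares at least m - q + 1 - k*q q-grams with it. The function names and the sliding-window layout are hypothetical.

```python
from collections import Counter


def qgrams(s, q):
    """Return all overlapping substrings of length q in s."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def qgram_threshold(m, q, k):
    """Minimum number of shared q-grams required by the q-gram lemma."""
    return m - q + 1 - k * q


def candidate_windows(text, pattern, q, k):
    """Return start offsets of text windows that pass the count filter.

    Each window has length m + k, the longest substring that can match the
    pattern with at most k edits. Windows that fail the filter cannot
    fully contain a match; windows that pass still need verification.
    """
    m = len(pattern)
    threshold = qgram_threshold(m, q, k)
    if threshold <= 0:
        raise ValueError("threshold is not positive; the filter cannot prune")
    pattern_grams = Counter(qgrams(pattern, q))
    width = m + k
    hits = []
    for start in range(max(1, len(text) - width + 1)):
        window_grams = Counter(qgrams(text[start:start + width], q))
        # Multiset intersection: each q-gram counts at most as often as it
        # occurs in both the pattern and the window.
        shared = sum(min(c, window_grams[g]) for g, c in pattern_grams.items())
        if shared >= threshold:
            hits.append(start)
    return hits
```

With `q = 2` and `k = 1`, the pattern `"approximate"` needs at least 8 shared bigrams, so a text with one substituted character still yields candidate windows, while unrelated text is discarded without running an edit-distance computation. Recounting every window from scratch keeps the sketch short; an index-based filter would look up q-gram positions instead.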

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer applications > Information processing > Text processing