
Research and Application of Information Extraction Based on Query Expansion

Author: WangLi
Advisor: QianPeiDe; ZhuQiaoMing; LiPeiFeng
School: Suzhou University
Course: Applied Computer Technology
Keywords: Query Expansion; Keyword Expansion; LDA Model; Clustering; Topical Information Extraction
CLC: TP391.1
Type: Master's thesis
Year: 2011

Abstract


With the rapid development of the Internet, information on any given topic is growing explosively across many channels. Faced with a large volume of redundant information on a topic, users find it increasingly difficult to reach the information they actually need. How to provide users with comprehensive yet concise information on a specific topic, and how to improve the efficiency of information extraction, have therefore attracted the attention of many researchers. This thesis focuses on the query expansion and information extraction techniques required to extract and aggregate information on the side effects of Traditional Chinese Medicine. The main contributions are as follows.

First, because information extraction depends on comprehensive source information, this thesis proposes a novel topic-related, query-based keyword expansion approach to address the information deficiency of the original query. The method analyzes the feedback pages returned by a query on a specified topic-related keyword, then weights the topic-related keywords using the TF*PSF measure combined with semantic weighting, in order to filter the extracted keywords and so collect information. In addition, the thesis designs an iterative keyword query expansion algorithm and adopts a keyword combination method to improve the overall strategy for collecting topical information from the web.

Second, because web information is noisy, sparse, redundant, and weakly structured, the thesis proposes a topic sentence extraction approach based on reliability calculation. It extracts fine-grained information on the subject, which increases the reliability of the topical information and achieves the goal of information screening. For several sub-topics of a target topic, topic sentences are extracted by a reliability calculation based on the smoothness of the topic-sentence probability distribution. AP (Affinity Propagation) clustering is then applied to eliminate redundant information, and a method is proposed for organizing the topical information in a hierarchical, structured form based on information ratio evaluation.

Finally, the thesis tests retrieval and extraction performance in experiments on the side-effect information of three drugs. The experiments show that the approach achieves good results in this specific application of topical web information extraction.
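
The iterative keyword expansion described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The abstract does not define PSF, so the sketch assumes it is the fraction of feedback pages that contain a term. The semantic weighting and the keyword combination step are omitted, and `search`, `score_terms`, and `expand_query` are hypothetical names introduced only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer; a stand-in for real word segmentation."""
    return re.findall(r"[a-z]+", text.lower())


def score_terms(pages, stopwords=frozenset()):
    """Score candidate terms as TF * PSF over a list of feedback pages.

    ASSUMPTION: PSF is taken to be the fraction of feedback pages that
    contain the term. The abstract does not define it.
    """
    tf = Counter()         # total occurrences of each term
    page_hits = Counter()  # number of pages containing each term
    for page in pages:
        tokens = [t for t in tokenize(page) if t not in stopwords]
        tf.update(tokens)
        page_hits.update(set(tokens))
    total = sum(tf.values()) or 1
    n_pages = len(pages) or 1
    return {t: (tf[t] / total) * (page_hits[t] / n_pages) for t in tf}


def expand_query(seed, search, rounds=2, top_k=2, stopwords=frozenset()):
    """Iteratively grow a keyword list from feedback pages.

    `search` is a hypothetical callable: keyword -> list of page texts.
    Each round queries the keywords found in the previous round and
    keeps the top_k highest-scoring terms that are not yet known.
    """
    keywords = [seed]
    frontier = [seed]
    for _ in range(rounds):
        pages = [p for kw in frontier for p in search(kw)]
        scores = score_terms(pages, stopwords)
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        frontier = [t for t in ranked if t not in keywords][:top_k]
        if not frontier:
            break
        keywords.extend(frontier)
    return keywords
```

Restricting each round to the newly found keywords keeps the number of queries bounded. A fuller version would presumably also filter candidates with a topic-specific semantic weight, as the abstract describes.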

Related Dissertations

  1. Research and Implementation of Mining Implicit User Interest,TP311.13
  2. Establishment and Update of Similar Users’ Cluster in Personalized Information Retrieval,TP391.3
  3. Research on Removal Algorithm of Shadows in Image Segmentation,TP391.41
  4. The Research of the Text Extraction Method Based on Spectral Cut,TP391.41
  5. Research on Query Expansion Technique of Retrieval System in Biomedical Field,TP391.3
  6. Gao Zhong-ying academic thought and experience and use of Bufei Decoction treatment of common diseases of the respiratory system drug law,R249.2
  7. Research and Improvement on K-Means Clustering Algorithm,TP311.13
  8. Research on Peer-to-Peer Traffic Identification Algorithm Based on Cluster Analysis,TP393.02
  9. Research of Scheduling Algorithm Based on Hybrid Adaptive Genetic Algorithm in Computing Grid,TP393.09
  10. Evaluation of Photosynthetic Efficiency of Seedlings of the Hybrid Progenies (F1) in Peach,S662.1
  11. The Load Research and Comprehensive Evaluation on the Agricultural Non-Point Source Pollution in Nantong,X592
  12. BF-FCM Clustering Algorithm and Its Application in the Image Segmentation,TP391.41
  13. The Application of Ant Colony Algorithm in Meteorological Satellite Cloud Pictures Segmentation,TP391.41
  14. Research on Clustering Algorithm Based on Mutation Particle Swarm Optimization,TP18
  15. Research on K-means Optimization Clustering Algorithm,TP311.13
  16. Research on Fuzzy C-Mean Clustering Algorithm Based on Particle Swarm Optimization and Shuffled Frog Leaping Algorithm,TP18
  17. Research on Clustering Algorithm Based on Genetic Algorithm and Rough Set Theory,TP18
  18. Study on Photosynthetic Characteristics of Peach Based on Heterosis of Assimilation Capacity,S662.1
  19. The Research on Routing Protocol of Agricultural Environmental Monitoring System Based on Wireless Sensor Networks,TN915.04
  20. Multilayer structure based WSN routing protocol for heterogeneous clusters,TP212.9
  21. Evolutionary Clustering Algorithm and Its Application,TP311.13

CLC: Industrial Technology > Automation Technology, Computer Technology > Computing Technology, Computer Technology > Computer Applications > Information Processing > Text Processing
© 2012 www.DissertationTopic.Net