
Research and Implementation of a Blog-Oriented Vertical Search Engine

Author: WangJiaJie
Tutor: JinYueHui; ZhaoFang
School: Beijing University of Posts and Telecommunications
Course: Software Engineering
Keywords: Vertical search engine; Web spider; Cache policy; Inverted index
CLC: TP391.3
Type: Master's thesis
Year: 2009
Downloads: 464
Citations: 6

Abstract


As online information grows exponentially, finding information accurately and quickly with traditional search engine technology has become increasingly difficult. Faced with a massive daily increase in data, a general search engine (also called a horizontal search engine) struggles to update its index database in time; faced with hundreds of millions of pages, it struggles to crawl deep information. Because general search engines are neither fast enough nor deep enough, a next-generation search technology, the vertical search engine, came into being. A vertical search engine is a professional search engine for a particular industry. It is a subdivision and extension of the general search engine, and a new service model that responds to the general engine's large volume of information, imprecise queries, and insufficient depth. It provides valuable information and related services for a particular field, a particular group of users, or a particular need. In contrast to the information collection technology of general search engines, the web spider (also known as a web crawler) of a vertical search engine collects only information related to its topic. By predicting and judging the topic of a web page, a specialized web spider avoids crawling large numbers of pages irrelevant to the subject area. Because it captures only topic-relevant pages, a vertical search engine achieves significantly better query accuracy and efficiency.
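The idea of fetching only topic-relevant pages can be illustrated with a minimal sketch. It assumes a simple cosine similarity between term-frequency vectors as the relevance measure and a hypothetical in-memory `pages` dictionary standing in for the web; neither is taken from the thesis, and the names `cosine_relevance`, `focused_crawl`, and `threshold` are illustrative only.

```python
import heapq
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_relevance(text, topic_terms):
    """Cosine similarity between the page's and the topic's term-frequency vectors."""
    page = Counter(tokenize(text))
    topic = Counter(topic_terms)
    dot = sum(page[t] * topic[t] for t in topic)
    norm = math.sqrt(sum(v * v for v in page.values())) * math.sqrt(
        sum(v * v for v in topic.values())
    )
    return dot / norm if norm else 0.0


def focused_crawl(pages, seed, topic_terms, threshold=0.2):
    """Crawl a hypothetical in-memory web {url: (text, [outlinks])}.

    Pages scoring below `threshold` are neither kept nor expanded, so links
    reachable only through off-topic pages are never visited.
    """
    frontier = [(-1.0, seed)]  # max-heap emulated with negated priorities
    seen = {seed}
    kept = []
    while frontier:
        _, url = heapq.heappop(frontier)
        text, links = pages[url]
        score = cosine_relevance(text, topic_terms)
        if score < threshold:
            continue  # off-topic page: skip it and its outlinks
        kept.append(url)
        for link in links:
            if link in pages and link not in seen:
                seen.add(link)
                # Assumption: a link inherits its parent's score as crawl priority.
                heapq.heappush(frontier, (-score, link))
    return kept
```

A real crawler would fetch pages over the network and estimate relevance before downloading (for example from anchor text); this sketch scores the page text directly to keep the example self-contained.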
This thesis introduces the research status and development direction of vertical search engine technology, then focuses on topic-oriented search strategies and algorithms for judging topic relevance. It analyzes how general and vertical search engines differ in system architecture, working principle, key technologies, and other respects, and on that basis designs the core modules of a blog search engine: the indexing and retrieval modules. It then describes the concrete implementation of the blog vertical search engine in detail, and test results verify the search results of the engine designed here. The innovations are: (1) Following the principles of topic-based vertical search crawling, a web crawler named MySpider was developed independently. It supports multi-threaded concurrency and downloads pages efficiently, and its TopicPageRank-based fetching strategy judges how relevant a page is to the topic in order to decide whether to download it. (2) To improve the efficiency of user retrieval, a corresponding index caching strategy was developed. It is hoped that the results of this work offer some help and a useful discussion toward further developing topic-based vertical search engine technology, strengthening topic-based information retrieval, raising the level of information retrieval, and making better use of vast amounts of information.
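How an index cache can sit in front of an inverted index is sketched below. The abstract does not say which caching policy the thesis uses, so least-recently-used (LRU) eviction is assumed here purely for illustration; the class names `InvertedIndex` and `CachedSearcher` and the `capacity` parameter are hypothetical.

```python
from collections import OrderedDict, defaultdict


class InvertedIndex:
    """Maps each term to the set of document ids that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term (AND semantics)."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


class CachedSearcher:
    """Wraps an index with an LRU cache of query results (assumed policy)."""

    def __init__(self, index, capacity=2):
        self.index = index
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def search(self, query):
        # Normalize so that term order does not create separate cache entries.
        key = " ".join(sorted(query.lower().split()))
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        result = frozenset(self.index.search(key))
        self.cache[key] = result
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return result
```

This sketch does not invalidate cached results when new documents are indexed; a production design would need to clear or update affected entries on each index update.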

Related Dissertations

  1. Study on Hadoop-based Inverted Index,TP391.3
  2. WEB topic information gathering technology research,TP391.3
  3. On the Research and Elementary Design of Deep Web Network Spider,TP393.092
  4. The Research and Design on Intelligent Vertical Search Engine,TP391.3
  5. Design and Implementation of the tours evaluation recommendation system based on vertical search engine,TP391.3
  6. Research and Implementation of Query Technology for SVG,TP391.41
  7. Dual index based XML query optimization,TP311.13
  8. Professional search engine key technologies,TP391.3
  9. Research and Implementation of Web Crawler on Vertical Search Engine,TP393.092
  10. Research and Implementation of Focus Crawling Spider Based on A. T. C and Optimized Hyperlink Chosen Strategy,TP311.52
  11. Design and implementation of search engine based on the shape of the icon,TP391.3
  12. Research on Key Techniques of Vertical Search Engine Based on Lucene,TP391.3
  13. Search Engine Research and Implementation,TP391.3
  14. Information retrieval system based on words associated with the degree of,TP391.3
  15. Analysis and Research of Network Video on Demand System Based on P2P Technic,TN948.64
  16. Key Technology Study of P2P-based Search Engine,TP391.3
  17. Research on Full-text Retrieval Technology in Education Resource Sharing System,TP391.3
  18. Ontology-Guided Resource Crawling for Object-Level Vertical Search,TP391.1
  19. The Research of Distributed Search Engine System Based on MPI,TP391.3
  20. Research and implementation of a vertical search engine based on Lucene technology,TP391.3

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer applications > Information processing > Retrieval machine