
Design and Implementation of Information Acquisition System Based on News and Forums

Author: KongLiYuan
Tutor: LiuPeiYu; LiuDengFeng
School: Shandong Normal University
Course: Computer technology
Keywords: Information Acquisition System; Web Crawler; Data processing
CLC: TP274.2
Type: Master's thesis
Year: 2014

Abstract


With the development of the Internet, modern society has entered an era of information explosion: anyone can post any message over the network at any time and from any place. The network has undoubtedly reached into every aspect of our lives. Faced with the complexity of information on the Internet, dealing with this huge amount of data and using it effectively has become a great challenge. As a result, the collection, analysis, publishing, and processing of online information has increasingly become a focus of scholars and institutions at home and abroad, and research on information acquisition systems has great significance and practical value.

Based on a review of the literature, this paper analyzes the present situation and development trend of information acquisition systems and describes their research significance and practical value in detail. It also studies in detail the technologies behind such systems, including web crawlers, proxy server technology, seeds, URL extraction and normalization, regular expressions, and Chinese word segmentation. These are the key technologies of an information acquisition system, and studying them plays an important role in the design of the information acquisition system based on news and forums that is presented here.

The system is a comprehensive information acquisition system for news and forums, developed in the C# programming language. It collects content from Sina News, Tencent News, Sohu News, Netease, the Tianya forum, and the Mop forum. Unlike an information collection system built for a single site, it can collect from multiple sites without affecting acquisition speed or accuracy. Acquisition channels can be added or deleted at any time according to users' needs, which increases the flexibility of the system.
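
The abstract names URL extraction, normalization, and regular expressions as key technologies but gives no details. The following is a minimal sketch of how those steps could fit together, written in Python rather than the thesis's C#. The function names `normalize_url` and `extract_links`, the simplified `href` pattern, the normalization rules, and the `example.com` URLs are all assumptions made for illustration, not the author's implementation.

```python
import re
from urllib.parse import urljoin, urlsplit, urlunsplit

# Simplified pattern for quoted href values (assumption); it stops at '#'
# so fragments are not captured. Real pages may need a proper HTML parser.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\'#]+)', re.IGNORECASE)


def normalize_url(url):
    """Lowercase scheme and host, drop default ports and fragments,
    and use '/' when the path is empty (hypothetical rule set)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    default = (scheme == "http" and port == 80) or (scheme == "https" and port == 443)
    if port and not default:
        host = f"{host}:{port}"
    path = parts.path or "/"
    return urlunsplit((scheme, host, path, parts.query, ""))


def extract_links(html, base_url):
    """Return the distinct, normalized http(s) links found in html, in page order."""
    seen = set()
    links = []
    for href in HREF_RE.findall(html):
        absolute = urljoin(base_url, href.strip())  # resolve relative links
        if not absolute.startswith(("http://", "https://")):
            continue  # skip javascript:, mailto:, and similar schemes
        url = normalize_url(absolute)
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


page = (
    '<a href="/news/1.html#top">a</a> '
    '<a href="HTTP://Example.COM:80/news/1.html">b</a> '
    '<a href="javascript:void(0)">c</a>'
)
print(extract_links(page, "http://example.com/index.html"))
```

Normalizing before the duplicate check means that the relative link and the differently written absolute link in the sample page are treated as the same URL, so a crawler built on this would fetch that page only once.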
The system uses a MySQL database named MSD0, which has three main data tables: final, news, and AdminInfo.

The overall structure of the system comprises five modules: the system login interface, the data capture module, the data access module, the data processing module, and the add-URL module. After introducing the overall design, this paper describes in detail the design and implementation of the data processing module, the add-URL module, and the information collection module. The information collection module is the core of the system: it collects information from different sites according to the user's choice of collection source and sampling depth, and displays the collected results at the same time. The data processing module performs word segmentation and part-of-speech tagging according to users' needs. The add-URL module lets users add or delete URLs at any time.

This paper also gives a detailed demonstration using Sina News, Tencent News, Netease, and Sohu News as examples. With these four news websites as test sites and “primary and secondary school textbooks” as the acquisition theme, the system is tested and analyzed. The tests examine the system's acquisition speed and accuracy rate and show that it captures ordinary static web pages well and that its speed is relatively fast.
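
Collection driven by a user-chosen source and sampling depth can be pictured as a depth-limited traversal from seed URLs. The sketch below is in Python, not the thesis's C#, and rests on several assumptions: that "sampling depth" means the maximum number of link hops from a seed, that the traversal is breadth-first, and that pages are fetched through an injected `get_links` function. The `crawl` function and the `FAKE_SITE` link graph are hypothetical and stand in for real network access.

```python
from collections import deque


def crawl(seeds, max_depth, get_links):
    """Visit pages breadth-first from the seeds, following links at most
    max_depth hops. Returns (url, depth) pairs in visiting order."""
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    visited = set(seeds)
    queue = deque((url, 0) for url in seeds)
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append((url, depth))
        if depth == max_depth:
            continue  # depth limit reached: do not expand this page
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append((link, depth + 1))
    return order


# Hypothetical in-memory link graph used instead of real HTTP requests.
FAKE_SITE = {
    "http://news.example/": ["http://news.example/a", "http://news.example/b"],
    "http://news.example/a": ["http://news.example/c"],
}

for url, depth in crawl(["http://news.example/"], 1, lambda u: FAKE_SITE.get(u, [])):
    print(depth, url)
```

Passing the link source in as a function keeps the traversal independent of any one site, which is one way a channel could be added or removed without touching the crawl loop.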

Related Dissertations

  1. Data Collecting and Processing of Multi-Linear-CCD Visual Measuring System,TP274.2
  2. Application of the Modern Surveying Data Processing Technology in Estimation of Energy Demand,P25
  3. Study the Methods on City DLG Data Processing and Data Loading,P208
  4. Radar data processing research and its software implementation,TN957.52
  5. Fractal Processing of the Geological Exploration Data and Its Application in the Tongling-Anqing District,P618.41
  6. Study on Filtering Algorithm of the Aerial Gravity Measurement Data Processing,P223
  7. Research on the Inspection and the Application of GPS Kenimatic Surveying Technology in the Project of FAST,P228.4
  8. Research on the Way of Solid-propellant Ducted Rocket Performance Estimate and Simulation of Missile Trajectory,V435.1
  9. Guizhou run ATC equipment design and implementation of integrated management system,TP311.52
  10. The goaf 3D laser scanning data analysis and processing,P234.4
  11. Design and Realization of Environmental Protection GIS,P208
  12. Investigation on the Theory and Application of GPS Fitting Height Model,P228.4
  13. Research on DEM Generation Algorithm of Distributed-based InSAR,P225.1
  14. The artillery range weapons geodesic Security data calculation method and processing system,P228.4
  15. Quality Control of Baseline Solution and Reliable Examination of Datum Points in GPS Network,P228.4
  16. RTK operating system and in the city of Surveying Engineering Applications,P228.4
  17. Research of Mass Data Processing in the Telecom Business Analysis Support System (BASS) Based on Cloud Computing Platform,TP274
  18. Test indicators of different types of dynamic reservoir characterization method applied research,P618.13
  19. Research on the Methods and Application of Building Urban Cadastral Database,P208
  20. Application Analysis of GAMIT and Bernese Software in Precise GPS Data Processing,P228.4
  21. Research and Development of the Real-time Monitoring System of Ship’s Motions and Stresses,U661.7

CLC: Industrial Technology > Automation technology, computer technology > Automation technology and equipment > Automation systems > Data processing, data processing system > Data collection and processing systems