
Design and Implementation of a Distributed Microblog Crawling System

Author: YangYiFan
Advisor: HeiXiaoJun
School: Huazhong University of Science and Technology
Major: Communication and Information System
Keywords: microblog; open platform; login simulation; distributed system; data collection
CLC: TP393.092
Type: Master's thesis
Year: 2013

Abstract


As a rising Internet platform, microblogging has greatly influenced how people use media and how information propagates. It has become a very important media platform, with the most up-to-the-minute news and the most active users of any social media. By the end of December 2012, the number of microblog users in China had reached 309 million, or 54.7 percent of all Internet users in China. Research on microblogs is significant to both society and academia: it helps in understanding trends in public opinion, tracking hot topics, and identifying social groups in social networking services. All of these studies require large amounts of microblog data.

Although many organizations already focus on microblog data collection, there is still no collection method as mature as those for traditional Internet applications. Research on microblog data collection is therefore of great significance. This research designs and implements a distributed microblog crawling system, comprising the following parts:

1) Designing and implementing a method of collecting microblog data through the application programming interface (API) of the open platform, focusing mainly on the study and use of the open platform's authorization mechanism and programming interface.
2) Designing and implementing a method of collecting microblog data through login simulation and web page parsing, focusing mainly on the study and use of single sign-on and web page structure.
3) Combining the two methods above, designing the overall framework, modules, and database, and implementing an efficient and extensible microblog data collection system using a distributed strategy.

Using this system, the user simply enters the microblog accounts to be collected and selects the types of data to collect, and the results are returned quickly. The collection rate can also be adjusted conveniently by changing the number of crawlers. Functional testing and data acquisition rate testing show that the system is stable and efficient in microblog data collection and supports dynamic extension. It lays a solid foundation for research carried out on microblog data.
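
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: a shared queue of collection tasks, a configurable number of crawlers, and two collection methods (open platform API, and login simulation with page parsing). Every name here (`Task`, `collect_via_api`, `collect_via_web`, `API_TYPES`, `run_crawlers`) is hypothetical, the two collectors are placeholders that make no network requests, and threads in one process stand in for the distributed crawler nodes of the actual system.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    account: str    # microblog account to collect
    data_type: str  # kind of data requested, e.g. "posts"


def collect_via_api(task):
    """Placeholder for collection through the open platform API."""
    return {"account": task.account, "type": task.data_type, "source": "api"}


def collect_via_web(task):
    """Placeholder for collection via login simulation and page parsing."""
    return {"account": task.account, "type": task.data_type, "source": "web"}


# Assumption for illustration: these data types are served by the API,
# and everything else falls back to the web-parsing method.
API_TYPES = {"profile", "posts"}


def run_crawlers(tasks, num_crawlers):
    """Drain the task queue with num_crawlers workers and return all records."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # no work left for this crawler
            if task.data_type in API_TYPES:
                record = collect_via_api(task)
            else:
                record = collect_via_web(task)
            with lock:
                results.append(record)

    threads = [threading.Thread(target=worker) for _ in range(num_crawlers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    demo_tasks = [
        Task("user_a", "posts"),
        Task("user_a", "followers"),
        Task("user_b", "profile"),
    ]
    for record in run_crawlers(demo_tasks, num_crawlers=2):
        print(record)
```

In this sketch, changing `num_crawlers` is the only knob needed to change how many workers pull from the queue, which mirrors the abstract's point that the collection rate is adjusted by changing the number of crawlers. A real deployment would replace the in-process queue with a store shared across machines.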

CLC classification: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer applications > Computer network > General issues > The application of computer network > Web browser