
Design and Realization of Parallel File IO Based on Hadoop Distributed File System

Author: JinSongChang
Tutor: FangBinXing;YangShuQiang
School: National University of Defense Science and Technology
Course: Computer Science and Technology
Keywords: Massive data management; Distributed file system; Hadoop; Parallel file IO
CLC: TP338.6
Type: Master's thesis
Year: 2010

Abstract


With the rapid development of computer networks and their applications, and especially since Google proposed its ideas for Internet-scale mass data storage and MapReduce parallel computing, network-based data storage management and parallel analysis and processing have become a focus of both academia and industry. As one of the reference implementations of these ideas, Hadoop has attracted widespread attention. To control parallel file IO, the Hadoop Distributed File System (HDFS), the core of Hadoop, uses a lock mechanism, but it does not support multiple users reading and writing the same file in parallel. This thesis therefore proposes a parallel file IO model based on block granularity and verifies the usability of the model through experiments. The main work of this thesis is as follows:

(1) Related work on Hadoop, and on the Hadoop Distributed File System (HDFS) in particular, is analyzed in depth. Ideas for improvement are put forward to address Hadoop's deficiency in multi-user parallel file IO.

(2) Based on an analysis of the Hadoop implementation, a multi-user parallel IO model without a mutual exclusion mechanism is proposed for the distributed file system. With this model, and on the condition that the integrity guarantees for data reads are appropriately relaxed, multiple users can read and write the same file in parallel.

(3) The functions described in the model are implemented by modifying the Hadoop source code, and experiments are carried out to verify the function and performance of the model.
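The idea of block-granularity parallel IO without file-level mutual exclusion can be illustrated with a small sketch. This is not the thesis's implementation and it does not use any HDFS API: the `BlockFile` class, the `parallel_write` helper, and the 4-byte `BLOCK_SIZE` are all hypothetical names and values chosen for illustration. The sketch assumes that each writer owns a distinct block, so no lock on the whole file is taken, and that a reader accepts whatever blocks are current at read time.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # bytes per block; tiny for illustration (real HDFS blocks are far larger)


class BlockFile:
    """Toy in-memory file split into fixed-size blocks (hypothetical, not HDFS code)."""

    def __init__(self, num_blocks):
        # Each slot holds one immutable bytes object of length BLOCK_SIZE.
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        # No file-level lock: a writer replaces only the block it was given.
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = bytes(data)

    def read(self):
        # A reader concatenates the blocks that are current right now. During
        # concurrent writes it may see a mix of old and new blocks, which is
        # the relaxed read integrity this sketch assumes.
        return b"".join(self.blocks)


def parallel_write(block_file, payloads):
    """Write each (index, data) pair from its own worker thread."""
    with ThreadPoolExecutor(max_workers=len(payloads)) as pool:
        futures = [pool.submit(block_file.write_block, i, d) for i, d in payloads]
        for future in futures:
            future.result()  # re-raise any exception from a worker


f = BlockFile(3)
parallel_write(f, [(0, b"AAAA"), (1, b"BBBB"), (2, b"CCCC")])
print(f.read())  # b'AAAABBBBCCCC'
```

Because the writers touch disjoint blocks, the final content does not depend on the order in which the threads run. What the sketch gives up is a consistent view for a reader that overlaps with the writers, which corresponds to the relaxed integrity condition stated in item (2).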

Related Dissertations

  1. Research and Application of Map/Reduce Based Distributed Log Analyzer,TP311.52
  2. Design and Implementation of Online Shopping Prototype System Based on Hadoop,TP311.52
  3. The Research of Software Service Platform Based on Cloud Computing,TP311.52
  4. An Intrusion Detection System for High-Speed Networks,TP393.08
  5. Incremental Learning Method Based on Cloud Computing,TP311.13
  6. Hadoop-based video transcoding system design and implementation,TN919.81
  7. Weak consistency of distributed data maintenance strategy study,TP311.13
  8. Fault Tolerance for MapReduce in the Cloud Environment,TP302.8
  9. Cloud-based mobile data storage backup system,TP309.3
  10. Cloud storage system for mass data,TP333
  11. Massive Video Conversion Platform Design and Implementation Based on Cloud Computing,TP311.52
  12. IaaS Cloud Computing-Based Web Application Technology Research,TP393.09
  13. Study on Hadoop-based Inverted Index,TP391.3
  14. Research on Technology of Massive Data Stores Based on Cloud Computing,TP333
  15. Research on the Key Techniques of Massive Image Data Management Based on Hadoop,TP751
  16. Research on Unified Access Platform for Unstructured Data and Index Technology,TP311.52
  17. Distributed Image Search Engine Design and Implementation,TP391.41
  18. Research and Implementation of Job Scheduling Method for Multi-User MapReduce Clusters,TP311.13
  19. A Study on Word Document Decryption Using Time-memory Trade-Off Algorithm,TP391.12
  20. Research on Map-Reduce Application Based on Hadoop,TP338.8

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Electronic digital computers (discrete-action computers) > Various electronic digital computers > Parallel computers