
Research on Efficient Provenance Storage

Author: ChenLei
Advisor: TanZhiPeng
School: Huazhong University of Science and Technology
Course: Computer System Architecture
Keywords: provenance; graph; compression; storage
CLC: TP333
Type: Master's thesis
Year: 2013

Abstract


With the development of information technology, people care not only about data itself but also about its origin and evolution. This historical information about data is known as provenance. Provenance is widely used in scientific research, where data quality is extremely important. Many information systems produce and collect provenance, in fields including physical astronomy, chemistry, biology, and marine meteorology. Applications of provenance in data reconstruction, debugging and tracing, security, and search are also beginning to appear. In many current provenance systems, however, provenance occupies far more space than the data itself. This consumes excessive resources and greatly reduces the availability and efficiency of the provenance system.

To reduce the space occupied by provenance without affecting its integrity, Chapman proposed the factorization and inheritance (FAI) algorithms. FAI only extracts the common information from provenance nodes and optimizes it. The web dictionary encoding method proposed in this thesis not only extracts and optimizes common information but also optimizes the identity information of the data itself. At the same time, it mines the internal similarity of provenance nodes: a web algorithm optimizes the encoding of provenance ancestors, which further reduces the storage cost of provenance while preserving the performance of provenance queries. This method works at the micro level.

At the macro level, the quantity of provenance increases over time, so storage space and query time grow without bound. To address this problem, this thesis takes the PASS system as an example: it divides the provenance information, builds an index, and compresses the divided provenance files, and then uses the locality principle of provenance data to improve the storage and search mechanisms of PASS.

The experimental results show that the web dictionary encoding algorithm outperforms the FAI algorithms in both storage space occupancy and the query time for identity and ancestor information. In the optimization of PASS, the method of dividing the database, building an index, and compressing the divided database files outperforms the original method in both space occupancy and query time.
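
The two ideas in the abstract, extracting common information into a dictionary and dividing, indexing, and compressing provenance files, can be illustrated with a minimal Python sketch. This is not the thesis's web dictionary encoding algorithm or the PASS storage layer, neither of which is specified here. The record layout, the field names ("process", "input", "time"), and all function names are hypothetical, and timestamps are assumed to be unique and sorted in increasing order.

```python
import json
import zlib
from bisect import bisect_right


def dictionary_encode(records):
    """Replace each attribute value with a small integer code (generic
    dictionary encoding; an illustration, not the thesis's method)."""
    dictionary = {}  # value -> code, codes assigned in first-seen order
    encoded = []
    for record in records:
        row = {}
        for key, value in record.items():
            # A value seen before reuses its code; a new value gets the next one.
            row[key] = dictionary.setdefault(value, len(dictionary))
        encoded.append(row)
    # Dicts keep insertion order, so position i holds the value for code i.
    return list(dictionary), encoded


def dictionary_decode(values, encoded):
    """Rebuild the original records from the codes."""
    return [{key: values[code] for key, code in row.items()} for row in encoded]


def build_segments(records, segment_size):
    """Divide time-ordered records into segments, compress each one, and
    index every segment by its first timestamp (hypothetical "time" field)."""
    index = []     # first timestamp of each segment
    segments = []  # zlib-compressed JSON for each segment
    for start in range(0, len(records), segment_size):
        chunk = records[start:start + segment_size]
        index.append(chunk[0]["time"])
        segments.append(zlib.compress(json.dumps(chunk).encode("utf-8")))
    return index, segments


def lookup(index, segments, timestamp):
    """Decompress only the one segment that can contain the timestamp."""
    pos = bisect_right(index, timestamp) - 1
    if pos < 0:
        return []  # earlier than every stored record
    chunk = json.loads(zlib.decompress(segments[pos]).decode("utf-8"))
    return [record for record in chunk if record["time"] == timestamp]
```

In this sketch a query touches a single compressed segment rather than the whole store, which is the general reason that dividing and indexing can keep query time from growing with the total amount of provenance.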

