The Research of Distributed Indexing Scheme for Large-scale Semantic Data Based on Linked Data

Author: LiXu
Tutor: ShiHong
School: Tianjin University
Course: Computer Science and Technology
Keywords: RDF distributed indexing MapReduce semantic ranking
CLC: TP391.1
Type: Master's thesis
Year: 2012

Abstract


With the development of the Linked Data project, an enormous amount of RDF data has been published on the Web. A scalable system is required to provide efficient retrieval over large-scale RDF data.

This thesis presents a distributed inverted indexing scheme for large-scale RDF data and adds a semantic factor to the traditional information retrieval ranking model, in order to provide users with a keyword search service over RDF data. A scalable inverted index is built on the underlying data structure of Cassandra, a distributed key-value storage system. The indexing scheme is optimized using the characteristics of the RDF data model to support fast keyword search. The loading, encoding, and indexing of RDF data are carried out simultaneously using the MapReduce framework. A query mode with secondary keywords enables the system to identify the user's query intent. Classes in OWL ontologies are encoded using ORDPATHs, so that the inheritance relationships between classes are reflected directly at the encoding level. A distributed inverted index is created for the TBox so that classes can be ranked according to the secondary keyword, and the definition and formula of the TreeRank semantic ranking algorithm are given.

In summary, the retrieval scheme creates indexes efficiently for RDF data and sorts query results using a semantic ranking algorithm, providing users with semantic retrieval services for large-scale RDF data. This work offers some guidance for Semantic Web research.
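The abstract's point that ORDPATH encoding reflects class inheritance at the encoding level can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`assign_labels`, `is_subclass`, `fmt`) and the sample class hierarchy are hypothetical, and the hierarchy is assumed to be a tree (one parent per class), which OWL does not require. The sketch uses only the initial-load part of ORDPATH, where children receive odd ordinals, and shows that a subclass test reduces to a label-prefix comparison.

```python
from collections import defaultdict


def assign_labels(subclass_of, root):
    """Assign ORDPATH-style labels to a class tree.

    subclass_of maps each class name to its single parent (tree assumed).
    Children receive odd ordinals 1, 3, 5, ... under the parent's label,
    mirroring ORDPATH's initial load; even ordinals are left unused here.
    """
    children = defaultdict(list)
    for child, parent in subclass_of.items():
        children[parent].append(child)

    labels = {root: (1,)}
    stack = [root]
    while stack:
        node = stack.pop()
        # Sort children so the labelling is deterministic.
        for i, child in enumerate(sorted(children[node])):
            labels[child] = labels[node] + (2 * i + 1,)
            stack.append(child)
    return labels


def is_subclass(labels, a, b):
    """True if a equals b or descends from b (b's label is a prefix of a's)."""
    la, lb = labels[a], labels[b]
    return la[:len(lb)] == lb


def fmt(label):
    """Render a label tuple in dotted form, e.g. (1, 1, 3) -> '1.1.3'."""
    return ".".join(map(str, label))


# Hypothetical class hierarchy, for illustration only.
hierarchy = {
    "Agent": "Thing",
    "Person": "Agent",
    "Organization": "Agent",
    "Place": "Thing",
}
labels = assign_labels(hierarchy, "Thing")

print(fmt(labels["Person"]))                    # 1.1.3
print(is_subclass(labels, "Person", "Agent"))   # True
print(is_subclass(labels, "Person", "Place"))   # False
```

Labels are kept as integer tuples rather than dotted strings so that the prefix test compares whole components; with strings, `1.1` would wrongly appear to be a prefix of `1.11`.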

Related Dissertations

  1. Design and Research of a Tree Data Structure for Mass Data’s Comprehensive Evaluation,TP311.12
  2. Research on Mapping RDF/RDFS to Relational Database Schema,TP311.13
  3. Research on Mapping Relational Database to RDF (S),TP311.13
  4. Molecular Simulation of the Miscibility for Polymer Blends,O631.3
  5. 3D Mannequins Generating Engine Based on eMTM with MapReduce,TP391.41
  6. Storage Optimized Model Based RDF Data Query Mechanism,TP311.13
  7. An Intrusion Detection System for High-Speed Networks,TP393.08
  8. The Research of Text Classification Based on Hadoop,TP391.1
  9. Research and Implementation on a Distributed Service Registry Based on HADOOP Platform,TP393.09
  10. Research of Task-level Data Processing Based on Multicore CPU and Test of Its Performance on Cluster Platform,TP274
  11. Hadoop data center deployment and tracking systems research,TP308
  12. Fault Tolerance for MapReduce in the Cloud Environment,TP302.8
  13. Plugin-based semantic data visualization system and its applications,TP391.41
  14. Distributed Image Management System Design and Implementation,TP311.52
  15. A scalable prototype design and implementation of MapReduce,TP311.52
  16. The Research of Distributed Text-based Data Filtering Technology and System Implementation Based on MapReduce,TP391.1
  17. Large-scale approximation paragraph fingerprint - based page detection algorithm research,TP393.092
  18. The Optimization of High Performance MapReduce FairScheduler and the Implementation on Simulator of Huge Scale Cluster,TP311.13
  19. Research and Application of column storage management technology based on RFID data,TP315
  20. Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture,TP338
  21. Information Flow Control Model in Distributed Systems,TP316.4

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer applications > Information processing > Text processing