
Research on Parallel Frequent Graph Pattern Mining

Author: LiuWei
Tutor: GaoHong
School: Harbin Institute of Technology
Course: Computer Science and Technology
Keywords: frequent subgraph pattern mining; frequent closed graph pattern mining; parallel algorithm; dynamic load balancing
CLC: TP311.13
Type: Master's thesis
Year: 2008

Abstract


As a general data structure, graphs have become increasingly important in modeling sophisticated structures and their interactions, with broad applications including chemical informatics, bioinformatics, computer vision, video indexing, text retrieval, and Web analysis. Mining frequent subgraph patterns for further characterization, discrimination, classification, and cluster analysis has become an important task. Among the various kinds of graph patterns, frequent substructures are the very basic patterns that can be discovered in a collection of graphs. They are useful for characterizing graph sets, discriminating between different groups of graphs, classifying and clustering graphs, building graph indices, and facilitating similarity search in graph databases. Recent studies have developed several graph mining methods and applied them to the discovery of interesting patterns in various applications. For example, there have been reports on the discovery of active chemical structures in HIV-screening datasets by contrasting the support of frequent graphs between different classes. However, frequent graph mining algorithms do not perform well when the minimum support is very low. We present parallel graph mining algorithms based on a cluster parallel environment. The main results are as follows:

Mining frequent graph patterns plays an important role in data mining. Based on the ideas of the gSpan algorithm, this thesis proposes a parallel frequent graph pattern mining algorithm for clusters that uses a dynamic load balancing strategy. The algorithm implements parallel frequent graph pattern mining by splitting the DFS lexicographic tree, maintaining an overload queue, and restricting the task granularity.
Theoretical analysis and experimental results show that our parallel frequent graph pattern mining method improves performance remarkably. For CloseGraph, an algorithm that mines frequent closed graph patterns, we give a method that replaces the detection of early-termination failure. We implement CloseGraph and present a parallel frequent closed graph mining algorithm. The experimental results confirm the analysis and show that the algorithm performs well.
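
The load balancing idea in the abstract (split the search tree into subtrees, hand them out through a shared queue, and stop splitting below a chosen granularity) can be illustrated with a small Python sketch. This is not the thesis's algorithm: each "graph" is simplified to a set of uniquely labeled edges, so a pattern is just an ordered tuple of edges and no DFS codes or isomorphism tests are needed. The names `mine_parallel`, `split_depth`, and the shared `tasks` queue are assumptions made for illustration, and Python threads stand in for cluster nodes, so the sketch shows the scheduling only, not a real speedup.

```python
import queue
import threading


def support(pattern, database):
    """Count the graphs (edge sets) that contain every edge of the pattern."""
    return sum(1 for graph in database if all(e in graph for e in pattern))


def mine_parallel(database, min_support, num_workers=4, split_depth=1):
    """Mine frequent edge sets with a shared task queue (illustrative only).

    split_depth is a hypothetical granularity limit: patterns shorter than
    split_depth are split into child tasks that any worker may take, while
    longer patterns are mined to completion by the worker that holds them.
    """
    edges = sorted(set().union(*database))
    tasks = queue.Queue()      # shared pool of subtree roots
    frequent = {}              # pattern -> support
    lock = threading.Lock()

    def expand(pattern):
        # Extend only with edges that sort after the last one, so every
        # pattern is generated exactly once (a lexicographic search tree).
        start = edges.index(pattern[-1]) + 1 if pattern else 0
        children = []
        for edge in edges[start:]:
            child = pattern + (edge,)
            count = support(child, database)
            if count >= min_support:
                with lock:
                    frequent[child] = count
                children.append(child)
        return children

    def mine_locally(pattern):
        for child in expand(pattern):
            mine_locally(child)

    def worker():
        while True:
            pattern = tasks.get()
            try:
                if pattern is None:          # sentinel: no more work
                    return
                if len(pattern) < split_depth:
                    for child in expand(pattern):
                        tasks.put(child)     # coarse subtree: share it
                else:
                    mine_locally(pattern)    # fine subtree: keep it local
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    tasks.put(())              # the root of the search tree
    tasks.join()               # children are queued before a parent is done
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return frequent


# Toy database: four graphs, each given as a set of labeled edges.
DATABASE = [
    frozenset({"a-b", "b-c", "c-d"}),
    frozenset({"a-b", "b-c"}),
    frozenset({"a-b", "c-d"}),
    frozenset({"b-c", "c-d"}),
]

if __name__ == "__main__":
    for pattern, count in sorted(mine_parallel(DATABASE, 2).items()):
        print(pattern, count)
```

Raising `split_depth` produces more, smaller tasks (better balance, more queue traffic); setting it to 0 leaves the whole tree with a single worker, which is a convenient sequential baseline for checking results.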

Related Dissertations

  1. Visual Feedback and Memory Behavior Based GPU Parallel Ant Colony Algorithm,TP301.6
  2. Dynamic load balancing technology based training system design and implementation,TP311.52
  3. Research on Key Techniques of High Productivity GPGPU Architecture,TP391.41
  4. Parallel Algorithms Research of Particle Transport on Heterogeneous Architecture,TP338
  5. Research on Petri Net Based Dynamic Load Balancing Double-Decked Scheduling Model,TP311.52
  6. Research and Implementation of Dynamic Distributed Strategy Load Balancing of Grid Service,TP393.01
  7. Dynamic Load Balancing with Uncertain Factor Based on Cluster,TP338
  8. The Study of Incomplete Projection CT Reconstruction Based on Gray System,TP391.41
  9. Electromagnetic particle simulation software parallel algorithm,O53
  10. Implementation and Optimization of the Missing Call Notice System in the Mobile Communication Network,TN929.5
  11. The Research of Architecture and Very Important Technologies for Parallel Graphics Rendering System,TP338.6
  12. The Research of Dynamic Load Balancing Strategy with PVM,TP338.6
  13. Studying on Parallel Clustering Algorithms Based on the Density,TP301.6
  14. Large-scale databases, association rule mining algorithm research,TP311.13
  15. Study on Parallel Algorithms for Approximate String Matching with Single Pattern and Single Text on Heterogeneous Cluster Computing Systems,TP301.6
  16. Based on Dynamic Communication Contention List Scheduling Algorithm for Arbitrary Network System,TP393.01
  17. Research of Genetic Algorithm for Allied Vehicle Routing Problems with Transfer Stations,TP18
  18. Research on Compression Algorithm of Seismic Data and Parallelization Based on Wavelet Transform,TN911.7
  19. Research of K-means Algorithm and Parallelism Based on Hybrid Particle Swarm Optimization,TP301.6
  20. Research and Implementation of Parallel Logic Simulation System Based on VHDL,TP391.9

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer software > Program design, software engineering > Programming > Database theory and systems