
Research on Clustering Algorithm Based on Genetic Algorithm and Rough Set Theory

Author: HongLiangLiang
Tutor: LuoKe
School: Changsha University of Science and Technology
Course: Communication and Information System
Keywords: data mining, K-means clustering algorithm, genetic algorithm, rough set, incremental, categorical data
CLC: TP18
Type: Master's thesis
Year: 2011

Abstract


With the rapid development of computer and database technology, large amounts of data have been produced in many fields. Much important information is hidden in these data, and people want to analyze them in order to extract useful knowledge. Data mining was proposed to meet this need, and it is now one of the most active frontiers of database research and information-based decision making. Cluster analysis is an important branch of data mining; its basic purpose is to discover the natural grouping of the data by analyzing the similarity between data objects. This thesis discusses a clustering algorithm and its incremental variant, both based on the genetic algorithm and rough set theory, and then discusses a clustering algorithm for categorical data. The main research is as follows:

  1. The advantages and disadvantages of the existing rough K-means clustering algorithm are analyzed. Drawing on the evolutionary search of the genetic algorithm and on the maximum-minimum distance algorithm, an optimized rough K-means method is proposed. The algorithm determines the initial centers dynamically rather than randomly, and it handles boundary objects well. Experimental results show that the algorithm is effective and correct.
  2. The advantages and disadvantages of existing non-incremental rough clustering algorithms are analyzed. Based on the idea of incremental processing and the idea of neighbors, an incremental clustering method is proposed. Experiments show that the algorithm makes full use of previous mining results, which improves both the utilization of existing information and clustering efficiency, and that it can handle large data sets in dynamic environments.
  3. An efficient clustering method for categorical data is proposed. It extends the K-means algorithm to the categorical domain, overcoming the limitation that the traditional K-means algorithm can handle only numerical data. A new distance metric is defined from the data distribution associated with each value of each categorical attribute, combined with the vertical and horizontal distribution of the data, to measure the difference between a data object and a class. Experiments show that this method can find the intrinsic relationship between different values of the same attribute and can measure the difference between objects effectively.
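To make the first contribution more concrete, the following is a minimal Python sketch of two of the ingredients the abstract names: maximum-minimum distance selection of initial centers, and a rough K-means style assignment that separates boundary objects from objects that clearly belong to one cluster. It is not the thesis's algorithm. The genetic algorithm step is omitted, and the function names, the choice of the first point as the starting center, and the ratio threshold of 1.3 are all assumptions made for illustration.

```python
import math


def max_min_centers(points, k):
    """Pick k initial centers with the maximum-minimum distance heuristic.

    Assumption: the first point is used as the first center, so the
    selection is deterministic rather than random.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Choose the point whose distance to its nearest center is largest.
        farthest = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centers),
        )
        centers.append(farthest)
    return centers


def rough_assign(points, centers, threshold=1.3):
    """Assign points to lower and upper approximations of each cluster.

    Assumption: a point is a boundary object when some other center is
    within `threshold` times the distance to its nearest center. Boundary
    objects go only into upper approximations; all other points go into
    both the lower and upper approximation of their nearest cluster.
    """
    lower = [[] for _ in centers]
    upper = [[] for _ in centers]
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        nearest = min(range(len(centers)), key=dists.__getitem__)
        close = [
            j for j in range(len(centers))
            if j != nearest and dists[j] <= threshold * dists[nearest]
        ]
        if close:
            for j in [nearest] + close:
                upper[j].append(p)
        else:
            lower[nearest].append(p)
            upper[nearest].append(p)
    return lower, upper


# Hypothetical toy data: two tight groups and one point halfway between.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5.5)]
centers = max_min_centers(points, 2)
lower, upper = rough_assign(points, centers)
```

On this toy data the point (5, 5.5) is equidistant from both centers, so it lands in both upper approximations and in neither lower approximation. A full rough K-means would then recompute each center from a weighted combination of its lower approximation and boundary region and repeat until the centers stabilize.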

Related Dissertations

  1. Development of the Platform for Compressor Optimization Design and Aerodynamic Optimization Design in the Transonic Compressor,TH45
  2. Fault Diagnosis Method Based on Support Vector Machine,TP18
  3. A Study on Healthcare Product Marketing Based on Data Mining Technology,F426.72
  4. Gao Zhong-ying academic thought and experience and use of Bufei Decoction treatment of common diseases of the respiratory system drug law,R249.2
  5. Bing- thick academic thought and clinical experience and empirical studies apply to turtle soups treatment of chronic kidney disease,R249.2
  6. The Study of Incremental Democratic Research in the Perspective of Socialist Harmonious Society,D621
  7. The Application of Fuzzy Comprehensive Evaluation Based on Genetic Algorithm in Vocational Evaluation of Classroom Teaching,G712
  8. Study on Taste Characteristic of Taste Peptide Enzymatic Production from Oyster Base on A Neural Network Method,TS254.4
  9. Design and Realization of the Magnetic Antenna in MW and SW Bands Based on Genetic Algorithm,TN820
  10. Citrus Image Segmentation Based on Genetic Algorithm,TP391.41
  11. Research of Scheduling Algorithm Based on Hybrid Adaptive Genetic Algorithm in Computing Grid,TP393.09
  12. Context of Globalization of Contemporary China’s Development Road,D616
  13. Public Transport Optimal Dispatching Based on the Genetic-Newton Algorithm,TP18
  14. BP network optimization based on genetic algorithm optimization of the biodiesel process,TE667
  15. The Design and Implementation of Bicluster Data Analyzing Software,TP311.52
  16. The Research on Texture Synthesis Technology from Cloud Theory & Been Evolution Genetic Algorithm,TP391.41
  17. Research on Clustering Algorithm Based on Mutation Particle Swarm Optimization,TP18
  18. Research on Fuzzy C-Mean Clustering Algorithm Based on Particle Swarm Optimization and Shuffled Frog Leaping Algorithm,TP18
  19. Research on Traceability and Incremental Consistency of the MDA Model Transformation,TP311.5
  20. Based on Rough Set of Urban Areas When Traffic Green Control System Research,TP18

CLC: Industrial Technology > Automation technology, computer technology > Automated basic theory > Artificial intelligence theory