
Research of Database Access Log Based on Weka

Author: FanALin
Advisor: RenShuHua
School: Dalian University of Technology
Course: Pattern Recognition and Intelligent Systems
Keywords: data mining; cluster analysis; k-means; coefficient of variation; Weka
CLC: TP311.13
Type: Master's thesis
Year: 2012

Abstract


As society becomes increasingly information-driven, databases have accumulated large amounts of data, and how to extract implicit, previously unknown, and potentially useful information from them with data mining technology has become a research focus. This paper describes Weka, an open data mining platform that collects many machine learning algorithms capable of performing data mining tasks. It presents a detailed study of database access log preprocessing and cluster analysis methods, and carries out mining of a university library's lending information log. CV-k-means, a k-means clustering algorithm based on the coefficient of variation, is packaged into the Weka platform. The improved clustering algorithm is then used to analyze the library's lending information and uncover implicit knowledge that can serve as reference data for the library's purchasing department. The main content and contributions of this paper are focused on the following aspects:

1. The performance of the k-means clustering algorithm depends on the choice of distance metric. The Euclidean distance is commonly chosen as the similarity measure in k-means, but it treats all features equally and does not accurately reflect the dissimilarity among samples. A k-means clustering algorithm based on the coefficient of variation (CV-k-means) is proposed in this paper to address this problem. The CV-k-means algorithm uses a weight vector of variation coefficients to reduce the effect of irrelevant features. The experimental results show that the proposed algorithm generates better clustering results than the k-means algorithm.

2. As an open data mining platform, Weka collects many machine learning algorithms capable of performing data mining tasks. However, as the real-world problems to be solved become more complicated, a variety of data mining algorithms are showing their limitations. In this paper, the CV-k-means algorithm is integrated into Weka to obtain a new, customized data mining platform, in order to ease the tension between general-purpose data mining tools and the needs of mining in specialized domains.

3. The lending information of Dalian Polytechnic University is analyzed using the improved, customized Weka data mining platform. CLC classes are taken as the objects to be clustered, and three attributes (lending count, renewal count, and average lending time) take part in the clustering computation on the preprocessed data. Analysis of the clustering results yields the degree of reader interest, which is then used to provide procurement recommendations for the library.
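The abstract does not give the formulas behind CV-k-means, so the following is only a minimal sketch of the general idea, written in plain Python rather than as a Weka clusterer. It assumes that each feature's weight is its coefficient of variation (standard deviation divided by the absolute mean), normalized so the weights sum to 1, and that the weights scale the squared differences inside the Euclidean distance. The function names, the evenly spaced choice of initial centroids, and the sample rows are all hypothetical and are not taken from the thesis.

```python
import math


def cv_weights(rows):
    """One weight per feature: coefficient of variation (std / |mean|),
    normalized to sum to 1. ASSUMPTION: the thesis may weight differently."""
    n, dims = len(rows), len(rows[0])
    cvs = []
    for j in range(dims):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        cvs.append(std / abs(mean) if mean != 0 else 0.0)
    total = sum(cvs)
    if total == 0:
        # All features are constant: fall back to equal weights.
        return [1.0 / dims] * dims
    return [c / total for c in cvs]


def weighted_distance(a, b, weights):
    """Euclidean distance with each squared difference scaled by a weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def cv_kmeans(rows, k, max_iterations=100):
    """Plain k-means loop that uses the CV-weighted distance.
    ASSUMPTION: initial centroids are evenly spaced rows (deterministic)."""
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of rows")
    weights = cv_weights(rows)
    centroids = [list(rows[i * len(rows) // k]) for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: nearest centroid under the weighted distance.
        new_labels = [
            min(range(k), key=lambda c: weighted_distance(r, centroids[c], weights))
            for r in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids, weights


# Hypothetical rows, one per CLC class:
# [lend_times, renew_times, average_lending_days]
sample = [
    [120, 30, 25], [110, 28, 27], [130, 35, 24],
    [10, 2, 26], [12, 1, 25], [8, 3, 27],
]
labels, centroids, weights = cv_kmeans(sample, k=2)
print("weights:", weights)
print("labels:", labels)
```

In this sketch a feature that barely varies relative to its mean, such as the average lending time in the sample rows, receives a small weight and so contributes little to the distance. Whether that matches the weighting direction used in the thesis cannot be confirmed from the abstract alone.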


CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer software > Program design, software engineering > Programming > Database theory and systems