
Association Rule Data Mining Algorithms Based on Maximum Frequent Sets

Author: SongWeiLin
Tutor: XuHuiMin
School: Beijing University of Posts and Telecommunications
Course: Circuits and Systems
Keywords: data mining; KDD (Knowledge Discovery in Databases); association rules; sequential patterns; the DMFIA algorithm; the ISS_DM algorithm; maximum frequent item sets; maximum frequent item sequence sets; maximum frequent customer sequence sets
CLC: TP311.13
Type: PhD thesis
Year: 2006

Abstract


Data mining is a technique for analyzing and understanding large volumes of source data and revealing the knowledge hidden in them. It is regarded as an important advance in information processing. Over the past decade or more, the concepts and techniques of data mining have been introduced, and in recent years some of them have been studied in greater depth. Like other new technologies, however, data mining must mature gradually through several stages: concept formation, recognition of its importance, wide discussion, a few trial uses, and finally large-scale application. Most experts consider that it is currently in the wide-discussion stage, so further theoretical study and algorithm exploration are still needed.

Association rule mining is an important branch of data mining. It has produced many valuable results, but a good number of challenging problems remain open. For large databases, improving mining performance and precision is essential, so much current research on association rule mining focuses on new mining theories, new algorithms, and improvements to existing methods.

This thesis surveys the current state and development trends of data mining technology and association rules, and builds its work on the maximum frequent item sets of association rules. Drawing on the ideas of DMFIA, an FP-tree-based algorithm for mining maximal frequent item sets, it proposes a new maximum frequent item set algorithm for customer databases that applies different data analysis methods and adjusts the minimum support count flexibly. The new algorithm can analyze data in different ways and effectively reduces execution time when mining very large data sets, which improves mining efficiency and meets a wider range of user requirements.
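The notion of a maximum (maximal) frequent item set can be illustrated with a small sketch. The code below is not DMFIA and does not use an FP-tree; it is a brute-force enumeration, assuming each transaction is a Python set of items and that support is counted as the number of transactions containing the item set. The function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset with count >= min_support.

    Brute-force illustration only (assumption: small inputs); it is not DMFIA.
    """
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support count = number of transactions that contain the candidate.
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                frequent[cset] = count
                found_any = True
        if not found_any:
            # Every subset of a frequent itemset is frequent, so if no itemset
            # of this size is frequent, no larger one can be.
            break
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return [s for s in frequent if not any(s < other for other in frequent)]
```

Because every subset of a frequent item set is itself frequent, the maximal sets returned by `maximal_frequent_itemsets` implicitly describe all frequent item sets, which is why mining them directly can save work on large databases.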
Further analysis shows that neither the DMFIA algorithm nor the new algorithm above can effectively mine a customer-sequence view of a database. Drawing on the ideas of these algorithms, the thesis proposes another new algorithm that incorporates sequential patterns: the item-level maximum frequent sequence set algorithm based on sequential patterns. The algorithm iterates over the items whose support count is not less than the minimum support count (s), taking them in ascending order of support count. In each iteration, the elements of MFCS_d that contain transaction items whose support count is not less than that of the frequent item currently being processed are selected to form MFCSk. The support count (flag) of each element of MFCSk is then computed from the backup table of MFCS. If flag >= s' (usually s = s'), the element (a customer sequence set) is output to the maximum frequent sequence set MFS_d. Otherwise, the transactions of the customer sequence sets are combined with one another to form a new set, whose elements are then processed in further iterations. The algorithm finishes, and its execution time is determined, when MFCS_d becomes empty.

Building on the item-level algorithm, the thesis also proposes the transaction-level maximum frequent sequence set algorithm based on sequential patterns. Here the iteration runs over the transactions of each customer sequence set whose support count is not less than the minimum support count (s), again in ascending order of support count. The remaining processing is largely the same as in the item-level algorithm.
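Two ingredients of the description above, counting support over customer sequences and visiting frequent items in ascending order of support count, can be sketched as follows. This is a minimal illustration and not the thesis's algorithm: it assumes a customer sequence is an ordered list of transactions (each a Python set), that a pattern is supported by a customer when its item sets occur in order as subsets of distinct transactions, and that an item's support count is the number of customers whose sequence contains it. All function names are hypothetical.

```python
def contains(customer_sequence, pattern):
    """True if the pattern's itemsets occur, in order, as subsets of
    distinct transactions of the customer sequence (greedy left-to-right match)."""
    pos = 0
    for transaction in customer_sequence:
        if pos < len(pattern) and pattern[pos] <= transaction:
            pos += 1
    return pos == len(pattern)


def sequence_support(database, pattern):
    """Number of customer sequences in the database that contain the pattern."""
    return sum(1 for seq in database if contains(seq, pattern))


def items_by_ascending_support(database, min_support):
    """Items whose support count is >= min_support, smallest support first.

    Assumption: an item's support count is the number of customers whose
    sequence contains it at least once. Ties are broken by item order.
    """
    counts = {}
    for seq in database:
        # Count each item once per customer, however often it occurs.
        for item in set().union(*seq):
            counts[item] = counts.get(item, 0) + 1
    frequent = [(count, item) for item, count in counts.items() if count >= min_support]
    return [item for count, item in sorted(frequent)]
```

Starting from the least frequent items means the candidates most likely to fail the minimum support check are examined first, which is one plausible reading of why the abstract orders items from small to large support count.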
The thesis then describes the ISS_DM algorithm for mining maximal frequent item sequence sets. Because this algorithm also cannot effectively mine a customer-sequence view of a database, the thesis proposes an improved ISS_DM algorithm that incorporates sequential patterns. Both algorithms were validated experimentally: when they are applied to the same data, the improved algorithm takes less execution time and shows good efficiency.

Finally, the thesis addresses data mining over the multidimensional model of a data warehouse. The item-level maximum frequent sequence set algorithm based on sequential patterns and the improved ISS_DM algorithm are each combined with this model, yielding versions of both algorithms built on the multidimensional data warehouse model.

In conclusion, through the study of maximum frequent item sequence sets in the DMFIA and ISS_DM algorithms, the thesis proposes a series of new algorithms. Experimental results confirm their validity and practicality. The new algorithms are novel and of theoretical value, and they show good application prospects in terms of mining efficiency and usability on large-scale databases.
