
Research and Application on Decision Tree in Data Mining

Author: SunChaoLi
Tutor: ZongPing
School: Hohai University
Course: Applied Computer Technology
Keywords: KDD (Knowledge Discovery in Databases); DM (Data Mining); Decision tree; Information Gain; Information Entropy; Attribute-value Pairs
CLC: TP311.13
Type: Master's thesis
Year: 2003
Downloads: 645
Citations: 9

Abstract


With the rapid development of database technology and the widespread use of database management systems (DBMS), people have accumulated more and more data. These huge data sets contain abundant knowledge. Although current database technology performs many functions, such as queries and statistics, with high efficiency, it still cannot discover the relationships and rules among the data, nor can it predict future trends from them. Databases hold large amounts of data, but few techniques can extract knowledge from it, so the current situation is one of "too much data, too little knowledge". In response, KDD (Knowledge Discovery in Databases) and its core technique, DM (Data Mining), have emerged.

The decision tree algorithm is one of the core algorithms of DM. It is often used to build predictive models, and it can purposefully divide large amounts of data into classes so that valuable, potential information can be found. The best-known decision tree algorithm is ID3, presented by Quinlan in 1986. It is a non-incremental algorithm that uses information entropy as the criterion for selecting attributes. Its disadvantage is that it tends to select attributes that have many values, although such attributes are not always the best. To solve this problem, we present a new approach to the ID3 algorithm, the information gain of attribute-value pairs in two levels, to optimize the decision tree. Comparing it with decision trees built by other algorithms on the same example shows that the tree built by the two-level attribute-value pair information gain algorithm is better.

We also ran tests comparing our optimized algorithm with the ID3 algorithm on the data set provided by FAMn, and carried out experiments on the standard data provided by UCI. The results show that the two-level attribute-value pair information gain algorithm does outperform the ID3 algorithm.
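The bias of ID3 toward many-valued attributes mentioned in the abstract can be illustrated with a minimal sketch of the standard entropy and information gain calculation. This is not the thesis's two-level attribute-value pair algorithm, which the abstract does not specify; the function names, the toy rows, and the attribute names `id`, `windy`, and `play` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after splitting on `attribute`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


# Hypothetical toy data: `id` is unique per row, so it has as many values as rows.
rows = [
    {"id": 1, "windy": "no", "play": "yes"},
    {"id": 2, "windy": "no", "play": "yes"},
    {"id": 3, "windy": "yes", "play": "no"},
    {"id": 4, "windy": "yes", "play": "yes"},
]

gain_id = information_gain(rows, "id", "play")
gain_windy = information_gain(rows, "windy", "play")

# ID3 picks the attribute with the highest gain at each node.
best = max(["id", "windy"], key=lambda a: information_gain(rows, a, "play"))
print(best, gain_id, gain_windy)
```

Because every value of `id` isolates a single row, each subset is pure and the split removes all of the target's entropy, so plain information gain prefers `id` even though it says nothing useful about unseen rows. This is the kind of selection the abstract describes as not always the best.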

Related Dissertations

  1. Quantitative evaluation model based on information entropy of the classroom observation,G632.4
  2. Fault Diagnosis Method Based on Support Vector Machine,TP18
  3. A Reduction Method for Artificial Neural Network Inputs Based on An Improved Genetic Algorithm,TP18
  4. Study on Virtual Logistics Alliance Risk Control,F252
  5. A Study on Satisfaction Measurement on Residential Quarters from Agriculture to Non-Agriculture in Hangzhou,F293.3
  6. Research and Application of enterprise decision support system based on data warehouse,TP311.13
  7. Design and Development of Teaching Quality Assessment System Based on Data Mining,TP311.13
  8. Cash flow -based corporate credit rating of,F275
  9. Research on the Application of Data Mining Technology in Banking CRM,F830.49
  10. The Research and Implementation of an Aircraft Simulation Training System with QoS,TP391.9
  11. The Research and Implementation of Chinese Text Classification Technology Based on Decision Tree,TP391.1
  12. Application of Data Mining Technology in Human Resources Management,TP311.13
  13. Design and Implementation of Failure Diagnosis Expert System in Military Radio Communications,TP311.52
  14. The Design and Implementation of the Data Mining System for the Officers and Men Psychological Problems,TP311.13
  15. Design and Implementation of credit risk regulatory system based on data mining,TP311.52
  16. Decision tree-based data mining algorithms and their practice,TP311.13
  17. Application and Research of Vocational College Students Psychological data mining system,TP311.13
  18. Research of Hospital Information System Based on Decision Tree,TP311.52
  19. The Application Research of Data Mining in Undergraduate Employment Information Management,TP311.13
  20. A Software Effort Estimation Method Based on Fuzzy Decision Tree,TP311.5
  21. Research on Applications of Data Mining Technology in the Banking Credit Management,TP311.13

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer software > Program design, software engineering > Programming > Database theory and systems