
Research on Improving Naive Bayes Classification Model

Author: ZhuXiaoDan
Advisor: DongHuaiLin
School: Xiamen University
Major: Computer Software and Theory
Keywords: Naive Bayes Classification Model; Validity of Single Attribute; Validity of Double Attributes
CLC: TP311.13
Type: Master's thesis
Year: 2014

Abstract


Classification is an important task in data mining. Its purpose is to construct a classification function or model that maps unclassified samples in a database to one of a given set of classes. Classification can be used to extract models that describe important data or predict data trends. The Naive Bayes classification model is one of the research hotspots among current classification algorithms; compared with other methods, it has a simple structure, high classification accuracy, and high speed. The Naive Bayes classification model is built from a training set, so noise samples in the training set reduce classification performance. Taking optimization of the training set as its research content, this dissertation proposes improved Naive Bayes classification models based on the validity of a single attribute and on the combined validity of double attributes. Noise samples in the training set are eliminated using these two validity measures, which optimizes the training set and improves classification accuracy.

The main work is as follows:

1. The basic theory of Bayes classification and the Naive Bayes classification model are introduced.
2. Several common improved Naive Bayes classification models are analyzed: Semi-Naive Bayes Classifiers (SNBC), the Bayes Belief Network (BBN), and Tree Augmented Naive Bayes (TAN).
3. Based on Bayes theory, noise in the training examples is eliminated using the validity of a single attribute, optimizing the training set before it is used to build classifiers.
4. Building on the Naive Bayes classification model based on the validity of a single attribute, an improved model that also incorporates the validity of double attributes is proposed in order to discover and delete more noise samples.

Experimental results on large-scale data show that the proposed methods are feasible and effectively improve classification accuracy.
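
The general idea of cleaning a training set before building a Naive Bayes classifier can be illustrated with a minimal sketch. The abstract does not define the validity measures for single or double attributes, so the code below does not reproduce them. As a labeled stand-in, it drops every training sample that a Naive Bayes model trained on the full set misclassifies. The function names (train_nb, predict_nb, filter_noise), the Laplace smoothing, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def train_nb(samples, labels):
    """Fit a categorical Naive Bayes model from tuples of attribute values."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> distinct values seen
    for x, y in zip(samples, labels):
        for j, v in enumerate(x):
            value_counts[(j, y)][v] += 1
            domains[j].add(v)
    return class_counts, value_counts, domains


def predict_nb(model, x):
    """Return the class with the highest log-posterior under the independence assumption."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior
        for j, v in enumerate(x):
            count = value_counts.get((j, c), Counter())[v]
            # Laplace smoothing (an assumption, not taken from the thesis)
            score += math.log((count + 1) / (n_c + len(domains.get(j, ()))))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


def filter_noise(samples, labels):
    """Stand-in noise filter (NOT the thesis's validity measure):
    drop samples whose label disagrees with the model's own prediction."""
    model = train_nb(samples, labels)
    kept = [(x, y) for x, y in zip(samples, labels) if predict_nb(model, x) == y]
    return [x for x, _ in kept], [y for _, y in kept]


# Toy data invented for illustration: the last sample is deliberately mislabeled.
train_x = [("sunny", "no")] * 3 + [("rainy", "yes")] * 3 + [("sunny", "no")]
train_y = ["play"] * 3 + ["stay"] * 3 + ["stay"]

clean_x, clean_y = filter_noise(train_x, train_y)
print(len(train_x), "->", len(clean_x))  # 7 -> 6
```

On this toy set the filter removes only the mislabeled sample, and a classifier retrained on the cleaned set is no longer pulled toward the wrong class for that attribute combination. A filter like this can also discard correct but rare samples, which is one reason a more targeted criterion, such as the attribute validity measures the dissertation proposes, may be preferable.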


CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer software > Program design, software engineering > Programming > Database theory and systems