
Research on Text Classification Based on Biomimetic Pattern Recognition

Author: HuangQiHu
Tutor: WangYuYing
School: Harbin Institute of Technology
Course: Computer Science and Technology
Keywords: Text Classification, Biomimetic Pattern Recognition, Feature Selection, Hyper Sausage Neuron Network
CLC: TP391.1
Type: Master's thesis
Year: 2008

Abstract


With the advent of the Internet era, the amount of electronic data has increased dramatically. How to obtain, manage, and make full use of text data has therefore become an urgent problem in information science. Text classification (TC), which assigns natural language texts to given topics, is an important research field of information technology. Biomimetic Pattern Recognition (BPR) is based on "matter cognition" instead of "matter classification", which brings it closer to the way human beings recognize things than traditional text classification (or traditional pattern recognition), whose main principle is "optimal separation". We therefore apply the BPR principle to text classification in this paper.

BPR is a new theory that differs from traditional pattern recognition. Its basic idea rests on the continuity, in the feature space, of the samples belonging to any one class. It identifies samples by optimally covering the high-dimensional geometrical distribution of the sample set in the feature space. This paper studies in depth the mathematical tools and the realization of BPR, and proposes a novel text classification algorithm based on the Hyper Sausage Neuron (HSN) network.

Further, we present three improvements to the new classification algorithm. First, a study of the noise and redundancy in the training data leads us to integrate a clustering method with the HSN classifier. Second, based on a study of the misidentification of border samples, we propose a k-best identification algorithm based on the HSN network. Third, we give a two-stage feature selection method to address noise in the features. Furthermore, we present an integration of HSN and SVM.

Experimental results on an English corpus show that the improved HSN classification algorithm achieves better performance than KNN and SVM. On a Chinese corpus, the improved HSN classification algorithm also outperforms KNN, and the integration of HSN and SVM performs better than either of them alone.
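
The covering idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation, which the abstract does not specify. It assumes that each class is covered by "hyper sausages", that is, all points within a fixed radius of the line segments joining consecutive training samples of that class. The function names, the sample ordering, and the single shared radius are hypothetical choices made for illustration only.

```python
import math


def point_to_segment_distance(x, a, b):
    """Euclidean distance from point x to the line segment a-b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ax = [xi - ai for ai, xi in zip(a, x)]
    ab_sq = sum(v * v for v in ab)
    if ab_sq == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Project x onto the line through a and b, then clamp to the segment.
        t = sum(p * q for p, q in zip(ax, ab)) / ab_sq
        t = max(0.0, min(1.0, t))
    closest = [ai + t * v for ai, v in zip(a, ab)]
    return math.dist(x, closest)


def class_distance(x, samples):
    """Smallest distance from x to the chain of segments joining consecutive samples."""
    if len(samples) == 1:
        return math.dist(x, samples[0])
    return min(
        point_to_segment_distance(x, a, b)
        for a, b in zip(samples, samples[1:])
    )


def classify(x, training, radius):
    """Return the label of the nearest class whose covering contains x.

    Returns None when x lies outside every covering (assumed rejection rule).
    """
    best_label, best_dist = None, radius
    for label, samples in training.items():
        d = class_distance(x, samples)
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy 2-D feature vectors; real text features would be high-dimensional.
training = {
    "sports": [(0.0, 0.0), (2.0, 0.0)],
    "finance": [(0.0, 5.0), (2.0, 5.0)],
}
print(classify((1.0, 0.5), training, radius=1.0))
print(classify((1.0, 2.5), training, radius=1.0))
```

Unlike a separating classifier, this sketch can reject a sample that falls outside every class covering, which reflects the "cognition" rather than "classification" principle described above.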

