
Entity Recognition Research and Application on Hotspot Information of Internet Web

Author: DaiSiMing
Tutor: WangZhenYu; WangFeng
School: South China University of Technology
Course: Software Engineering
Keywords: Named Entity Recognition; Rules and Probability and Statistics; Web Retrieval; Named Entity Recognition of Product
CLC: TP393.09
Type: Master's thesis
Year: 2012

Abstract


Named entity recognition is the task of identifying entities that carry a specific meaning in text, including person names, place names, organization names, and other proper nouns. With the proliferation of computers and the rapid development of the Internet, a large amount of information is now presented to people in the form of electronic documents. To meet the serious challenge posed by this information explosion, people urgently need automated tools that help them quickly find the truly important information in massive information sources, and information extraction technology came into being in response. Named entity recognition is an important part of information extraction. It can also be applied to other natural language processing tasks such as question answering, machine translation, and information retrieval, where it contributes to improving performance.

However, owing to the characteristics of the Chinese language itself, Chinese named entity recognition has remained quite difficult. Research on Chinese named entity recognition is therefore of great significance for promoting the development of other technologies and applications.

This thesis studies Chinese named entity recognition for person names, place names, organization names, and electronic products. Experiments are carried out to verify the algorithms, and applications of them are presented. The main work of this thesis is as follows:

(1) A two-pass recognition method for Chinese person names, based on rules combined with probability and statistics, is proposed. First, the method completes an initial recognition of person name entities using a person name knowledge base, lexical rules for person names, and the boundary conditions of person name entities. Second, it completes the final recognition using the boundary characteristics of person name entities and the credibility given by a statistical recognition model of person names.

(2) A recognition method for place names and organization names, based on rules and web retrieval, is proposed. The method locates the trigger positions of place and organization entities using a place name knowledge base, an organization name knowledge base, and lexical rules for place names and organization names. It then uses web retrieval to complete the recognition: place names are recognized with a strategy based on Baike retrieval, and organization names with a strategy based on Baidu retrieval. Finally, a rule-based method for recognizing organization abbreviations is proposed.

(3) Recognition of electronic product entities is completed, covering product names, product attributes, attribute values, and comments on attributes. For product names, an automatic recognition model based on self-learning from domain seed words is proposed. For product attributes, an automatic recognition method based on association probability and statistics is proposed. For attribute values, an automatic recognition method based on rules relating product attributes to the units of their values is proposed. For comments on attributes, a Chinese grammar pattern matching method based on the seed attributes of the product is proposed.
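
The two-pass person name recognition in item (1) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the surname list, the boundary cue words, the per-character probabilities, the bonus factor, and the threshold are all hypothetical values chosen only to show how a rule-based first pass and a statistical second pass might fit together.

```python
# Illustrative sketch only. Every lexicon entry and number below is a
# hypothetical placeholder, not data or parameters from the thesis.
SURNAMES = {"王", "李", "张"}            # stand-in for a person name knowledge base
RIGHT_BOUNDARY = {"说", "表示", "认为"}   # stand-in right-boundary cue words
GIVEN_CHAR_PROB = {"伟": 0.9, "明": 0.8, "芳": 0.85, "国": 0.6}  # P(char appears in a given name)
DEFAULT_PROB = 0.05                      # probability assumed for unseen characters
BOUNDARY_BONUS = 1.5                     # multiplier when a boundary cue follows the candidate
THRESHOLD = 0.5                          # minimum credibility to accept a candidate


def initial_candidates(text):
    """Pass 1 (rules): propose a surname followed by 1 or 2 characters."""
    spans = []
    for i, ch in enumerate(text):
        if ch in SURNAMES:
            for given_len in (1, 2):
                end = i + 1 + given_len
                if end <= len(text):
                    spans.append((i, end))
    return spans


def credibility(text, start, end):
    """Pass 2 (statistics): score a candidate span between 0 and 1."""
    given = text[start + 1:end]
    score = 1.0
    for ch in given:
        score *= GIVEN_CHAR_PROB.get(ch, DEFAULT_PROB)
    # Geometric mean keeps 1- and 2-character given names comparable.
    score **= 1.0 / len(given)
    # Boundary feature: a cue word right after the span raises the score.
    if any(text.startswith(cue, end) for cue in RIGHT_BOUNDARY):
        score = min(1.0, score * BOUNDARY_BONUS)
    return score


def recognize_person_names(text):
    """Run both passes and return accepted names in text order."""
    scored = [(credibility(text, s, e), s, e) for s, e in initial_candidates(text)]
    scored = [item for item in scored if item[0] >= THRESHOLD]
    scored.sort(key=lambda item: (-item[0], item[1]))
    chosen = []
    for _, s, e in scored:
        # Greedily keep the best-scoring spans that do not overlap.
        if all(e <= cs or s >= ce for cs, ce in chosen):
            chosen.append((s, e))
    return [text[s:e] for s, e in sorted(chosen)]
```

Calling `recognize_person_names("王伟明表示同意")` proposes both "王伟" and "王伟明" in the first pass, and the second pass prefers the longer span because the cue word "表示" follows it. A real system would replace the toy tables with a trained statistical model and a much larger knowledge base.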

