
On the Information Extraction of the Sudden Events

Author: YangErHong
Supervisor: ZhangPu
School: Beijing Language and Culture University
Course: Linguistics and Applied Linguistics
Keywords: Information Extraction; sudden event; Named Entity Recognition; pattern acquisition; information frame; specific information; characteristics analyzing
CLC: G202
Type: PhD thesis
Year: 2005
Downloads: 1785
Quote: 37

Abstract


With the rapid development of the Internet, we are surrounded by an immense sea of information. Obtaining accurate and valid information from this sea is the goal of Information Extraction (IE). In other words, Information Extraction involves extracting the information of interest from a mass of text and representing it in a structured format. Its basic objectives are to increase the speed and improve the quality of information processing, and ultimately to free people from the burden of intensive and inefficient text reading.

Information Extraction, Information Retrieval and Text Summarization belong to the same area of text information processing research within Natural Language Processing (NLP). Since the late 1980s, Information Extraction has been an active research topic in NLP. It has been driven to a remarkable degree by text processing programs established in the U.S. and Europe, in which Information Extraction technology and evaluation are important components. Research on Chinese Information Extraction started later and is still in the exploratory phase.

The world has been experiencing an increasing number of "sudden events". One test of an efficient government is how its organizations respond to these unexpected events. The exponential increase in the quantity of textual information held in digital archives has fuelled growing government interest in computer-assisted techniques for Information Extraction. Handling sudden events is a multifaceted effort, and one of its important tasks is collecting, categorizing, processing and disseminating event information.
The major criteria for strengthening the ability to respond to sudden events are: collecting information in a timely, objective and accurate way; extracting information efficiently and quickly; and providing full and accurate reference data.

This thesis focuses on extracting information about sudden events (Event Information Extraction), based on analyses of various press reports. The study consists of the following tasks: analyzing texts about an event to obtain its relevant characteristics; applying Named Entity Recognition methods; examining means of automatic pattern acquisition; and exploring feasible models of Event Information Extraction for acquiring essential information structures and specific information.

Information Extraction is an organic unity of resources and techniques, customized to practical uses. This research is built primarily on Part-Of-Speech (POS) tagging. Since less work on extraction has been done for Chinese than for English, there are wide gaps between the two languages in extraction accuracy, accumulated knowledge resources, and so on. Therefore, at each processing step, we carefully analyze the advantages and limitations of the existing resources and the accuracy of the extraction process, in order to lay a foundation for further research and to find ways to close the gaps in Event Information Extraction.

This research attempts:

1. To propose a practical Information Extraction model for sudden events. Starting from careful analysis of the raw data, then drawing on the related information that different media provide about the same event, and finally observing how the event develops, we explore a feasible Event Information Extraction model.
This model is grounded in analysis of the text characteristics: it applies clustering techniques to extract the event information structure automatically and then calculates property values to obtain the specific information. The method is relatively robust and can be applied to any collection of texts about a sudden event.

2. To implement an unsupervised pattern acquisition method with strong adaptability, and further identify predetermined relevant information in text
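
As a rough illustration of the idea in item 1 (cluster sentences from several reports of the same event, then settle on a value for each property), the following Python sketch may help. It is not the method used in the thesis: the Jaccard similarity, the single-pass clustering, the majority vote, the 0.4 threshold and all function names are assumptions chosen for brevity, and whitespace tokenization stands in for the Chinese word segmentation and POS tagging a real system would need.

```python
from collections import Counter


def jaccard(a, b):
    """Token-set overlap between two sentences, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sentences(sentences, threshold=0.4):
    """Single-pass clustering (an assumed stand-in for the thesis's technique).

    Each sentence joins the first cluster whose seed sentence is similar
    enough; otherwise it starts a new cluster. Returns lists of indices.
    """
    token_sets = [set(s.lower().split()) for s in sentences]
    clusters = []
    for i, tokens in enumerate(token_sets):
        for cluster in clusters:
            if jaccard(tokens, token_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster matched, so this sentence seeds a new one.
            clusters.append([i])
    return clusters


def vote_slot_value(candidates):
    """Pick the value most reports agree on (hypothetical selection rule)."""
    return Counter(candidates).most_common(1)[0][0]


# Invented example sentences, standing in for reports from different media.
reports = [
    "the earthquake killed 12 people in the city",
    "officials said rescue teams arrived on monday",
    "the earthquake killed 15 people in the city",
    "rescue teams arrived on monday officials said",
]
print(cluster_sentences(reports))            # [[0, 2], [1, 3]]
print(vote_slot_value(["12", "15", "12"]))   # 12
```

Each resulting cluster can be read as one candidate slot of the event's information structure (here, casualties and rescue), and the vote resolves disagreement between sources on a slot's value.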

Related Dissertations

  1. Research on Domain Entity Attribute and Event Extraction Technology,TP391.1
  2. Research on Temporal Information Recognition and Normalization,TP391.1
  3. Study on Growth Monitoring Technique Based on Pixel Un-Mixing Method and HJ Remote Sensing Images in Paddy Rice,S511
  4. Land Desertification in Qinghai Lake Landscape Pattern Change,X171
  5. Active faults based radar image information extraction method applied research and demonstration,P542.3
  6. Based on high-resolution remote sensing data mining houses information extraction,TP751
  7. Ontology-Based Hazard Information Extraction from Chinese Food Complaint Documents,TP391.1
  8. Tracking Events for Food Complaint Documents Based on Ontology,TP391.1
  9. Web Page Attribute Extraction Method Research,TP391.1
  10. The Research for Named Entity Recognition and Relation Extraction in Text,TP391.1
  11. The key component vertical search engine technology research,TP391.3
  12. Reptiles theme for Education News Design and Implementation,TP391.3
  13. GPU-based image search Chinese Research on key technologies of the retrieval,TP391.1
  14. Home Academic Information Extraction System,TP393.092
  15. Engineering News reported information extraction and applied research,G212
  16. Ontology-based medicine named entity recognition technology research,TP391.1
  17. CRF-based named joint extraction of entities and relationships,TP391.4
  18. Topic search engine key technology research,TP391.3
  19. Hull section robotic welding path planning and offline programming,TP242
  20. Click data and search results based on fragments excavated named entities,TP391.3
  21. Printers based on natural language HCI Research and implementation,TP11

CLC: Culture, science, education, sports > Information and knowledge dissemination > Information and communication theory > Information processing technology