Research on Domain-Adaptive Chinese Entity Relation Extraction

Author: Wang Lifeng
Advisor: Qin Bing
School: Harbin Institute of Technology
Major: Computer Science and Technology
Keywords: Relation extraction; Domain adaptation; Relation discovery; Relation seed extraction; Relation description pattern mining
CLC: TP391.1
Type: Master's thesis
Year: 2011

Abstract


With the spread of computers and the rapid growth of the Internet, the volume of available information has exploded, and how to retrieve the information users actually need from massive data, accurately and quickly, has become a topic of wide concern. The main goal of information extraction is to convert unstructured natural language text into structured or semi-structured data so that people can access key information accurately and quickly. Relation extraction, a subtask and key technology of information extraction, has become an important supporting technology for many natural language processing tasks. Traditional relation extraction methods require relation types to be defined in advance and depend on large amounts of manually annotated training data, which makes them ill-suited to processing information at Internet scale. This thesis proposes a new framework for relation extraction research and explores a domain-adaptive extraction approach that avoids human involvement as far as possible, with the aim of making relation extraction more automated and more portable.
First, an analysis of the linguistic context of relation instances shows that most semantic relations between entities are signalled by verbs and nouns in the surrounding context (collectively called feature words). On this basis, the thesis proposes a feature-word-based clustering method that automatically discovers relation types in an unlabeled corpus of a certain scale; in experiments, the results were comparable to manually predefined relation types. Second, for the large number of relation types to be processed, the thesis proposes a Web-mining-based method for automatically extracting relation seed sets. The method exploits the ability of search engines to handle large-scale, real-world data in order to extract a representative seed set of entity relations; in experiments on nine selected relation types, the average accuracy was 90.91%. Third, drawing on Chinese linguistics, the thesis defines a context pattern model and heuristic strategies for generalizing it, and introduces a bootstrapping approach that takes the entity relation seeds as input and iteratively mines relation description patterns and extracts relation tuples from the unlabeled corpus; in a manual evaluation of sampled relation tuples, the average accuracy was 88.24%, which meets the needs of a practical system. Finally, the thesis designs and implements XInfo, a domain-adaptive relation extraction platform on which researchers can concentrate on algorithm research and run experiments quickly, and which supports related research and applications in natural language processing. In addition, taking the extraction of social relations between people as an application task, the thesis develops an online demo system for social relations between people that presents relation extraction results in an intuitive and clear way.
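
The bootstrapping step described in the abstract alternates between mining description patterns from known relation instances and using those patterns to extract new relation tuples. The following is only a minimal sketch of that general loop, not the thesis's implementation. It assumes that candidate instances have already been reduced to hypothetical (entity 1, context, entity 2) triples, uses toy English data, and matches contexts exactly instead of applying the generalization heuristics the abstract mentions. The names `bootstrap`, `INSTANCES`, and `min_support` are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]
Instance = Tuple[str, str, str]  # (entity 1, context between entities, entity 2)


def bootstrap(instances: List[Instance], seeds: Iterable[Pair],
              rounds: int = 5, min_support: int = 1):
    """Toy bootstrapping loop (illustrative only, exact-match patterns)."""
    pairs: Set[Pair] = set(seeds)
    patterns: Set[str] = set()
    for _ in range(rounds):
        # Pattern mining: keep contexts that link already accepted pairs.
        support = Counter(ctx for e1, ctx, e2 in instances if (e1, e2) in pairs)
        new_patterns = {ctx for ctx, n in support.items() if n >= min_support}
        # Tuple extraction: accept every pair linked by a mined pattern.
        new_pairs = {(e1, e2) for e1, ctx, e2 in instances if ctx in new_patterns}
        # Stop once an iteration adds nothing new.
        if new_patterns <= patterns and new_pairs <= pairs:
            break
        patterns |= new_patterns
        pairs |= new_pairs
    return pairs, patterns


# Hypothetical pre-extracted instances for a "spouse" relation.
INSTANCES: List[Instance] = [
    ("Alice", "is married to", "Bob"),
    ("Alice", "and her husband", "Bob"),
    ("Carol", "and her husband", "Dave"),
    ("Carol", "tied the knot with", "Dave"),
    ("Erin", "tied the knot with", "Frank"),
    ("Grace", "works with", "Heidi"),
]

pairs, patterns = bootstrap(INSTANCES, seeds={("Alice", "Bob")})
print(sorted(pairs))
print(sorted(patterns))
```

Starting from the single seed pair, the loop picks up two more pairs through shared contexts and leaves the unrelated "works with" instance out. A practical system would also score patterns and tuples to limit semantic drift, which this sketch omits.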


CLC: Industrial Technology > Automation Technology, Computer Technology > Computing Technology, Computer Technology > Computer Applications > Information Processing > Text Processing