
An Ontology-Based Method for Automatically Constructing a Semantic Web of Financial Reports

Author: LaiZhongHua
Advisor: ChenQingCai
School: Harbin Institute of Technology
Major: Computer Science and Technology
Keywords: Semantic Web; Financial Report; Automatic Semantic Annotation; Information Extraction
CLC: TP391.1
Type: Master's thesis
Year: 2008
Downloads: 70
Citations: 1


The continuing popularity of general-purpose search engines has made retrieving massive amounts of information far more convenient, but within any particular domain their retrieval capability is still unsatisfactory. Vertical search engines for individual domains have therefore flourished. In finance specifically, investors often need to read large volumes of annual report data, and general-purpose search engines offer very limited help with this. A Semantic Web information retrieval system for financial annual reports, built on a financial annual report ontology library, was developed in response. The system can return accurate query results to investors and can perform automated reasoning to push information the user may need. Its biggest bottleneck is automatically constructing the Semantic Web for a massive number of financial annual reports. The main purpose of this thesis is to combine ontologies with information extraction technology to construct the Semantic Web of financial annual reports automatically.

The main research contents are as follows: (1) For annual report text, automatic semantic annotation is performed against the financial annual report ontology library through segmentation into minimal annotation blocks, exact matching, and fuzzy matching. (2) For unmarked tables in annual reports, table structure recognition is performed, covering five aspects: sub-table segmentation, column segmentation and column-span identification, row segmentation, expansion-mode identification, and table title location. The last three aspects all use information from the financial annual report ontology library. (3) Unmarked tables that have undergone structure recognition are regularized into standard tables with clear row and column information, which are then annotated automatically and semantically on the basis of the ontology. (4) Methods for evaluating the accuracy of the automatic Semantic Web construction system for financial annual reports are studied, and the degree to which the ontology library influences the system is evaluated.

In the table structure recognition stage, sub-table segmentation, column segmentation, and row segmentation achieve very high accuracy. Column-span identification depends too heavily on text layout, and expansion-mode identification depends too heavily on the ontology, so their accuracy is somewhat lower and needs further improvement. Because the completeness of the ontology library, the accuracy of each stage of table structure recognition, and the choice of fuzzy matching algorithm all affect the performance of the whole system, the system's current accuracy is 63.1%, which leaves room for improvement. The research methods in this thesis can also be applied to automatic semantic annotation in other domains: replacing the ontology library lets the system switch quickly to another domain, which to some extent demonstrates good scalability. In addition, the algorithms studied here offer some reference value for information extraction from other semi-structured documents and unmarked tables.
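
The text annotation step described in the abstract (exact matching first, then fuzzy matching against the ontology library) could be sketched roughly as follows. This is only an illustration under stated assumptions: the ontology labels and the `fin:` concept identifiers are hypothetical, the 0.8 similarity threshold is arbitrary, and Python's `difflib.SequenceMatcher` stands in for whichever fuzzy matching algorithm the thesis actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical ontology fragment: surface label -> concept identifier.
# Neither the labels nor the "fin:" identifiers come from the thesis.
ONTOLOGY_LABELS = {
    "net profit": "fin:NetProfit",
    "total assets": "fin:TotalAssets",
    "operating revenue": "fin:OperatingRevenue",
}


def annotate(block, threshold=0.8):
    """Annotate one text block with an ontology concept.

    Returns (concept, match_kind, score), or None when no label
    is similar enough. The threshold is an assumed value.
    """
    # Normalize case and collapse runs of whitespace.
    key = " ".join(block.lower().split())

    # Step 1: exact match against the ontology labels.
    if key in ONTOLOGY_LABELS:
        return ONTOLOGY_LABELS[key], "exact", 1.0

    # Step 2: fuzzy match, keeping the most similar label.
    best_label, best_score = None, 0.0
    for label in ONTOLOGY_LABELS:
        score = SequenceMatcher(None, key, label).ratio()
        if score > best_score:
            best_label, best_score = label, score

    # Accept the fuzzy candidate only if it clears the threshold.
    if best_label is not None and best_score >= threshold:
        return ONTOLOGY_LABELS[best_label], "fuzzy", best_score
    return None
```

With this sketch, `annotate("Net  Profit")` is resolved by the exact-match step, a misspelled block such as `annotate("total asets")` falls through to the fuzzy step and is mapped to `fin:TotalAssets`, and an unrelated block such as `annotate("board of directors")` is left unannotated. Because the ontology is just a dictionary passed to the matcher, swapping in a different domain's labels changes the annotation target without touching the matching logic.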

CLC: Industrial Technology > Automation Technology, Computer Technology > Computing Technology, Computer Technology > Computer Applications > Information Processing > Text Processing