Publication detail
BURGET, R.
Original Title
HTML Document Analysis for Information Extraction
Type
article in a collection not indexed in WoS or Scopus
Language
English
Original Abstract
Today's World Wide Web contains a vast amount of information stored in HTML documents. However, HTML primarily describes the appearance of documents and provides no facilities for describing the structure of the data they contain. In this paper we propose a model of a Web site that describes the logical structure of the data it contains. Furthermore, we propose methods for creating such a model by analyzing the appearance and structure of HTML documents.
Keywords
HTML Analysis, Information Extraction
Authors
Released
25. 4. 2002
Publisher
Faculty of Information Technology BUT
Location
Brno
ISBN
80-214-2116-9
Book
Proceedings of 8th EEICT conference
Pages from
426
Pages to
430
Pages count
5
BibTeX
@inproceedings{BUT10014,
  author="Radek {Burget}",
  title="HTML Document Analysis for Information Extraction",
  booktitle="Proceedings of 8th EEICT conference",
  year="2002",
  pages="426--430",
  publisher="Faculty of Information Technology BUT",
  address="Brno",
  isbn="80-214-2116-9"
}