Publication detail

HTML Document Analysis for Information Extraction

BURGET, R.

Original title

HTML Document Analysis for Information Extraction

Type

conference proceedings paper not indexed in WoS or Scopus

Language

English

Original abstract

The today's World Wide Web contains a vast amount of information stored in HTML documents. However, the HTML language primarily describes the look of the documents and it doesn't contain facilities for the description of contained data structure. In this paper we propose a model of a Web site that describes logical structure of contained data. Furthermore, we propose methods for creating such a model by analyzing the look and the structure of HTML documents.

Keywords

HTML Analysis, Information Extraction

Authors

BURGET, R.

Published

25 April 2002

Publisher

Faculty of Information Technology BUT

Place

Brno

ISBN

80-214-2116-9

Book

Proceedings of 8th EEICT conference

Pages from

426

Pages to

430

Number of pages

5

BibTeX

@inproceedings{BUT10014,
  author="Radek {Burget}",
  title="HTML Document Analysis for Information Extraction",
  booktitle="Proceedings of 8th EEICT conference",
  year="2002",
  pages="426--430",
  publisher="Faculty of Information Technology BUT",
  address="Brno",
  isbn="80-214-2116-9"
}