Publication detail
BURGET, R.; BURGETOVÁ, I.
Original Title
Automatic annotation of online articles based on visual feature classification
Type
journal article in Scopus
Language
English
Original Abstract
When applying the traditional data mining methods to World Wide Web documents, the typical problem is that a normal web page contains a variety of information of different kinds in addition to its main content. This additional information such as navigation, advertisement or copyright notices negatively influences the results of the data mining methods as for example the content classification. In this paper, we present a method of interesting area detection in a web page. This method is inspired by an assumed human reader approach to this task. First, basic visual blocks are detected in the page and subsequently, the purpose of these blocks is guessed based on their visual appearance. We describe a page segmentation method used for the visual block detection, we propose a way of the block classification based on the visual features and finally, we provide an experimental evaluation of the method on real-world data.
Keywords
automatic annotation, online articles, page segmentation, document preprocessing, visual features, visual analysis, data mining, classification
Authors
BURGET, R.; BURGETOVÁ, I.
RIV year
2011
Released
1 July 2011
ISSN
1751-5858
Periodical
International Journal of Intelligent Information and Database Systems
Volume
5
Number
4
Country
Swiss Confederation
Pages from
338
Pages to
360
Page count
23
URL
https://www.fit.vut.cz/research/publication/9692/
BibTeX
@article{BUT76405,
  author="Radek {Burget} and Ivana {Burgetová}",
  title="Automatic annotation of online articles based on visual feature classification",
  journal="International Journal of Intelligent Information and Database Systems",
  year="2011",
  volume="5",
  number="4",
  pages="338--360",
  doi="10.1504/IJIIDS.2011.041322",
  issn="1751-5858",
  url="https://www.fit.vut.cz/research/publication/9692/"
}