Publication detail
HLOSTA, M. STRÍŽ, R. KUPČÍK, J. ZENDULKA, J. HRUŠKA, T.
Original Title
Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm
Type
journal article - other
Language
English
Original Abstract
Imbalance in data classification is a frequently discussed problem that is not handled well by classical classification techniques. The problem we tackled was to learn a binary classification model from large data with an accuracy constraint for the minority class. We propose a new meta-learning method that creates initial models using cost-sensitive learning by logistic regression and uses these models as initial chromosomes for a genetic algorithm. The method has been successfully tested on a large real-world data set from our internet security research. Experiments prove that our method always leads to better results than using logistic regression or a genetic algorithm alone. Moreover, this method produces an easily understandable classification model.
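The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data, the cost values, the fitness definition (majority-class accuracy subject to a minimum minority-class recall), the GA operators, and all function names (`fit_weighted_logreg`, `fitness`, `evolve`, `make_data`) are assumptions made only for illustration.

```python
import math
import random

def sigmoid(z):
    # Clamp the argument so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    # w[0] is the bias; the remaining entries are feature weights.
    return 1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else 0

def fit_weighted_logreg(X, y, pos_cost, epochs=200, lr=0.1):
    # Cost-sensitive logistic regression via batch gradient descent:
    # errors on the minority (positive) class are scaled by pos_cost.
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(X, y):
            p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
            err = (pos_cost if t == 1 else 1.0) * (p - t)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

def fitness(w, X, y, min_recall):
    # Assumed objective: maximize majority-class accuracy, but only among
    # models whose minority-class recall meets the constraint.
    tp = fn = tn = fp = 0
    for x, t in zip(X, y):
        p = predict(w, x)
        if t == 1:
            tp += p
            fn += 1 - p
        else:
            fp += p
            tn += 1 - p
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Infeasible models get a negative score, so they rank below feasible ones.
    return specificity if recall >= min_recall else recall - 1.0

def evolve(seeds, X, y, min_recall, rng, pop_size=20, generations=30, sigma=0.3):
    score = lambda w: fitness(w, X, y, min_recall)
    # The logistic regression models are the initial chromosomes;
    # the rest of the population is filled with mutated copies.
    pop = [list(s) for s in seeds]
    while len(pop) < pop_size:
        pop.append([wi + rng.gauss(0, sigma) for wi in rng.choice(seeds)])
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]
        children = [parents[0]]  # elitism: the best model always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by Gaussian mutation.
            children.append([rng.choice(pair) + rng.gauss(0, sigma)
                             for pair in zip(a, b)])
        pop = children
    return max(pop, key=score)

def make_data(rng, n_neg=300, n_pos=30):
    # Synthetic 10:1 imbalanced data, purely for illustration.
    X, y = [], []
    for _ in range(n_neg):
        X.append([rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(0)
    for _ in range(n_pos):
        X.append([rng.gauss(1.5, 1), rng.gauss(1.5, 1)])
        y.append(1)
    return X, y

rng = random.Random(0)
X, y = make_data(rng)
seeds = [fit_weighted_logreg(X, y, c) for c in (1.0, 5.0, 20.0)]
best = evolve(seeds, X, y, min_recall=0.9, rng=rng)
```

Because the seed models are part of the initial population and the best individual is carried over unchanged in every generation, the returned model in this sketch can never score worse than the best logistic regression seed under the assumed fitness. The result is still a single linear weight vector, which is one way to read the abstract's remark about an easily understandable model.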
Keywords
Imbalanced data, classification, genetic algorithm, logistic regression
Authors
HLOSTA, M.; STRÍŽ, R.; KUPČÍK, J.; ZENDULKA, J.; HRUŠKA, T.
RIV year
2013
Released
18. 5. 2013
ISSN
2010-3700
Periodical
International Journal of Machine Learning and Computing
Volume
Number
3
State
Republic of Singapore
Pages from
214
Pages to
218
Pages count
5
URL
http://www.ijmlc.org/index.php?m=content&c=index&a=show&catid=36&id=304
BibTeX
@article{BUT103468,
  author="Martin {Hlosta} and Rostislav {Stríž} and Jan {Kupčík} and Jaroslav {Zendulka} and Tomáš {Hruška}",
  title="Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm",
  journal="International Journal of Machine Learning and Computing",
  year="2013",
  volume="2013",
  number="3",
  pages="214--218",
  issn="2010-3700",
  url="http://www.ijmlc.org/index.php?m=content&c=index&a=show&catid=36&id=304"
}