Publication detail
DROTÁR, P. GAZDA, J. SMÉKAL, Z.
Original title
An experimental comparison of feature selection methods on two-class biomedical datasets
Type
journal article in Web of Science, Jimp
Language
English
Original abstract
Feature selection is a significant part of many machine learning applications dealing with small-sample and high-dimensional data. Choosing the most important features is an essential step for knowledge discovery in many areas of biomedical informatics. The increased popularity of feature selection methods and their frequent utilisation raise challenging new questions about the interpretability and stability of feature selection techniques. In this study, we compared the behaviour of ten state-of-the-art filter methods for feature selection in terms of their stability, similarity, and influence on prediction performance. All of the experiments were conducted on eight two-class datasets from biomedical areas. While entropy-based feature selection appears to be the most stable, the feature selection techniques yielding the highest prediction performance are the minimum redundancy maximum relevance method and feature selection based on the Bhattacharyya distance. In general, univariate feature selection techniques perform similarly to or even better than more complex multivariate feature selection techniques on high-dimensional datasets. However, on more complex and smaller datasets, multivariate methods slightly outperform univariate techniques.
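The abstract does not say how stability was measured, so the following is only an illustrative sketch of the general idea: run a univariate filter on random subsamples of a two-class dataset and check how much the selected feature subsets overlap. The scoring function, the mean pairwise Jaccard index, the subsampling scheme, and all function names and the toy data are assumptions made for this example, not the authors' method or datasets.

```python
import random
import statistics
from itertools import combinations


def score_feature(values, labels):
    """Absolute difference of class means, scaled by the summed class spreads."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    spread = statistics.pstdev(a) + statistics.pstdev(b)
    return abs(statistics.fmean(a) - statistics.fmean(b)) / (spread + 1e-12)


def select_top_k(X, y, k):
    """Univariate filter: score each column independently, keep the k best."""
    n_features = len(X[0])
    scores = [score_feature([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return frozenset(ranked[:k])


def jaccard(a, b):
    """Overlap of two feature subsets: |intersection| / |union|."""
    return len(a & b) / len(a | b)


def stability(X, y, k, n_runs=20, fraction=0.8, seed=0):
    """Mean pairwise Jaccard similarity of subsets selected on random subsamples.

    Assumes every subsample still contains samples from both classes.
    """
    rng = random.Random(seed)
    n = len(X)
    subsets = []
    for _ in range(n_runs):
        idx = rng.sample(range(n), int(fraction * n))
        subsets.append(select_top_k([X[i] for i in idx], [y[i] for i in idx], k))
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def make_toy_data(n_samples=40, n_features=200, n_informative=5, seed=1):
    """Hypothetical small-sample, high-dimensional data with balanced classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 3.0 * label  # shift the informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(f"stability (k=5): {stability(X, y, k=5):.3f}")
```

A value near 1 means the filter picks almost the same features regardless of which samples it sees; a value near 0 means the selection depends heavily on the particular subsample.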
Keywords
Feature selection, Stability, Classification performance, Univariate FS, Multivariate FS
Authors
DROTÁR, P.; GAZDA, J.; SMÉKAL, Z.
RIV year
2015
Published
1 November 2015
ISSN
0010-4825
Journal
COMPUTERS IN BIOLOGY AND MEDICINE
Volume
66
Issue
1
Country
United States of America
Pages from
1
Pages to
10
Page count
10
BibTeX
@article{BUT118697,
  author="Peter {Drotár} and Juraj {Gazda} and Zdeněk {Smékal}",
  title="An experimental comparison of feature selection methods on two-class biomedical datasets",
  journal="COMPUTERS IN BIOLOGY AND MEDICINE",
  year="2015",
  volume="66",
  number="1",
  pages="1--10",
  doi="10.1016/j.compbiomed.2015.08.010",
  issn="0010-4825"
}