Title
A novel ensemble method for high-dimensional genomic data classification
Date Issued
21 January 2019
Access level
metadata only access
Resource Type
conference paper
Author(s)
Publisher(s)
Institute of Electrical and Electronics Engineers Inc.
Abstract
Classifier ensembles have been shown to be an attractive approach for dealing with the curse of dimensionality in genomic data. The common idea behind this approach is to combine diverse and accurate base predictors to obtain a classification system that outperforms its individual members. Many methods pursue this by introducing perturbations into some aspect of the learning process (examples, features, base learners, etc.). However, many existing methods do so in a completely random way, without control over the perturbation process, which can generate unhelpful base predictors that degrade final performance or make a pruning strategy necessary. In this paper we introduce tEnsemble, a new and simple approach that seeks an adequate balance between diversity and accuracy. It does so by using a previously optimized template feature set to guide the perturbation of the feature space in a controlled manner. Experiments on 39 public gene expression data sets showed that this methodology has the potential to produce effective classifier ensemble systems, frequently outperforming Random Forest, a well-established methodology in the area.
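The abstract does not specify how the template feature set steers the perturbation, so the following is only a minimal sketch of one possible reading, not the authors' tEnsemble algorithm. It assumes that each ensemble member receives a copy of the template in which a fixed fraction of features is swapped for features outside the template, and that member predictions are combined by majority vote. The names `perturb_template`, `build_subsets`, `majority_vote`, and the `swap_fraction` parameter are hypothetical.

```python
import random
from collections import Counter


def perturb_template(template, n_features, swap_fraction, rng):
    """Return a feature subset that differs from `template` in a fixed
    fraction of positions (hypothetical rule, not taken from the paper)."""
    template = list(template)
    in_template = set(template)
    outside = [f for f in range(n_features) if f not in in_template]
    # Number of template features to replace, capped by what is available.
    n_swap = min(round(swap_fraction * len(template)), len(outside))
    dropped = set(rng.sample(template, n_swap))
    added = rng.sample(outside, n_swap)
    kept = [f for f in template if f not in dropped]
    return sorted(kept + added)


def build_subsets(template, n_features, n_members, swap_fraction=0.25, seed=0):
    """Create one controlled perturbation of the template per ensemble member."""
    rng = random.Random(seed)
    return [
        perturb_template(template, n_features, swap_fraction, rng)
        for _ in range(n_members)
    ]


def majority_vote(predictions):
    """Combine the class labels predicted by the base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Assumed toy setting: 20 features, template made of features 0..3.
    for subset in build_subsets([0, 1, 2, 3], n_features=20, n_members=5):
        print(subset)
```

Under these assumptions, `swap_fraction` acts as the control knob: values near 0 keep every member close to the optimized template (favoring accuracy), while larger values push the members apart (favoring diversity). Each subset would then be used to train one base classifier on the corresponding columns of the expression matrix.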
Start page
2229
End page
2236
Language
English
OECD Knowledge area
Human genetics
Genetics, Heredity
Subjects
Scopus EID
2-s2.0-85062519039
ISBN
9781538654880
Resource of which it is part
Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
ISBN of the container
978-153865488-0
Conference
IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
Sponsor(s)
INNOVATE PERU
Sources of information:
Directorio de Producción Científica
Scopus