Abstract
Hybrid feature ranking is a feature selection method that combines the speed of the filter approach with the accuracy of the wrapper approach. It is a two-step procedure: first, a sequence of candidate feature subsets is built using an informational criterion, independently of the learning method; second, the best subset is selected by evaluating the cross-validation error rate, explicitly using the learning method. In this paper, we consider protein discrimination, a domain with few examples but numerous descriptors. We show that taking the redundancy of the descriptors into account when constructing the candidate subsets, rather than evaluating each descriptor separately in the first step as in the traditional approach, reduces the size of the optimal subset and, in certain cases, improves accuracy.
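
As an illustration only, the two-step procedure can be sketched in Python. The abstract does not fix the details, so the following are assumptions of the sketch and not the method evaluated in the paper: the informational criterion is mutual information, redundancy is penalised by subtracting the mean mutual information with the features already selected, the candidate subsets are the prefixes of the resulting ranking, and the learner is a 1-nearest-neighbour classifier evaluated by leave-one-out cross-validation. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log2(c * n / (cx[x] * cy[y])) for (x, y), c in cxy.items())

def rank_features(columns, labels, redundancy_aware=True):
    """Step 1 (filter): order the features without consulting the learner."""
    relevance = [mutual_information(col, labels) for col in columns]
    remaining, order = list(range(len(columns))), []
    while remaining:
        def score(j):
            if not redundancy_aware or not order:
                return relevance[j]  # each feature judged on its own
            # Assumed penalty: mean mutual information with selected features.
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in order
            ) / len(order)
            return relevance[j] - redundancy
        best = max(remaining, key=score)  # ties go to the lowest index
        order.append(best)
        remaining.remove(best)
    return order

def loo_error(columns, labels, subset):
    """Leave-one-out error rate of a 1-nearest-neighbour learner (Hamming distance)."""
    rows = [tuple(columns[j][i] for j in subset) for i in range(len(labels))]
    errors = 0
    for i, row in enumerate(rows):
        others = [k for k in range(len(rows)) if k != i]
        nearest = min(others, key=lambda k: sum(u != v for u, v in zip(row, rows[k])))
        errors += labels[nearest] != labels[i]
    return errors / len(rows)

def select_subset(columns, labels, order):
    """Step 2 (wrapper): keep the shortest prefix of the ranking with the lowest error."""
    best_k = min(
        range(1, len(order) + 1),
        key=lambda k: (loo_error(columns, labels, order[:k]), k),
    )
    return order[:best_k]

# Toy data (hypothetical): feature 1 is an exact copy of feature 0,
# feature 2 is individually weaker but carries complementary information.
a = [0] * 8 + [1] * 8
b = ([0] * 6 + [1] * 2) * 2
labels = [2 * ai + bi for ai, bi in zip(a, b)]
columns = [a, list(a), b]

naive = select_subset(columns, labels, rank_features(columns, labels, redundancy_aware=False))
aware = select_subset(columns, labels, rank_features(columns, labels, redundancy_aware=True))
print(naive, aware)  # [0, 1, 2] [0, 2]
```

On this toy data, ranking each feature separately places the duplicated feature ahead of the complementary one, so all three features are needed to reach zero leave-one-out error, whereas the redundancy-aware ranking reaches the same error with two features.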