A study of feature selection algorithms for accuracy estimation
Tutor / director / evaluator: Belanche Muñoz, Luis Antonio
Document type: Master thesis
Rights access: Open Access
The main purpose of Feature Subset Selection is to find a reduced subset of attributes from a data set described by a feature set. The task of a feature selection algorithm (FSA) is to provide a computational solution motivated by a certain definition of relevance or by a reliable evaluation measure. Feature weighting is a technique used to approximate the optimal degree of influence of individual features using a training set. When successfully applied, relevant features are assigned a high weight value, whereas irrelevant features are given a weight value close to zero. Feature weighting can be used not only to improve classification accuracy but also to discard features with weights below a certain threshold value and thereby increase the resource efficiency of the classifier. In this work, several fundamental feature weighting algorithms (FWAs) are studied to assess their performance in a controlled experimental scenario. A measure to evaluate the score of an FWA is devised that computes the degree of matching between the output given by an FWA and the known optimal solution. A study is carried out of the relation between the scores obtained with the different classifiers, of the variance of the score across different sample sizes, and of the relation between the score and the estimated probability of error of the model (Pe) for the classification problems and the square error (e2) for the regression problem.
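The two ideas in the abstract, discarding features whose weight falls below a threshold and scoring a weight vector against a known optimal solution, can be illustrated with a minimal Python sketch. The abstract does not define the thesis's score, so `matching_score` below is a hypothetical stand-in (the share of total weight placed on the truly relevant features), and the function names, example weights, and threshold are likewise assumptions made for illustration.

```python
def select_by_threshold(weights, threshold):
    """Return the indices of features whose weight is at least `threshold`.

    Features with weights below the threshold are discarded.
    """
    return [i for i, w in enumerate(weights) if w >= threshold]


def matching_score(weights, relevant):
    """Hypothetical matching score in [0, 1] (not the thesis's measure).

    Computes the fraction of the total absolute weight that falls on the
    features known to be relevant. A score of 1.0 means all weight is on
    relevant features; 0.0 means none is (or all weights are zero).
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        return 0.0
    return sum(abs(weights[i]) for i in relevant) / total


# Illustrative weights for five features; features 0 and 2 are assumed
# to be the known relevant ones in a controlled (synthetic) scenario.
weights = [0.9, 0.05, 0.7, 0.0, 0.1]
relevant = {0, 2}

selected = select_by_threshold(weights, threshold=0.5)
score = matching_score(weights, relevant)
```

In a controlled scenario like the one the abstract describes, the relevant set is known by construction, so a score of this kind can be computed for each algorithm and sample size and then compared with the error of a model trained on the selected features.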