The effect of noise and sample size in the performance of an unsupervised feature relevance determination method for manifold learning
Tutor / director / evaluator: Vellido Alcacena, Alfredo
Document type: Master thesis
Rights access: Open Access
Research on unsupervised feature selection is scarce compared with that on supervised models, even though it is an important issue in many clustering problems. An unsupervised feature selection method for general Finite Mixture Models was recently proposed and subsequently extended to Generative Topographic Mapping (GTM), a constrained mixture model for manifold learning that provides data clustering and visualization. Some results of previous research on this unsupervised feature selection method for GTM suggested that its performance may be affected by insufficient sample size and by noisy data. In this thesis, we test these limitations in detail and outline some techniques that could at least partially counter the negative effect of uninformative noise. In particular, we provide a detailed account of a variational Bayesian formulation of feature relevance determination for GTM.
Subjects: Data mining, Pattern recognition systems, Mineria de dades, Reconeixement de formes (Informàtica)
Provenance: This document originally contains other material and/or software not included on this website