Feature selection (FS) has long been studied in classification and regression problems, following diverse approaches and resulting in a wide variety of methods, usually grouped as either filters or wrappers. In comparison, FS for unsupervised learning has received far less attention. For many real problems concerning unsupervised multivariate data clustering, FS becomes an issue of paramount importance, as results have to meet interpretability and actionability requirements. An FS method for Gaussian mixture models was recently defined in Law et al. (2004). Mixture models are well established as clustering methods, but their multivariate data visualization capabilities are limited. The Generative Topographic Mapping (Bishop et al. 1998a), a constrained mixture of distributions, was originally defined to overcome this limitation. In this brief report we provide the theoretical development of a feature relevance determination method for Generative Topographic Mapping, based on that defined in Law et al. (2004); with this method, the clustering results can be visualized on a low-dimensional latent space and interpreted in terms of a reduced subset of selected relevant features.
[This document has been revised (8/11/2006)]
Citation: Vellido, A. "Preliminary theoretical results on a feature relevance determination method for Generative Topographic Mapping". 2005.