Cartogram representations of self-organizing virtual geographies
Tutor / director / evaluator: Vellido Alcacena, Alfredo
Document type: Master thesis
Rights access: Open Access
Model interpretability is a problem for multivariate data in general and, very specifically, for dimensionality reduction techniques as applied to data visualization. The problem is even greater for nonlinear dimensionality reduction (NLDR) methods, for which interpretability limitations are inherent. Data visualization is a key process for knowledge extraction from data that helps us gain insight into the structure of the observed data through graphical representations and metaphors. NLDR techniques provide flexible visual insight, but the locally varying representation distortion they generate makes interpretation far from intuitive. For some NLDR models, indirect quantitative measures of this mapping distortion can be calculated explicitly and used as part of an interpretative post-processing of the results. In this Master Thesis, we apply a cartogram method, inspired by techniques of geographic representation, to data visualization using NLDR models. In particular, we show how this method allows the distortion measured in the visual maps of several self-organizing clustering methods to be explicitly reintroduced into those maps. The main capabilities and limitations of the cartogram visualization of multivariate data using standard and hierarchical self-organizing models are investigated in some detail with artificial data as well as with real data from a neuro-oncology problem that involves the discrimination of human brain tumor types, a problem for which knowledge discovery techniques in general, and data visualization in particular, should be useful tools.