Using genetic algorithms for attribute grouping in multivariate microaggregation
Rights accessOpen Access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder
Anonymization techniques that provide k-anonymity suffer from loss of quality when data dimensionality is high. Microaggregation techniques are not an exception. Given a set of records, attributes are grouped into non-intersecting subsets and microaggregated independently. While this improves quality by reducing the loss of information, it usually leads to the loss of the k-anonymity property, increasing entity disclosure risk. In spite of this, grouping attributes is still a common practice for data sets containing a large number of records. Depending on the attributes chosen and their correlation, the amount of information loss and disclosure risk vary. However, there have not been serious attempts to propose a way to find the best way of grouping attributes. In this paper, we present GOMM, the Genetic Optimizer for Multivariate Microaggregation which, as far as we know, represents the first proposal using evolutionary algorithms for this problem. The goal of GOMM is finding the optimal, or near-optimal, attribute grouping taking into account both information loss and disclosure risk. We propose a way to map attribute subsets into a chromosome and a set of new mutation operations for this context. Also, we provide a comprehensive analysis of the operations proposed and we show that, after using our evolutionary approach for different real data sets, we obtain better quality in the anonymized data comparing it to previously used ad-hoc attribute grouping techniques. Additionally, we provide an improved version of GOMM called D-GOMM where operations are dynamically executed during the optimization process to reduce the GOMM execution time.
CitationBalasch, J.; Muntés, V.; Nin, J. Using genetic algorithms for attribute grouping in multivariate microaggregation. "Intelligent data analysis", 01 Gener 2014, vol. 18, núm. 5, p. 819-836.