Efficient k-anonymous microaggregation of multivariate numerical data via principal component analysis

Cite as:
hdl:2117/166168
Document type: Article
Defense date: 2019-07-09
Rights access: Open Access
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Project: CIPSEC - Enhancing Critical Infrastructure Protection with innovative SECurity framework (EC-H2020-700378)
MICROAGREGACION ANONIMA EN ENCUESTAS DEMOGRAFICAS A GRAN ESCALA (MINECO-TIN2014-58259-JIN)
MONITORIZACION DE INCIDENTES EN COMUNIDADES INTELIGENTES (MINECO-TEC2014-54335-C4-1-R)
Abstract
k-Anonymous microaggregation is a widespread technique to address the problem of protecting the privacy of the respondents involved beyond the mere suppression of their identifiers, in applications where preserving the utility of the information disclosed is critical. Unfortunately, microaggregation methods with high data utility may impose stringent computational demands when dealing with datasets containing a large number of records and attributes.
This work proposes and analyzes various anonymization methods which draw upon the algebraic-statistical technique of principal component analysis (PCA), in order to effectively reduce the number of attributes processed, that is, the dimension of the multivariate microaggregation problem at hand. By preserving to a high degree the energy of the numerical dataset and carefully choosing the number of dominant components to process, we manage to achieve remarkable reductions in running time and memory usage with negligible impact on information utility. Our methods are readily applicable to high-utility statistical disclosure control (SDC) of large-scale datasets with numerical demographic attributes.
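The general idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes NumPy, and the function names `pca_project` and `microaggregate`, the default energy threshold of 0.95, and the grouping rule are all hypothetical choices made for illustration. In particular, the grouping step (sort records by their first principal component and cut them into consecutive blocks of at least k) is a deliberately simple stand-in for whatever microaggregation algorithm is actually applied to the reduced data. Computing the group centroids in the original attribute space is likewise an assumption.

```python
import numpy as np


def pca_project(X, energy=0.95):
    """Project centered data onto the fewest principal components
    whose cumulative variance ("energy") reaches the given fraction."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions; squared singular values
    # are proportional to the variance captured by each direction.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    m = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    return Xc @ Vt[:m].T, m


def microaggregate(X, k, energy=0.95):
    """Toy k-anonymous microaggregation on PCA-reduced data.

    Groups are formed in the reduced space (stand-in heuristic: sort by
    the first principal component, cut into blocks of k records), then
    every record is replaced by its group centroid in the original space.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n < k:
        raise ValueError("need at least k records")
    Z, m = pca_project(X, energy)
    order = np.argsort(Z[:, 0], kind="stable")
    n_groups = n // k
    labels = np.empty(n, dtype=int)
    # The last group absorbs the remainder, so every group has >= k records.
    labels[order] = np.minimum(np.arange(n) // k, n_groups - 1)
    X_anon = np.empty_like(X)
    for g in range(n_groups):
        members = labels == g
        X_anon[members] = X[members].mean(axis=0)
    return X_anon, labels, m
```

In this sketch only the grouping step works on the reduced data, which is where a lower dimension would save time and memory; the released values are still averages of the original attributes. Data with zero variance in every attribute is not handled.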
Description
© 2019. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Citation: Rebollo-Monedero, D. [et al.]. Efficient k-anonymous microaggregation of multivariate numerical data via principal component analysis. "Information sciences", 9 July 2019, vol. 503, p. 417-443.
ISSN: 0020-0255
Publisher version: https://www.sciencedirect.com/science/article/pii/S0020025519306474
Files | Description | Size | Format | View |
---|---|---|---|---|
INS-D-18-1455R1-38-65.pdf | | 1,792Mb | PDF | View/Open |