dc.contributor.author | Rebollo Monedero, David |
dc.contributor.author | Forné Muñoz, Jorge |
dc.contributor.author | Soriano Ibáñez, Miguel |
dc.contributor.author | Hernández Baigorri, César |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Enginyeria Telemàtica |
dc.date.accessioned | 2018-10-31T19:32:27Z |
dc.date.available | 2018-10-31T19:32:27Z |
dc.date.issued | 2018-10-15 |
dc.identifier.citation | Rebollo-Monedero, D., Hernández-Baigorri, C., Forne, J., Soriano, M. Incremental k-Anonymous microaggregation in large-scale electronic surveys with optimized scheduling. "IEEE access", 15 October 2018, vol. 6, p. 60016-60044. |
dc.identifier.issn | 2169-3536 |
dc.identifier.uri | http://hdl.handle.net/2117/123435 |
dc.description.abstract | Improvements in technology have led to enormous volumes of detailed personal information made available for any number of statistical studies. This has stimulated the need for anonymization techniques striving to attain a difficult compromise between the usefulness of the data and the protection of our privacy. k-Anonymous microaggregation permits releasing a dataset where each person remains indistinguishable from other k–1 individuals, through the aggregation of demographic attributes, otherwise a potential culprit for respondent reidentification. Although privacy guarantees are by no means absolute, the elegant simplicity of the k-anonymity criterion and the excellent preservation of information utility of microaggregation algorithms have turned them into widely popular approaches whenever data utility is critical. Unfortunately, high-utility algorithms on large datasets inherently require extensive computation. This work addresses the need of running k-anonymous microaggregation efficiently with mild distortion loss, exploiting the fact that the data may arrive over an extended period of time. Specifically, we propose to split the original dataset into two portions that will be processed subsequently, allowing the first process to start before the entire dataset is received, while leveraging the superlinearity of the microaggregation algorithms involved. A detailed mathematical formulation enables us to calculate the optimal time for the fastest anonymization, as well as for minimum distortion under a given deadline. Two incremental microaggregation algorithms are devised, for which extensive experimentation is reported. The theoretical methodology presented should prove invaluable in numerous data-collection applications, including large-scale electronic surveys in which computation is possible as the data comes in. |
dc.language.iso | eng |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) |
dc.subject | Àrees temàtiques de la UPC::Matemàtiques i estadística::Estadística matemàtica::Anàlisi multivariant |
dc.subject | Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Telemàtica i xarxes d'ordinadors::Trànsit de dades |
dc.subject.lcsh | Multivariate analysis |
dc.subject.lcsh | Data protection |
dc.subject.other | Data privacy |
dc.subject.other | statistical disclosure control |
dc.subject.other | k-anonymity |
dc.subject.other | microaggregation |
dc.subject.other | electronic surveys |
dc.subject.other | large-scale datasets |
dc.title | Incremental k-Anonymous microaggregation in large-scale electronic surveys with optimized scheduling |
dc.type | Article |
dc.subject.lemac | Anàlisi multivariable |
dc.subject.lemac | Protecció de dades |
dc.contributor.group | Universitat Politècnica de Catalunya. SISCOM - Smart Services for Information Systems and Communication Networks |
dc.contributor.group | Universitat Politècnica de Catalunya. ISG - Grup de Seguretat de la Informació |
dc.identifier.doi | 10.1109/ACCESS.2018.2875949 |
dc.description.peerreviewed | Peer Reviewed |
dc.relation.publisherversion | https://ieeexplore.ieee.org/document/8491270 |
dc.rights.access | Open Access |
local.identifier.drac | 23450836 |
dc.description.version | Postprint (published version) |
dc.relation.projectid | info:eu-repo/grantAgreement/MINECO//TEC2014-54335-C4-1-R/ES/MONITORIZACION DE INCIDENTES EN COMUNIDADES INTELIGENTES/ |
dc.relation.projectid | info:eu-repo/grantAgreement/MINECO//TEC2015-68734-R/ES/ANALISIS FORENSE AVANZADO/ |
dc.relation.projectid | info:eu-repo/grantAgreement/EC/H2020/700378/EU/Enhancing Critical Infrastructure Protection with innovative SECurity framework/CIPSEC |
local.citation.author | Rebollo-Monedero, D.; Hernández-Baigorri, C.; Forne, J.; Soriano, M. |
local.citation.publicationName | IEEE access |