dc.contributor.author: Rodríguez Hoyos, Ana
dc.contributor.author: Estrada Jiménez, José
dc.contributor.author: Rebollo Monedero, David
dc.contributor.author: Parra Arnau, Javier
dc.contributor.author: Forné Muñoz, Jorge
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Enginyeria Telemàtica
dc.date.accessioned: 2018-10-02T09:25:28Z
dc.date.available: 2018-10-02T09:25:28Z
dc.date.issued: 2018-05-16
dc.identifier.citation: Rodríguez-Hoyos, A., Estrada-Jimenez, J., Rebollo-Monedero, D., Parra-Arnau, J., Forne, J. Does k-anonymous microaggregation affect machine-learned macrotrends? "IEEE Access", 16 May 2018, vol. 6, p. 28258-28277.
dc.identifier.issn: 2169-3536
dc.identifier.uri: http://hdl.handle.net/2117/121730
dc.description.abstract: In the era of big data, the availability of massive amounts of information makes privacy protection more necessary than ever. Among a variety of anonymization mechanisms, microaggregation is a common approach to satisfy the popular requirement of k-anonymity in statistical databases. In essence, k-anonymous microaggregation aggregates quasi-identifiers to hide the identity of each data subject within a group of k - 1 other subjects. As with any perturbative mechanism, however, anonymization comes at the cost of some information loss that may hinder the ultimate purpose of the released data, which very often is building machine-learning models for macrotrend analysis. To assess the impact of microaggregation on the utility of the anonymized data, it is necessary to evaluate the resulting accuracy of said models. In this paper, we address the problem of measuring the effect of k-anonymous microaggregation on the empirical utility of microdata. Accordingly, we quantify utility as the accuracy of classification models learned from microaggregated data and evaluated on original test data. Our experiments indicate, with some consistency, that the impact of the de facto microaggregation standard (maximum distance to average vector) on the performance of machine-learning algorithms is often minor to negligible for a wide range of k, across a variety of classification algorithms and data sets. Furthermore, experimental evidence suggests that the traditional measure of distortion in the microdata anonymization community may be inappropriate for evaluating the utility of microaggregated data.
dc.format.extent: 20 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: UPC subject areas::Telecommunications engineering
dc.subject.lcsh: Machine learning
dc.subject.other: data privacy
dc.subject.other: machine learning
dc.subject.other: privacy
dc.subject.other: data models
dc.subject.other: databases
dc.subject.other: standards
dc.subject.other: task analysis
dc.title: Does k-anonymous microaggregation affect machine-learned macrotrends?
dc.type: Article
dc.subject.lemac: Computer science
dc.contributor.group: Universitat Politècnica de Catalunya. SISCOM - Smart Services for Information Systems and Communication Networks
dc.identifier.doi: 10.1109/ACCESS.2018.2834858
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/8360116/
dc.rights.access: Open Access
drac.iddocument: 22959286
dc.description.version: Postprint (published version)
upcommons.citation.author: Rodríguez-Hoyos, A.; Estrada-Jimenez, J.; Rebollo-Monedero, D.; Parra-Arnau, J.; Forne, J.
upcommons.citation.published: true
upcommons.citation.publicationName: IEEE Access
upcommons.citation.volume: 6
upcommons.citation.startingPage: 28258
upcommons.citation.endingPage: 28277
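
The abstract describes k-anonymous microaggregation as replacing each record's quasi-identifiers with an aggregate shared by a group of at least k records. The following is a minimal sketch of that idea, not the authors' implementation: it uses a simplified greedy heuristic loosely inspired by maximum distance to average vector (one group per iteration around the record farthest from the centroid), and the names `centroid`, `microaggregate`, and the toy `records` are hypothetical, introduced only for illustration.

```python
from collections import Counter
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of numeric tuples."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def microaggregate(records, k):
    """Replace each record by the centroid of a group of at least k records.

    Simplified heuristic (an assumption, not the published algorithm):
    while at least 2k records remain, take the record farthest from the
    centroid of the remaining ones and group it with its k - 1 nearest
    neighbours; the final k to 2k - 1 records form the last group.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(range(len(records)))
    anonymized = [None] * len(records)
    while len(remaining) >= 2 * k:
        center = centroid([records[i] for i in remaining])
        farthest = max(remaining, key=lambda i: dist(records[i], center))
        # The farthest record itself has distance 0, so it is always included.
        group = sorted(
            remaining, key=lambda i: dist(records[i], records[farthest])
        )[:k]
        group_center = centroid([records[i] for i in group])
        for i in group:
            anonymized[i] = group_center
        taken = set(group)
        remaining = [i for i in remaining if i not in taken]
    last_center = centroid([records[i] for i in remaining])
    for i in remaining:
        anonymized[i] = last_center
    return anonymized


# Toy quasi-identifiers (hypothetical data): two well-separated clusters.
records = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
anonymized = microaggregate(records, k=3)
group_sizes = Counter(anonymized)
```

In this sketch every distinct anonymized value is shared by at least k records, which is the k-anonymity property over the quasi-identifiers. The utility evaluation described in the abstract would then train a classifier on such anonymized records and measure its accuracy on original, unperturbed test data.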

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.