Preserving empirical data utility in k-anonymous microaggregation via linear discriminant analysis
DOI: 10.1016/j.engappai.2020.103787
Cite as: hdl:2117/330076
Document type: Article
Publication date: 2020-09-01
Publisher: Elsevier
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication or transformation is prohibited without the authorization of the rights holder.
Abstract
Today’s countless benefits of exploiting data come with a hefty price in terms of privacy. k-Anonymous microaggregation is a powerful technique devoted to revealing useful demographic information of microgroups of people, whilst protecting the privacy of individuals therein. Evidently, the inherent distortion of data results in the degradation of its utility. This work proposes and analyzes an anonymization method that draws upon the technique of linear discriminant analysis (LDA), with the aim of preserving the empirical utility of data. Further, this utility is measured as the accuracy of a machine learning model trained on the microaggregated data. By transforming the original data records to a different data space, LDA enables k-anonymous microaggregation to build microcells more tailored to an intrinsic classification threshold. To do this, first, data is rotated (projected) towards the direction of maximum discrimination and, second, scaled in this direction by a factor that penalizes distortion across the classification threshold. The upshot is that thinner cells are built along the threshold, which ends up preserving data utility in terms of the accuracy of machine-learned models for a number of standardized data sets.
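The rotate-then-scale idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes 2-D records with binary labels, uses the classical two-class Fisher discriminant direction, and substitutes a simple greedy fixed-size grouping for a production microaggregation algorithm. All function names, the scaling factor `alpha`, and the toy data are hypothetical and introduced only for illustration.

```python
import math
import random

def fisher_direction(X, y):
    """Unit vector along S_w^{-1} (m1 - m0) for 2-D points with labels 0/1."""
    groups = {c: [p for p, lab in zip(X, y) if lab == c] for c in (0, 1)}
    means = {c: [sum(p[j] for p in g) / len(g) for j in (0, 1)]
             for c, g in groups.items()}
    a = b = d = 0.0  # within-class scatter matrix [[a, b], [b, d]]
    for c, g in groups.items():
        for p in g:
            u, v = p[0] - means[c][0], p[1] - means[c][1]
            a, b, d = a + u * u, b + u * v, d + v * v
    det = a * d - b * b
    dx, dy = means[1][0] - means[0][0], means[1][1] - means[0][1]
    w = [(d * dx - b * dy) / det, (a * dy - b * dx) / det]  # 2x2 inverse applied
    norm = math.hypot(w[0], w[1])
    return [w[0] / norm, w[1] / norm]

def rotate_and_scale(X, w, alpha):
    """Coordinates in the basis (w, w_perp); the w coordinate is stretched by alpha."""
    return [[alpha * (p[0] * w[0] + p[1] * w[1]), p[1] * w[0] - p[0] * w[1]]
            for p in X]

def _dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def _centroid(P):
    return [sum(p[0] for p in P) / len(P), sum(p[1] for p in P) / len(P)]

def greedy_cells(Z, k):
    """Simplified grouping (an assumption, not a standard algorithm): take the
    point farthest from the centroid with its k-1 nearest neighbours, repeat."""
    if len(Z) < k:
        raise ValueError("need at least k records")
    remaining, cells = list(range(len(Z))), []
    while len(remaining) >= 2 * k:
        c = _centroid([Z[i] for i in remaining])
        far = max(remaining, key=lambda i: _dist2(Z[i], c))
        cell = sorted(remaining, key=lambda i: _dist2(Z[i], Z[far]))[:k]
        cells.append(cell)
        taken = set(cell)
        remaining = [i for i in remaining if i not in taken]
    cells.append(remaining)  # last cell holds between k and 2k-1 records
    return cells

def anonymize(X, y, k, alpha):
    """Group in the transformed space, release cell centroids in the original space."""
    w = fisher_direction(X, y)
    cells = greedy_cells(rotate_and_scale(X, w, alpha), k)
    released = [None] * len(X)
    for cell in cells:
        c = _centroid([X[i] for i in cell])
        for i in cell:
            released[i] = c
    return released, cells, w

def mean_width_along(cells, X, w):
    """Average extent of the cells measured along the direction w."""
    widths = []
    for cell in cells:
        proj = [X[i][0] * w[0] + X[i][1] * w[1] for i in cell]
        widths.append(max(proj) - min(proj))
    return sum(widths) / len(widths)

def make_toy_data(n_per_class, seed=0):
    """Hypothetical toy data: classes separated along the first coordinate."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(0.0, 3.0)])
            y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data(100)
    for alpha in (1.0, 100.0):
        _, cells, w = anonymize(X, y, k=5, alpha=alpha)
        print(alpha, round(mean_width_along(cells, X, w), 3))
```

With `alpha = 1` the transformation is a pure rotation and the grouping behaves like ordinary distance-based microaggregation. A larger `alpha` makes distances along the discriminant direction more costly, so cells tend to become thinner across the class boundary, which is the effect the abstract describes. The appropriate factor for real data is not specified here and would need to be tuned.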
Description
© 2020. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Citation: Rodríguez-Hoyos, A. [et al.]. Preserving empirical data utility in k-anonymous microaggregation via linear discriminant analysis. "Engineering applications of artificial intelligence", 1 September 2020, vol. 94, p. 103787:1-103787:13.
ISSN: 0952-1976
Publisher's version: https://www.sciencedirect.com/science/article/abs/pii/S0952197620301792
Files | Description | Size | Format | View |
---|---|---|---|---|
Rodriguez - LDA 202006.pdf | | 1,858Mb | | View/Open |