k-anonymous microaggregation with preservation of statistical dependence
Rights access: Open Access
k-Anonymous microaggregation is an essential building block in statistical disclosure control, a field concerned with postprocessing the demographic portion of surveys containing sensitive information in order to safeguard the anonymity of the respondents. Traditionally, this form of microaggregation has been formulated to characterize both the privacy attained and the inherent information loss due to the aggregation of quasi-identifiers, which may otherwise be exploited to reidentify the individual to whom a record in a published database refers. Because the ultimate purpose of such databases involves the analysis of the statistical dependence between demographic attributes and sensitive data, we must articulate mechanisms that preserve the statistical dependence between quasi-identifiers and confidential attributes, rather than merely limiting the degradation of the quasi-identifiers alone. This work addresses the problem of k-anonymous microaggregation with preservation of statistical dependence in a formal, systematic manner, modeling statistical dependence as predictability of the confidential attributes from the perturbed quasi-identifiers. We proceed by introducing a second mean squared error term in a combined Lagrangian cost that enables us to regulate the trade-off between quasi-identifier distortion and confidential-attribute predictability. A Lagrangian multiplier enables us to gracefully weigh the importance of each of the two competing objectives.
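The combined cost described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only evaluates a cost of the assumed form D_X + lam * D_Y for a partition that is already given, where D_X is the mean squared error between quasi-identifiers and their cell centroids, and D_Y is the mean squared error of predicting the confidential attribute by its cell mean. The function name lagrangian_cost, the parameter names, and the choice of the cell mean as predictor are assumptions made for illustration.

```python
def _mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def lagrangian_cost(x, y, cells, lam, k):
    """Evaluate D_X + lam * D_Y for a given k-anonymous partition.

    x     : list of quasi-identifier tuples, one per record
    y     : list of confidential values (floats), one per record
    cells : list of cell labels, one per record
    lam   : multiplier weighting predictability against distortion
    k     : minimum number of records per cell
    (All names are hypothetical; the cost form is an assumption.)
    """
    # Group record indices by cell label.
    groups = {}
    for i, c in enumerate(cells):
        groups.setdefault(c, []).append(i)

    # Enforce the k-anonymity constraint on cell sizes.
    if any(len(idx) < k for idx in groups.values()):
        raise ValueError("every cell needs at least k records")

    n = len(x)
    d_x = 0.0  # squared error of quasi-identifiers against cell centroids
    d_y = 0.0  # squared error of confidential values against cell means
    for idx in groups.values():
        dim = len(x[idx[0]])
        centroid = [_mean([x[i][j] for i in idx]) for j in range(dim)]
        y_hat = _mean([y[i] for i in idx])
        d_x += sum((x[i][j] - centroid[j]) ** 2 for i in idx for j in range(dim))
        d_y += sum((y[i] - y_hat) ** 2 for i in idx)

    return d_x / n + lam * d_y / n


if __name__ == "__main__":
    x = [(0.0,), (2.0,), (10.0,), (12.0,)]
    y = [1.0, 3.0, 5.0, 9.0]
    print(lagrangian_cost(x, y, cells=[0, 0, 1, 1], lam=0.5, k=2))
```

Under this assumed form, setting lam to 0 reduces the cost to quasi-identifier distortion alone, while larger values of lam favor partitions whose cells keep the confidential attribute predictable.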
Citation: Rebollo-Monedero, D., Forne, J., Soriano, M. k-anonymous microaggregation with preservation of statistical dependence. "Information sciences", 10 May 2016, vol. 342, p. 1-23.