On the use of integer programming to pursue optimal microaggregation
Covenantee: Consiglio Nazionale delle Ricerche. Istituto di Analisi dei Sistemi ed Informatica “Antonio Ruberti”
Document type: Bachelor thesis
Rights access: Open Access
This document reports on a research collaboration at CNR-IASI (Italy) that ran until the 7th of January. Microaggregation is a method for perturbing data in order to prevent the identification of individuals in microdata. In optimization terms, it is a clustering problem: individuals are grouped into clusters of at least a minimum size so that the total spread is minimized. For multivariate data, the problem is NP-hard, and no efficient procedure is known that guarantees optimality. This document reviews the state of the art on this topic, covering heuristic clustering algorithms and Integer Programming. In addition, inspired by the use of Column Generation in an approximate model, it proposes a scheme to solve microaggregation to optimality. The Column Generation block has been developed in depth, with particular attention to the polyhedral aspects of the Pricing Problem. This first block has also been implemented with CPLEX, and its results are reported. At the current stage, the procedure achieves optimality on certain data instances and, in all cases, yields a lower bound on the spread in microaggregation. These results are new contributions and encourage us to pursue this line of research.
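As a rough illustration of the objective described above, the following sketch evaluates the total spread of a given partition under a minimum cluster size. It assumes that "spread" means the within-cluster sum of squared distances to the centroid (a common choice in microaggregation, but not stated in the abstract). The names `cluster_sse`, `total_spread` and the toy data are hypothetical and are not taken from the thesis or its CPLEX implementation.

```python
from typing import List, Sequence

Point = Sequence[float]


def cluster_sse(points: List[Point]) -> float:
    """Sum of squared distances from each point to the cluster centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    return sum((p[d] - centroid[d]) ** 2 for p in points for d in range(dim))


def total_spread(data: List[Point], clusters: List[List[int]], k: int) -> float:
    """Total spread of a partition, assuming spread = within-cluster SSE.

    `clusters` holds lists of row indices into `data`; `k` is the minimum
    cluster size required by microaggregation.
    """
    # Every individual must belong to exactly one cluster.
    assigned = sorted(i for cluster in clusters for i in cluster)
    if assigned != list(range(len(data))):
        raise ValueError("clusters must form a partition of the data")
    # Every cluster must contain at least k individuals.
    if any(len(cluster) < k for cluster in clusters):
        raise ValueError("every cluster needs at least k individuals")
    return sum(cluster_sse([data[i] for i in cluster]) for cluster in clusters)


# Toy data: four individuals, two attributes, minimum cluster size k = 2.
data = [(1.0, 2.0), (1.0, 4.0), (5.0, 6.0), (7.0, 6.0)]
spread_two_clusters = total_spread(data, [[0, 1], [2, 3]], k=2)
spread_one_cluster = total_spread(data, [[0, 1, 2, 3]], k=2)
```

On this toy data, the two-cluster partition has a smaller spread than the single cluster. An exact method has to search over all feasible partitions for the minimum of this quantity, which is where the difficulty of the multivariate problem comes from.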