Privacy protection of user profiles in personalized information systems
ColaboratorRebollo Monedero, David; Forné Muñoz, Jordi; Universitat Politècnica de Catalunya. Departament d'Enginyeria Telemàtica
Document typeDoctoral thesis
PublisherUniversitat Politècnica de Catalunya
Rights accessOpen Access
In recent times we are witnessing the emergence of a wide variety of information systems that tailor the information-exchange functionality to meet the specific interests of their users. Most of these personalized information systems capitalize on, or lend themselves to, the construction of profiles, either directly declared by a user, or inferred from past activity. The ability of these systems to profile users is therefore what enables such intelligent functionality, but at the same time, it is the source of serious privacy concerns. Although there exists a broad range of privacy-enhancing technologies aimed to mitigate many of those concerns, the fact is that their use is far from being widespread. The main reason is that there is a certain ambiguity about these technologies and their effectiveness in terms of privacy protection. Besides, since these technologies normally come at the expense of system functionality and utility, it is challenging to assess whether the gain in privacy compensates for the costs in utility. Assessing the privacy provided by a privacy-enhancing technology is thus crucial to determine its overall benefit, to compare its effectiveness with other technologies, and ultimately to optimize it in terms of the privacy-utility trade-off posed. Considerable effort has consequently been devoted to investigating both privacy and utility metrics. However, most of these metrics are specific to concrete systems and adversary models, and hence are difficult to generalize or translate to other contexts. Moreover, in applications involving user profiles, there are a few proposals for the evaluation of privacy, and those existing are not appropriately justified or fail to justify the choice. The first part of this thesis approaches the fundamental problem of quantifying user privacy. Firstly, we present a theoretical framework for privacy-preserving systems, endowed with a unifying view of privacy in terms of the estimation error incurred by an attacker who aims to disclose the private information that the system is designed to conceal. Our theoretical analysis shows that numerous privacy metrics emerging from a broad spectrum of applications are bijectively related to this estimation error, which permits interpreting and comparing these metrics under a common perspective. Secondly, we tackle the issue of measuring privacy in the enthralling application of personalized information systems. Specifically, we propose two information-theoretic quantities as measures of the privacy of user profiles, and justify these metrics by building on Jaynes' rationale behind entropy-maximization methods and fundamental results from the method of types and hypothesis testing. Equipped with quantifiable measures of privacy and utility, the second part of this thesis investigates privacy-enhancing, data-perturbative mechanisms and architectures for two important classes of personalized information systems. In particular, we study the elimination of tags in semantic-Web applications, and the combination of the forgery and the suppression of ratings in personalized recommendation systems. We design such mechanisms to achieve the optimal privacy-utility trade-off, in the sense of maximizing privacy for a desired utility, or vice versa. We proceed in a systematic fashion by drawing upon the methodology of multiobjective optimization. Our theoretical analysis finds a closed-form solution to the problem of optimal tag suppression, and to the problem of optimal forgery and suppression of ratings. In addition, we provide an extensive theoretical characterization of the trade-off between the contrasting aspects of privacy and utility. Experimental results in real-world applications show the effectiveness of our mechanisms in terms of privacy protection, system functionality and data utility.
- Tesis - TDX-UPC