Evaluating the compliance of current anonymization techniques with GDPR

Bibliographic Details
Author: Fasolato, Gianluca
Resource type: master's thesis
Publication date: 2025
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/454941
Online access: https://hdl.handle.net/2117/454941
Access level: embargoed access
Keywords: Data protection
Data mining
GDPR
Anonymization
Pseudonymization
Differential privacy
K-anonymity
L-diversity
T-closeness
Data minimization
Privacy engineering
Compliance evaluation
Accountability
Risk-based approach
Protecció de dades
Mineria de dades
Àrees temàtiques de la UPC::Informàtica::Seguretat informàtica
Description
Summary: This thesis examines the GDPR compliance of widely used anonymisation techniques, focusing on the extent to which their outputs can be considered anonymous data rather than pseudonymised personal data. The research is motivated by the growing reliance of organisations on anonymisation and pseudonymisation to balance the analytical value of personal data with legal obligations of minimisation, purpose limitation, and data security as framed in Article 32 GDPR. Against this backdrop, the central question addressed is: under what conditions, and to what extent, do common anonymisation models meet the GDPR’s standard of irreversibility? Methodologically, the thesis combines doctrinal legal analysis with insights from privacy engineering and computer-science literature. Primary EU legal sources (principally the GDPR, especially Recital 26 and Articles 25 and 32), together with interpretative guidance from supervisory authorities, are triangulated with privacy engineering research to derive a framework of ten evaluation criteria: irreversibility, contextual risk, data minimisation, purpose limitation, transparency and fairness, technical measures, governance and access control, accountability, residual utility, and long-term effectiveness. These criteria are then systematically applied to four major privacy models: k-anonymity, l-diversity (including p-sensitive k-anonymity), t-closeness, and differential privacy. The framework is also applied to a case study to demonstrate its practical applicability and evidential value in assessing GDPR compliance. The analysis shows that syntactic models (k-anonymity, l-diversity, t-closeness) can mitigate disclosure risk and contribute to data minimisation, but they do not provide provable guarantees against re-identification when faced with auxiliary information. Without contextual risk assessment and layered organisational controls, their outputs remain within the scope of the GDPR as pseudonymised data. 
By contrast, differential privacy offers mathematically rigorous guarantees, explicit composition accounting, and a governance-friendly interface to risk management, making it the most GDPR-aligned model when correctly parameterised. The thesis contributes (i) a comparative evaluation of leading anonymisation techniques against GDPR requirements, clarifying the extent to which their outputs can be considered compliant, and (ii) a criteria-based framework that data controllers, processors, and Data Protection Officers can operationalise to evidence compliance in practice. The framework’s applicability is further demonstrated through a case study, which illustrates how its use can support accountability obligations. In doing so, the thesis bridges legal requirements and privacy engineering, offering actionable guidance for organisations seeking both lawful and useful data use.
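Illustrative note: the contrast the summary draws between syntactic models and differential privacy can be sketched in a few lines of Python. This is a minimal sketch and not the thesis's method or case study. The toy table `ROWS`, the helper `k_and_l`, and the class `LaplaceCounter` are hypothetical names introduced only for illustration. The sketch assumes distinct l-diversity, counting queries with sensitivity 1, and basic sequential composition (epsilons add up). It uses plain floating-point sampling, which is not suitable for production use.

```python
import random
from collections import defaultdict

# Hypothetical, already-generalised toy table (not data from the thesis).
ROWS = [
    {"age": "30-39", "zip": "080**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "080**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "081**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "081**", "diagnosis": "flu"},
]


def k_and_l(rows, quasi_identifiers, sensitive):
    """Return (k, l) for a table.

    k is the size of the smallest equivalence class over the quasi-identifiers;
    l is the smallest number of distinct sensitive values in any class
    (distinct l-diversity).
    """
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].append(row[sensitive])
    k = min(len(values) for values in classes.values())
    l = min(len(set(values)) for values in classes.values())
    return k, l


class LaplaceCounter:
    """Noisy counting queries with a simple sequential-composition budget."""

    def __init__(self, total_epsilon, seed=None):
        self.remaining = total_epsilon
        self._rng = random.Random(seed)

    def noisy_count(self, rows, predicate, epsilon):
        # Refuse queries that would exceed the total privacy budget.
        if epsilon <= 0 or epsilon > self.remaining + 1e-12:
            raise ValueError("invalid epsilon or privacy budget exhausted")
        self.remaining -= epsilon
        true_count = sum(1 for row in rows if predicate(row))
        # A count has sensitivity 1, so the Laplace scale is 1/epsilon.
        # The difference of two Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
        noise = self._rng.expovariate(epsilon) - self._rng.expovariate(epsilon)
        return true_count + noise


if __name__ == "__main__":
    print(k_and_l(ROWS, ["age", "zip"], "diagnosis"))
    counter = LaplaceCounter(total_epsilon=1.0, seed=0)
    print(counter.noisy_count(ROWS, lambda r: r["diagnosis"] == "flu", 0.4))
    print(counter.remaining)
```

On the toy table the check reports k = 2 but l = 1: the second equivalence class is 2-anonymous, yet every record in it shares the same diagnosis, so an observer who knows only the quasi-identifiers still learns the sensitive value. The budget tracker, by contrast, makes cumulative disclosure explicit: each query spends part of a fixed epsilon, and further queries are refused once the budget is used up.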