<i>p</i>-Probabilistic <i>k</i>-anonymous microaggregation for the anonymization of surveys with uncertain participation


Bibliographic Details
Authors: Rebollo-Monedero D., Forné J., Soriano M., Puiggalí Allepuz J.
Resource type: article
Status: Published version
Publication date: 2017
Country: Spain
Institution: Centre Tecnològic de Telecomunicacions de Catalunya (CTTC)
Repository: r-CTTC. Repositorio Institucional Producción Científica del Centre Tecnològic de Telecomunicacions de Catalunya (CTTC)
OAI Identifier: oai:cttc.fundanetsuite.com:p3442
Online access: https://cttc.fundanetsuite.com/Publicaciones/ProdCientif/PublicacionFrw.aspx?id=3442
Access level: open access
Keywords: k-Anonymity
Microaggregation
Probabilistic anonymity
Surveys
Description
Summary: We develop a probabilistic variant of k-anonymous microaggregation, which we term p-probabilistic, resorting to a statistical model of respondent participation in order to aggregate quasi-identifiers in such a manner that k-anonymity is concordantly enforced with a parametric probabilistic guarantee. Succinctly, owing to the possibility that some respondents may not finally participate, sufficiently larger cells are created, striving to satisfy k-anonymity with probability at least p. The microaggregation function is designed before the respondents submit their confidential data. More precisely, a specification of the function is sent to them, which they may verify and apply to their quasi-identifying demographic variables prior to submitting the microaggregated data, along with the confidential attributes, to an authorized repository. We propose a number of metrics to assess the performance of our probabilistic approach in terms of anonymity and distortion, which we proceed to investigate theoretically in depth and empirically with synthetic and standardized data. We stress that, in addition to constituting a functional extension of traditional microaggregation, thereby broadening its applicability to the anonymization of statistical databases in a wide variety of contexts, the relaxation of trust assumptions is arguably expected to have a considerable impact on user acceptance and ultimately on data utility through mere availability. (C) 2016 Elsevier Inc. All rights reserved.
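The idea of enlarging cells so that k-anonymity holds with probability at least p can be illustrated with a minimal sketch. It rests on a simplifying assumption that is not taken from the record: every respondent assigned to a cell participates independently with the same probability q, so the number of participants in a cell of n candidates is Binomial(n, q). The function names `prob_at_least_k` and `min_cell_size` and the parameters `q` and `n_max` are hypothetical and chosen only for illustration; the statistical model used in the paper may differ.

```python
from math import comb


def prob_at_least_k(n: int, k: int, q: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, q).

    Assumption (not from the source): the n respondents in a cell
    participate independently, each with the same probability q.
    """
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    return sum(comb(n, j) * q**j * (1 - q) ** (n - j) for j in range(k, n + 1))


def min_cell_size(k: int, p: float, q: float, n_max: int = 10_000) -> int:
    """Smallest cell size n such that at least k of the n candidates
    participate with probability >= p, under the binomial assumption above.

    n_max is a hypothetical search cap; a ValueError is raised if no
    cell size up to n_max meets the target.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # P(X >= k) is nondecreasing in n, so the first n that meets the
    # target is the smallest one.
    for n in range(k, n_max + 1):
        if prob_at_least_k(n, k, q) >= p:
            return n
    raise ValueError("no cell size up to n_max meets the target probability")


if __name__ == "__main__":
    # Illustrative values only: k = 5, target p = 0.95, participation q = 0.8.
    n = min_cell_size(5, 0.95, 0.8)
    print(n, prob_at_least_k(n, 5, 0.8))
```

With certain participation (q = 1) the sketch reduces to the ordinary cell size k, which matches the reading of the probabilistic variant as an extension of traditional microaggregation.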