Differentially private data publishing via cross-moment microaggregation

Bibliographic Details
Authors: Parra Arnau, Javier (ORCID: 0000-0002-1772-1088); Domingo Ferrer, Josep; Soria Comas, Jordi
Resource type: article
Publication date: 2020
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/386400
Online access: https://hdl.handle.net/2117/386400
https://dx.doi.org/10.1016/j.inffus.2019.06.011
Access level: open access
Keywords: Data protection
Data privacy
Microaggregation
Differential privacy
Data utility
Protecció de dades
Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Telemàtica i xarxes d'ordinadors
Acknowledgments: The authors are thankful to A. Azzalini for his clarifications on the sampling of multivariate skew-normal distributions. Partial support to this work has been received from the European Commission (projects H2020-644024 “CLARUS” and H2020-700540 “CANVAS”), the Government of Catalonia (ICREA Academia Prize to J. Domingo-Ferrer), and the Spanish Government (projects TIN2014-57364-C2-1-R “Smart-Glacis” and TIN2016-80250-R “Sec-MCloud”). J. Parra-Arnau is the recipient of a Juan de la Cierva postdoctoral fellowship, FJCI-2014-19703, from the Spanish Ministry of Economy and Competitiveness. The authors are with the UNESCO Chair in Data Privacy, but the views in this paper are their own and are not necessarily shared by UNESCO.
Abstract: Differential privacy is one of the most prominent privacy notions in the field of anonymization. However, its strong privacy guarantees very often come at the expense of significantly degrading the utility of the protected data. To cope with this, numerous mechanisms have been studied that reduce the sensitivity of the data and hence the noise required to satisfy this notion. In this paper, we present a generalization of classical microaggregation, where the aggregated records are replaced by the group mean and additional statistical measures, with the purpose of evaluating it as a sensitivity reduction mechanism. We propose an anonymization methodology for numerical microdata in which the target of protection is a data set microaggregated in this generalized way, and the disclosure risk limitation is guaranteed through differential privacy via record-level perturbation. Specifically, we describe three anonymization algorithms where microaggregation can be applied to either entire records or groups of attributes independently. Our theoretical analysis computes the sensitivities of the first two central cross moments; we apply fundamental results from matrix perturbation theory to derive sensitivity bounds on the eigenvalues and eigenvectors of the covariance and coskewness matrices. Our extensive experimental evaluation shows that data utility can be enhanced significantly for medium to large sizes of the microaggregation groups. For this range of group sizes, we find experimental evidence that our approach can provide not only higher utility but also higher privacy than traditional microaggregation.
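The sensitivity-reduction idea mentioned in the abstract can be illustrated with a minimal sketch. It is not one of the paper's three algorithms: it covers only classical, univariate microaggregation (group mean only), and the function names, the fixed-size grouping rule, and the bounds `lower` and `upper` are assumptions made for illustration. For a group of k values clipped to [lower, upper] whose membership is held fixed, changing one value moves the group mean by at most (upper - lower)/k, so the Laplace noise scale shrinks as k grows. The sketch does not account for group membership changing when a record changes, so it should not be read as a complete differentially private mechanism.

```python
import random

def fixed_size_groups(values, k):
    """Sort indices by value and cut them into groups of size k.
    The last group absorbs the remainder (hypothetical grouping rule)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = max(len(order) // k, 1)
    groups = [order[g * k:(g + 1) * k] for g in range(n_groups - 1)]
    groups.append(order[(n_groups - 1) * k:])
    return groups

def laplace(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d.
    exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_microaggregate(values, k, epsilon, lower, upper, rng):
    """Replace each value by its group mean plus Laplace noise.
    Assumes upper > lower and epsilon > 0."""
    clipped = [min(max(v, lower), upper) for v in values]
    out = [0.0] * len(clipped)
    for group in fixed_size_groups(clipped, k):
        mean = sum(clipped[i] for i in group) / len(group)
        # Per-group mean sensitivity under fixed membership: range / group size.
        scale = (upper - lower) / len(group) / epsilon
        released = mean + laplace(scale, rng)
        for i in group:
            out[i] = released
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.uniform(0.0, 100.0) for _ in range(20)]
    print(noisy_microaggregate(data, k=5, epsilon=1.0, lower=0.0, upper=100.0, rng=rng))
```

With larger k the noise added to each group mean is smaller, at the cost of coarser aggregation; this is the trade-off the abstract refers to when it reports utility gains for medium to large group sizes.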
Dates: 2020-01-01; 2023-04-19
Type: journal article (http://purl.org/coar/resource_type/c_6501)
Version: AM (http://purl.org/coar/version/c_ab4af688f83e57aa)
Funding: Ministerio de Economía y Competitividad (http://doi.org/10.13039/501100003329), grant FJCI-2014-19703
Format: application/pdf
Publisher: Elsevier