Beyond multivariate microaggregation for large record anonymization
| Author: | |
|---|---|
| Resource type: | article |
| Publication date: | 2014 |
| Country: | Spain |
| Institution: | Universitat Politècnica de Catalunya (UPC) |
| Repository: | UPCommons. Portal del coneixement obert de la UPC |
| Language: | English |
| OAI Identifier: | oai:upcommons.upc.edu:2117/23297 |
| Online access: | https://hdl.handle.net/2117/23297 https://dx.doi.org/10.1007/978-3-319-04178-0_8 |
| Access Level: | open access |
| Keywords: | Data protection; Database security; Microaggregation; k-anonymity; Privacy in statistical databases; Protecció de dades; Bases de dades -- Seguretat; Àrees temàtiques de la UPC::Informàtica::Seguretat informàtica |
| Summary: | Microaggregation is one of the most commonly employed microdata protection methods. Its basic idea is to anonymize data by aggregating the original records into small groups of at least k elements, thereby preserving k-anonymity. When records are large, i.e., when the data set has many attributes, the data set is usually split into smaller blocks of attributes to limit information loss, and microaggregation is applied to each block successively and independently. This is called multivariate microaggregation. This technique reduces the information loss incurred when several values are collapsed to the centroid of their group. Unfortunately, with multivariate microaggregation the k-anonymity property is lost when the intruder knows at least two attributes from different blocks, which may well be the usual case. In this work, we present a new microaggregation method called one dimension microaggregation (Mic1D-k). With Mic1D-k, the problem of k-anonymity loss is mitigated by mixing all the values of the original microdata file into a single non-attributed data set through a set of simple pre-processing steps, and then microaggregating all the mixed values together. Our experiments on real data show that our proposal obtains a lower disclosure risk than previous approaches while information loss is preserved. |
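
The summary describes two ideas that a short sketch may make concrete: replacing values with the centroid of a group of at least k elements, and pooling all attribute values into a single set before microaggregating. The Python below is an illustrative sketch only, not the authors' Mic1D-k implementation. The function names `microaggregate_1d` and `mix_and_microaggregate` are hypothetical, the grouping rule (sort, then cut into consecutive groups of k) is a simple assumed heuristic, and the pre-processing steps mentioned in the summary are not specified there, so they are omitted here under the assumption that all attributes already share a comparable scale.

```python
from typing import List


def microaggregate_1d(values: List[float], k: int) -> List[float]:
    """Replace each value with the mean of its group of at least k sorted neighbours."""
    n = len(values)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k values")
    # Indices of the values in ascending order, so results can be written back in place.
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        # If fewer than k values would remain, fold them into this group.
        if n - end < k:
            end = n
        group = order[start:end]
        centroid = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = centroid
        start = end
    return result


def mix_and_microaggregate(records: List[List[float]], k: int) -> List[List[float]]:
    """Pool every attribute value into one set, microaggregate, then restore the row shape.

    Assumption: attributes are already on a comparable scale; the summary's
    pre-processing steps are not specified, so none are applied here.
    """
    width = len(records[0])
    pool = [value for row in records for value in row]
    protected = microaggregate_1d(pool, k)
    return [protected[r * width:(r + 1) * width] for r in range(len(records))]


if __name__ == "__main__":
    data = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0]]
    print(mix_and_microaggregate(data, k=3))
```

Because grouping happens over the pooled values rather than per block of attributes, every value in the output is shared by at least k cells of the table, which is the property the summary contrasts with block-wise microaggregation.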