Semantic microaggregation for the anonymization of query logs using the Open Directory Project

Bibliographic Details
Authors: Erola, Arnau; Castellà-Roca, Jordi; Navarro-Arribas, Guillermo; Torra, Vicenç
Resource type: article
Publication date: 2011
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2099/11415
Online access: https://hdl.handle.net/2099/11415
Access level: open access
Keywords: Artificial intelligence
Privacy
Web search engines
Query logs
K-anonymity
Microaggregation
Semantic
Intel·ligència artificial
AMS Classification::68 Computer science::68T Artificial intelligence
UPC subject areas::Mathematics and statistics
Description
Summary: Web search engines gather information from the queries performed by their users in the form of query logs. These logs are extremely useful for research, marketing, or profiling, but at the same time they are a serious threat to users' privacy. We provide a novel approach to anonymizing query logs so that they guarantee user k-anonymity, by extending a common method used in statistical disclosure control: microaggregation. Furthermore, our microaggregation approach takes into account the semantics of the queries by relying on the Open Directory Project. We have tested our proposal with real data from AOL query logs.
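
The general idea in the summary can be pictured with a minimal sketch. This is not the authors' algorithm: it is a simplified greedy grouping, and it assumes each user's queries have already been mapped to Open Directory Project top-level categories. The function names, the sample users, and the choice of Euclidean distance are all hypothetical. Users are grouped so that every group has at least k members, and each group is then published as a single averaged category profile, which is what makes a user indistinguishable from at least k-1 others.

```python
from math import dist

# Assumption: a small subset of ODP top-level categories, used only for illustration.
CATEGORIES = ["Arts", "Computers", "Health", "Sports"]


def profile(category_counts):
    """Turn per-category query counts into a normalized vector."""
    total = sum(category_counts.values()) or 1
    return [category_counts.get(c, 0) / total for c in CATEGORIES]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def microaggregate(profiles, k):
    """Greedily partition users into groups of at least k similar profiles.

    profiles: dict mapping a user id to its normalized category vector.
    Returns a list of groups, each a list of user ids.
    """
    if len(profiles) < k:
        raise ValueError("need at least k users")
    remaining = dict(profiles)
    groups = []
    while len(remaining) >= 2 * k:
        center = centroid(list(remaining.values()))
        # Seed: the user farthest from the centroid of the remaining users.
        seed = max(remaining, key=lambda u: (dist(remaining[u], center), u))
        # Group the k users closest to the seed (the seed is at distance 0).
        nearest = sorted(
            remaining, key=lambda u: (dist(remaining[u], remaining[seed]), u)
        )[:k]
        groups.append(nearest)
        for user in nearest:
            del remaining[user]
    # The leftover users (between k and 2k-1 of them) form the last group.
    groups.append(sorted(remaining))
    return groups


# Hypothetical per-user category counts.
users = {
    "u1": profile({"Computers": 8, "Arts": 2}),
    "u2": profile({"Computers": 7, "Arts": 3}),
    "u3": profile({"Sports": 9, "Health": 1}),
    "u4": profile({"Sports": 8, "Health": 2}),
    "u5": profile({"Health": 6, "Arts": 4}),
    "u6": profile({"Health": 5, "Arts": 5}),
}

groups = microaggregate(users, k=2)
# Each group is released as one aggregated profile instead of k individual ones.
released = [centroid([users[u] for u in group]) for group in groups]
```

A real system would additionally need the query-to-category mapping and a way to build the released queries for each group, neither of which is modeled here.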