Privacy-preserving machine learning for collaborative data sharing via auto-encoder latent space embeddings

Bibliographic Details
Author: Quintero Ossa, Ana María
Resource type: master's thesis
Status: Version accepted for publication
Publication date: 2022
Country: Colombia
Institution: Universidad de los Andes
Repository: Séneca: repositorio Uniandes
Language: English
OAI Identifier: oai:repositorio.uniandes.edu.co:1992/64121
Online access: http://hdl.handle.net/1992/64121
Access level: open access
Keywords: Privacy preserving
Machine learning
Representation learning
Engineering
Description
Summary: Privacy-preserving machine learning in data-sharing processes is an ever-critical task that enables collaborative training of Machine Learning (ML) models without the need to share the original data sources. It is especially relevant when an organization must ensure that sensitive data remains private throughout the whole ML pipeline, i.e., the training and inference phases. This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data. Organizations can thus share the data representation to improve the performance of machine learning models in scenarios with more than one data source for a shared predictive downstream task.
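
The general idea in the summary, sharing a latent representation instead of the raw records, can be illustrated with a minimal sketch. This is not the architecture used in the thesis, which this record does not describe. The function names (train_autoencoder, encode), the single tanh layer, the synthetic data, and all hyperparameters are assumptions chosen for illustration, and the sketch makes no claim about how much privacy such embeddings actually provide.

```python
import numpy as np

def train_autoencoder(X, latent_dim, epochs=500, lr=0.05, seed=0):
    """Train a tiny autoencoder (tanh encoder, linear decoder) with
    full-batch gradient descent on mean squared reconstruction error.
    Hypothetical helper for illustration only."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_features, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, n_features))
    losses = []
    for _ in range(epochs):
        Z = np.tanh(X @ W_enc)        # latent embeddings
        X_hat = Z @ W_dec             # reconstruction of the input
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error through both layers.
        grad_X_hat = 2.0 * err / err.size
        grad_W_dec = Z.T @ grad_X_hat
        grad_pre = (grad_X_hat @ W_dec.T) * (1.0 - Z ** 2)  # tanh derivative
        grad_W_enc = X.T @ grad_pre
        W_enc -= lr * grad_W_enc
        W_dec -= lr * grad_W_dec
    return W_enc, W_dec, losses

def encode(X, W_enc):
    """Map raw records to the latent embeddings that would be shared."""
    return np.tanh(X @ W_enc)

# Synthetic stand-in for one organization's private table (assumed data):
# 200 records, 8 features driven by 3 hidden factors plus noise.
rng = np.random.default_rng(42)
factors = rng.normal(size=(200, 3))
X = factors @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(200, 8))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each column

W_enc, W_dec, losses = train_autoencoder(X, latent_dim=3)
embeddings = encode(X, W_enc)   # only this array would leave the organization
print(embeddings.shape, losses[0], losses[-1])
```

In this sketch the organization keeps X and the decoder weights W_dec to itself and passes only the embeddings, together with any labels for the shared downstream task, to its partners.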