Contract-based data validation framework

Bibliographic Details
Author: Hmimou Ham Man, Achraf
Resource type: master's thesis
Publication date: 2024
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/415374
Online access: https://hdl.handle.net/2117/415374
Access level: open access
Keywords: Knowledge representation (Information theory)
Model-driven software architecture
Validació de dades
Federació de dades
Espai de dades
Xarxa de Coneixement
Data Validation
Federated Data
Data Spaces
Knowledge Graphs
Representació del coneixement (Teoria de la informació)
Arquitectura dirigida per models
Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació
Description
Summary: In the era of big data, privacy concerns have shifted the paradigm from centralized data management to federated data ecosystems, or data spaces, where data ownership and control are decentralized. These ecosystems are structured around data-sharing contracts that specify access, usage, quality, and privacy requirements between organizations. Current frameworks focus primarily on data usage and lack robust mechanisms for data validation, which is critical for maintaining data quality and integrity across federated environments. To address this gap, this master's thesis introduces a knowledge graph-based framework that automates data validation in line with data-sharing contracts. The framework introduces policy checkers, which represent data validation workflows that can be dynamically translated into user-defined functions (UDFs) for compliance checking. This method makes data validation more flexible and transparent and accommodates various data formats. The contributions of this master's thesis are (1) the design of an architectural framework for federated data management, (2) a derived data governance mechanism for automated data validation based on data-sharing contracts, and (3) a proof-of-concept implementation that demonstrates the practical applicability of the solution. This work aims to address critical gaps in current data ecosystems by describing flexible and robust data governance mechanisms.