Contract-based data validation framework
| Author: | |
|---|---|
| Resource type: | master's thesis |
| Publication date: | 2024 |
| Country: | Spain |
| Institution: | Universitat Politècnica de Catalunya (UPC) |
| Repository: | UPCommons. Portal del coneixement obert de la UPC |
| Language: | English |
| OAI Identifier: | oai:upcommons.upc.edu:2117/415374 |
| Online access: | https://hdl.handle.net/2117/415374 |
| Access level: | open access |
| Keywords: | Knowledge representation (Information theory); Model-driven software architecture; Validació de dades; Federació de dades; Espai de dades; Xarxa de Coneixement; Data Validation; Federated Data; Data Spaces; Knowledge Graphs; Representació del coneixement (Teoria de la informació); Arquitectura dirigida per models; Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació |
| Summary: | In the era of big data, privacy concerns have shifted the paradigm from centralized data management to federated data ecosystems, or data spaces, where data ownership and control are decentralized. These ecosystems are structured around data-sharing contracts that specify access, usage, quality, and privacy requirements between organizations. Current frameworks focus primarily on data usage and lack robust mechanisms for data validation, which is critical for maintaining data quality and integrity across federated environments. To address this gap, this master's thesis introduces a knowledge-graph-based framework that automates data validation in line with data-sharing contracts. The framework introduces policy checkers, which represent data validation workflows that can be dynamically translated into user-defined functions (UDFs) for compliance checking. This method makes data validation more flexible and transparent and allows it to accommodate various data formats. The contributions of this master's thesis are (1) the design of an advanced architectural framework for federated data management, (2) a derived data governance mechanism for automated data validation based on data-sharing contracts, and (3) a proof-of-concept implementation that demonstrates the practical applicability of the solution. This work aims to address critical gaps in current data ecosystems by describing flexible and robust data governance mechanisms. |
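
The summary describes policy checkers that are translated into user-defined functions (UDFs) for compliance checking. The following is a minimal sketch of that idea, not the thesis's implementation: a plain Python dictionary stands in for the knowledge-graph representation of a contract, and every name here (`OPERATORS`, `compile_policy`, the rule fields, the sample policy) is a hypothetical chosen for illustration.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical operator vocabulary; the record does not list the thesis's operators.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "not_null": lambda value, _: value is not None,
    "min": lambda value, bound: value is not None and value >= bound,
    "max": lambda value, bound: value is not None and value <= bound,
    "in_set": lambda value, allowed: value in allowed,
}


def compile_policy(policy: Dict[str, Any]) -> Callable[[Record], List[str]]:
    """Translate a declarative policy into a UDF that returns violation messages."""
    steps = []
    for rule in policy["rules"]:
        check = OPERATORS[rule["op"]]  # raises KeyError for an unknown operator
        steps.append((rule["field"], rule["op"], rule.get("arg"), check))

    def udf(record: Record) -> List[str]:
        # Run every step in order and collect one message per failed rule.
        violations = []
        for field, op, arg, check in steps:
            if not check(record.get(field), arg):
                violations.append(f"{field}: failed {op}")
        return violations

    return udf


# Hypothetical quality clauses from a data-sharing contract.
policy = {
    "name": "patient_data_quality",
    "rules": [
        {"field": "age", "op": "not_null"},
        {"field": "age", "op": "min", "arg": 0},
        {"field": "age", "op": "max", "arg": 120},
        {"field": "country", "op": "in_set", "arg": ["ES", "FR", "DE"]},
    ],
}

validate = compile_policy(policy)
good = validate({"age": 42, "country": "ES"})
bad = validate({"age": 130, "country": "US"})
print(good)
print(bad)
```

Because the policy is compiled once into a closure, the same declarative contract could in principle be applied to records read from different data formats, provided each format is first mapped to the `Record` dictionary shape assumed here.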