SynthVal: A framework for validating synthetic medical images
| Authors: | , , , , |
|---|---|
| Resource type: | article |
| Publication date: | 2025 |
| Country: | Spain |
| Institution: | Universitat Politècnica de Catalunya (UPC) |
| Repository: | UPCommons. Portal del coneixement obert de la UPC |
| Language: | English |
| OAI Identifier: | oai:upcommons.upc.edu:2117/447083 |
| Online access: | https://hdl.handle.net/2117/447083 https://dx.doi.org/10.1109/ACCESS.2025.3633780 |
| Access Level: | open access |
| Keywords: | Synthetic data validation; Similarity metrics; Health data; Features extraction; Àrees temàtiques de la UPC::Informàtica::Aplicacions de la informàtica::Bioinformàtica; Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial |
| Summary: | Synthetic data is increasingly used in medical imaging to overcome data scarcity and privacy constraints. However, assessing the fidelity of synthetic images remains a critical challenge for ensuring their safe and effective use in clinical and AI applications. We present SynthVal, a Python-based framework for validating the quality of synthetic medical images through statistical comparisons in deep feature space. SynthVal extracts semantic image embeddings using transformer-based models and computes similarity metrics (Fréchet Distance, Wasserstein Distance, and Kullback-Leibler Divergence, among others) between the real and synthetic data distributions. The framework is designed for modularity, scalability, and ease of integration into existing workflows, and is installable via pip. We evaluate SynthVal using real images from the CSAW-CC dataset and synthetic images produced by a model developed by the Barcelona Supercomputing Center. Our experiments systematically benchmark the influence of different feature extraction models and similarity metrics, providing practical insights for selecting validation strategies in medical image synthesis. SynthVal offers a reproducible and extensible solution for quality control in synthetic data pipelines. |
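
The summary describes comparing real and synthetic images through statistics computed on their feature embeddings. The sketch below illustrates that idea for the three named metrics. It is not SynthVal's API: all function names are hypothetical, the embeddings are random stand-in vectors rather than outputs of a transformer model, and each feature distribution is assumed to be Gaussian with a diagonal covariance so that the formulas reduce to per-dimension sums. The Fréchet value is the squared form used by FID-style scores.

```python
import math
import random
from statistics import fmean, pvariance


def column_stats(features):
    """Per-dimension mean and population variance of equal-length vectors."""
    columns = list(zip(*features))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col, mu) for col, mu in zip(columns, means)]
    return means, variances


def frechet_distance_diag(real, synth):
    """Squared Fréchet distance between two diagonal-covariance Gaussians.

    Assumption: covariances are diagonal, so
    ||mu_r - mu_s||^2 + Tr(S_r + S_s - 2 (S_r S_s)^(1/2))
    becomes a sum over dimensions of (mu_r - mu_s)^2 + (sd_r - sd_s)^2.
    """
    mu_r, var_r = column_stats(real)
    mu_s, var_s = column_stats(synth)
    return sum(
        (mr - ms) ** 2 + (math.sqrt(vr) - math.sqrt(vs)) ** 2
        for mr, ms, vr, vs in zip(mu_r, mu_s, var_r, var_s)
    )


def kl_divergence_diag(real, synth, eps=1e-12):
    """KL(real || synth) for diagonal Gaussians fitted to each feature set."""
    mu_r, var_r = column_stats(real)
    mu_s, var_s = column_stats(synth)
    total = 0.0
    for mr, ms, vr, vs in zip(mu_r, mu_s, var_r, var_s):
        vr, vs = vr + eps, vs + eps  # guard against zero variance
        total += math.log(vs / vr) + (vr + (mr - ms) ** 2) / vs - 1.0
    return 0.5 * total


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-sized 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return fmean([abs(a - b) for a, b in zip(sorted(xs), sorted(ys))])


def mean_wasserstein(real, synth):
    """Average of the per-dimension 1-D Wasserstein distances."""
    pairs = zip(zip(*real), zip(*synth))
    return fmean([wasserstein_1d(r, s) for r, s in pairs])


# Stand-in "embeddings": 200 vectors of dimension 4 per set (made-up data).
random.seed(0)
real_feats = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
synth_feats = [[random.gauss(0.3, 1.2) for _ in range(4)] for _ in range(200)]

print("Frechet (diag):", frechet_distance_diag(real_feats, synth_feats))
print("KL (diag):", kl_divergence_diag(real_feats, synth_feats))
print("Wasserstein (mean 1-D):", mean_wasserstein(real_feats, synth_feats))
```

Lower values indicate that the synthetic features sit closer to the real ones under each metric. Dropping the diagonal assumption would require a matrix square root of the covariance product (for example `scipy.linalg.sqrtm`), which is omitted here to keep the sketch dependency-free.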