Generalising electrocardiogram detection and delineation: training convolutional neural networks with synthetic data augmentation

Introduction: Extracting beat-by-beat information from electrocardiograms (ECGs) is crucial for various downstream diagnostic tasks that rely on ECG-based measurements. However, these measurements can be expensive and time-consuming to produce, especially for long-term recordings. Traditional ECG de...

Bibliographic Details
Authors: Jimenez-Perez, Guillermo; Acosta, Juan Carlos; Alcaine, Alejandro; Camara, Oscar
Resource type: article
Status: Published version
Publication date: 2024
Country: Spain
Institution: Universitat Pompeu Fabra
Repository: Repositorio Digital de la UPF
OAI Identifier: oai:repositori.upf.edu:10230/71597
Online access: http://hdl.handle.net/10230/71597
http://dx.doi.org/10.3389/fcvm.2024.1341786
Access level: open access
Keywords: Digital health
Electrocardiogram
Convolutional neural network
Artificial intelligence
Delineation
Multi-centre study
Data augmentation
Segmentation
id ES_10b83bde17afe1bc7aaeb8ea5df5dc7a
oai_identifier_str oai:repositori.upf.edu:10230/71597
network_acronym_str ES
network_name_str Spain
repository_id_str
dc.title.none.fl_str_mv Generalising electrocardiogram detection and delineation: training convolutional neural networks with synthetic data augmentation
dc.creator.none.fl_str_mv Jimenez-Perez, Guillermo
Acosta, Juan Carlos
Alcaine, Alejandro
Camara, Oscar
dc.subject.none.fl_str_mv Digital health
Electrocardiogram
Convolutional neural network
Artificial intelligence
Delineation
Multi-centre study
Data augmentation
Segmentation
description Introduction: Extracting beat-by-beat information from electrocardiograms (ECGs) is crucial for various downstream diagnostic tasks that rely on ECG-based measurements. However, these measurements can be expensive and time-consuming to produce, especially for long-term recordings. Traditional ECG detection and delineation methods, relying on classical signal processing algorithms such as those based on wavelet transforms, produce high-quality delineations but struggle to generalise to diverse ECG patterns. Machine learning (ML) techniques based on deep learning algorithms have emerged as promising alternatives, capable of achieving similar performance without handcrafted features or thresholds. However, supervised ML techniques require large annotated datasets for training, and existing datasets for ECG detection/delineation are limited in size and in the range of pathological conditions they represent.
Methods: This article addresses this challenge by introducing two key innovations. First, we develop a synthetic data generation scheme that probabilistically constructs unseen ECG traces from “pools” of fundamental segments extracted from existing databases. A set of rules guides the arrangement of these segments into coherent synthetic traces, while expert domain knowledge ensures the realism of the generated traces, increasing the input variability for training the model. Second, we propose two novel segmentation-based loss functions that encourage the accurate prediction of the number of independent ECG structures and promote tighter segmentation boundaries by focusing on a reduced number of samples.
Results: The proposed approach achieves remarkable performance, with an F1-score of 99.38% and delineation errors of 2.19 ± 17.73 ms and 4.45 ± 18.32 ms for ECG segment onsets and offsets, respectively, across the P, QRS, and T waves. These results, aggregated from three diverse freely available databases (QT, LU, and Zhejiang), surpass current state-of-the-art detection and delineation approaches.
Discussion: Notably, the model demonstrated exceptional performance despite variations in lead configurations, sampling frequencies, and represented pathophysiology mechanisms, underscoring its robust generalisation capabilities. Real-world examples, featuring clinical data with various pathologies, illustrate the potential of our approach to streamline ECG analysis across different medical settings, fostered by releasing the code as open source.
publishDate 2024
dc.date.none.fl_str_mv 2024
2025
2025
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/10230/71597
http://dx.doi.org/10.3389/fcvm.2024.1341786
url http://hdl.handle.net/10230/71597
http://dx.doi.org/10.3389/fcvm.2024.1341786
dc.language.none.fl_str_mv English
language_invalid_str_mv English
dc.relation.none.fl_str_mv Frontiers in Cardiovascular Medicine. 2024 Jul 19;11:1341786
info:eu-repo/grantAgreement/ES/3PE/PID2022-139143OA-I00
dc.rights.none.fl_str_mv http://creativecommons.org/licenses/by/4.0/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by/4.0/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.publisher.none.fl_str_mv Frontiers
publisher.none.fl_str_mv Frontiers
dc.source.none.fl_str_mv reponame:Repositorio Digital de la UPF
instname:Universitat Pompeu Fabra
instname_str Universitat Pompeu Fabra
reponame_str Repositorio Digital de la UPF
collection Repositorio Digital de la UPF
repository.name.fl_str_mv
repository.mail.fl_str_mv
_version_ 1869403525409144832
spelling The authors declare financial support was received for the research, authorship, and/or publication of this article. This research was supported by the Secretariat for Universities and Research of the Government of Catalonia (2017 FI_B 01008). This work was partially funded by Departamento de Ciencia, Universidad y Sociedad del Conocimiento, from the Gobierno de Aragón (Spain) (Research Group T71_23D) and by project PID2022-139143OA-I00 funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU. The GPU was donated by the NVIDIA Corporation.
© 2024 Jimenez-Perez, Acosta, Alcaine and Camara. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
2026-06-12T07:21:37Z