CCLR-DL: A novel statistics and deep learning hybrid method for feature selection and forecasting healthcare demand

Background and Objective: Hybrid forecasting methods aim to overcome the limitations of classical statistical approaches and deep learning models. While statistical methods provide interpretability, they often lack predictive power. Conversely, deep learning models achieve high accuracy but act as “...


Bibliographic Details
Authors: Hernández Guillamet, Guillem; López Seguí, Francesc; Vidal-Alaball, Josep; López Ibáñez, Beatriz
Resource type: article
Status: published version
Publication date: 2025
Country: Spain
Institution: Various* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
Repository: Recercat. Dipósit de la Recerca de Catalunya
OAI identifier: oai:recercat.cat:10256/27320
Online access: http://hdl.handle.net/10256/27320
Access level: open access
Keywords: Aprenentatge profund
Anàlisi multivariable
Multivariate analysis
Deep learning
Funding: This work was conducted with the support of the Secretary of Universities and Research of the Department of Business and Knowledge at the Generalitat de Catalunya 2021 SGR 01125, and funded by the Industrial Doctorate Plan 2021 DI 106, provided by the Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR). Open Access funding provided thanks to the CRUE-CSIC agreement with Elsevier.
Description:

Background and Objective: Hybrid forecasting methods aim to overcome the limitations of classical statistical approaches and deep learning models. While statistical methods provide interpretability, they often lack predictive power. Conversely, deep learning models achieve high accuracy but act as “black boxes.” This study introduces the Comprehensive Cross-Correlation and Lagged Linear Regression Deep Learning (CCLR-DL) framework, which combines statistical and deep learning techniques to enhance both forecasting accuracy and interpretability. Unlike existing hybrid methods that combine statistical filtering with deep learning, CCLR-DL integrates causal statistical selection with neural forecasting, producing interpretable predictors and consistently achieving higher accuracy than models without feature selection or other standard baselines.

Methods: The CCLR-DL framework integrates cross-correlation analysis, lagged multiple linear regression, and Granger causality testing with advanced deep learning architectures. This dual-phase approach first identifies causally significant predictors and then feeds them into a deep learning model for multivariate time series forecasting. The framework was validated using a real-world dataset of clinical visits and diagnoses from 6.3 million individuals collected over 10 years.

Results: In the evaluated setting, the CCLR-DL framework outperformed baseline models, achieving an average Root Mean Square Error (RMSE) improvement of 19.8% over univariate models, 60.1% over no feature selection, and 51.9% over random selection. The causality phase ensured that all selected predictors demonstrated a significant Granger-causal (GC) relationship. Simpler recurrent architectures, particularly bidirectional Long Short-Term Memory units (BiLSTM), yielded the most accurate forecasts by effectively capturing nonlinear temporal dependencies.

Conclusions: By addressing the challenges of both prediction accuracy and model transparency, the CCLR-DL framework offers a new approach for high-dimensional, multivariate time series forecasting. In healthcare settings, it may enable decision-makers to anticipate demand shifts with greater reliability, allowing earlier staff scheduling, more efficient resource allocation, and reduced waiting times. In the authors' evaluation, it consistently outperformed baseline strategies, delivering measurable improvements that translate into thousands of patient visits being forecast more accurately across large populations.
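The description names cross-correlation analysis as the first screening step but does not give its details. The following is only a minimal illustrative sketch of lagged cross-correlation screening on synthetic data, not the authors' implementation: the function names (`pearson`, `best_lag`, `screen`), the maximum lag of 6, and the 0.5 correlation threshold are all assumptions made for the example. The lagged regression and Granger causality steps of the framework are not reproduced here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def best_lag(predictor, target, max_lag):
    """Return (lag, r) with the largest |corr(predictor[t - lag], target[t])|.

    Lags 1..max_lag are tried; (0, 0.0) is returned if every correlation is zero.
    """
    best = (0, 0.0)
    for lag in range(1, max_lag + 1):
        r = pearson(predictor[:-lag], target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def screen(candidates, target, max_lag=6, threshold=0.5):
    """Keep candidates whose best lagged correlation reaches the (assumed) threshold."""
    selected = {}
    for name, series in candidates.items():
        lag, r = best_lag(series, target, max_lag)
        if abs(r) >= threshold:
            selected[name] = (lag, r)
    return selected


# Synthetic demo: the target follows "driver" with a delay of 3 steps.
random.seed(0)
driver = [random.gauss(0, 1) for _ in range(200)]
noise = [random.gauss(0, 1) for _ in range(200)]
target = [0.0] * 3 + [d + 0.1 * random.gauss(0, 1) for d in driver[:-3]]

selected = screen({"driver": driver, "noise": noise}, target)
for name, (lag, r) in selected.items():
    print(f"{name}: lag={lag}, r={r:.3f}")
```

In a pipeline like the one described, the predictors and lags that survive such a screen would then go on to the regression and causality tests before reaching the forecasting model. A fixed correlation threshold is the simplest possible selection rule and is used here only to keep the example short.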
Peer review: peer-reviewed
Language: English
DOI: 10.1016/j.cmpb.2025.109057
ISSN: 0169-2607
eISSN: 1872-7565
Rights: Attribution-NonCommercial 4.0 International (http://creativecommons.org/licenses/by-nc/4.0/)
Format: application/pdf
Publisher: Elsevier
Source: Computer Methods and Programs in Biomedicine, 2025, vol. 272, art. no. 109057
Collection: Articles publicats (D-EEEiA)