CauRuler: Causal irredundant association rule miner for complex patient trajectory modelling

Background and Objectives. Discovering causal associations between variables is one of the main goals of clinical trials, with the ultimate aim of identifying the causes of specific health status. Prior knowledge of causal paths could help ensure patients do not develop the resultant conditions. In...

Bibliographic Details
Authors: Hernández Guillamet, Guillem; López Seguí, Francesc; Vidal-Alaball, Josep; López Ibáñez, Beatriz
Resource type: article
Status: Published version
Publication date: 2023
Country: Spain
Institution: Various* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
Repository: Recercat. Dipósit de la Recerca de Catalunya
OAI Identifier: oai:recercat.cat:10256/22711
Online access: http://hdl.handle.net/10256/22711
Access level: open access
Keywords: Mineria de dades
Data mining
Salut -- Bases de dades
Health -- Databases
id ES_14ddd4873fbba6fbb2cd16fca287add1
oai_identifier_str oai:recercat.cat:10256/22711
network_acronym_str ES
network_name_str España
repository_id_str
dc.title.none.fl_str_mv CauRuler: Causal irredundant association rule miner for complex patient trajectory modelling
dc.creator.none.fl_str_mv Hernández Guillamet, Guillem
López Seguí, Francesc
Vidal-Alaball, Josep
López Ibáñez, Beatriz
dc.subject.none.fl_str_mv Mineria de dades
Data mining
Salut -- Bases de dades
Health -- Databases
description Background and Objectives: Discovering causal associations between variables is one of the main goals of clinical trials, with the ultimate aim of identifying the causes of a specific health status. Prior knowledge of causal paths could help ensure patients do not develop the resultant conditions. In recent years, thanks to the enormous amount of health data stored with the support of digital tools, attempts have been made to employ Machine Learning to infer causality. Those methodologies suffer from deficiencies in controlling confounders when analysing causality, as well as in providing causal rules general enough to be useful in healthcare practice. In contrast, this work presents and evaluates CauRuler, a new approach to deal with causality from association rules. The proposed approach uses a pruning strategy to reduce the association rule set without compromising the causality learning capability of the algorithm. This behaviour makes the algorithm suitable for exploiting large health databases with thousands of patients and medical instances. CauRuler can control a larger number of confounders than other proposals, bringing robustness to causal analysis and avoiding the identification of spurious associations. Additionally, the method generalizes causality using anti-monotone properties to obtain complex and general causal paths. The method can target correct causal associations in complex medical databases with retrospective data.
Method: CauRuler extends association rule mining with an irredundancy property so that the set of rules learnt is reduced in size and generalized. General association rules, composed of fewer items, make it possible to control more confounding variables and so to verify, with more statistical evidence from the available data, whether they represent causal paths in patient disease trajectories.
Results: CauRuler has been tested on a complex real medical database (3.5 M visits to primary care services between 2019 and 2020, controlling over 15,000 different variables including diagnoses and demographic and other clinical patient data). The pruning strategy reduces the rule set from 7,732 to 2,240 rules, of which 46 have been found to have causality relationships in the patient trajectories; these were generalized to 14 rules tested as true causal relationships thanks to the confounding analysis. These rules have been validated by clinicians with the support of a graphical map. The obtained causal paths control an average of 906 confounder variables, yielding robust results.
Conclusions: Causal relationships enable predicting causal paths between health conditions according to patient trajectories. Knowing these causal paths is crucial for understanding and preventing the appearance or worsening of diseases in patients. CauRuler, with highly demanding thresholds, has proven its efficiency and effectiveness in targeting previously known causal associations between diagnoses, reaching consensus in the medical community. Softening these thresholds should help target interesting general causal paths.
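The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the redundancy criterion used here (a rule is dropped when a rule with a strictly smaller antecedent reaches at least the same confidence) is one common definition chosen for illustration, and the function names, thresholds, and toy patient records are all hypothetical. The confounder analysis that CauRuler performs afterwards is not shown.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def mine_rules(transactions, target, min_support=0.3, min_confidence=0.6, max_len=2):
    """Enumerate rules X -> target with |X| <= max_len that meet both thresholds."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions) - {target})
    rules = {}
    for k in range(1, max_len + 1):
        for antecedent in combinations(items, k):
            x = frozenset(antecedent)
            sup_x = support(x, transactions)
            sup_rule = support(x | {target}, transactions)
            if sup_x == 0 or sup_rule < min_support:
                continue
            conf = sup_rule / sup_x
            if conf >= min_confidence:
                rules[x] = (sup_rule, conf)
    return rules

def prune_redundant(rules):
    """Assumed criterion: drop X -> target when some proper subset of X
    already predicts the target with confidence at least as high."""
    kept = {}
    for x, (sup, conf) in rules.items():
        dominated = any(g < x and rules[g][1] >= conf for g in rules)
        if not dominated:
            kept[x] = (sup, conf)
    return kept

# Hypothetical toy records: each set lists conditions seen for one patient.
records = [
    {"obesity", "hypertension", "diabetes"},
    {"obesity", "diabetes"},
    {"obesity", "hypertension", "diabetes"},
    {"hypertension"},
    {"obesity", "smoker", "diabetes"},
    {"smoker"},
]

rules = mine_rules(records, target="diabetes")
kept = prune_redundant(rules)
for antecedent, (sup, conf) in sorted(kept.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(antecedent), "-> diabetes", f"support={sup:.2f}", f"confidence={conf:.2f}")
```

In this toy data the two-item rule {hypertension, obesity} -> diabetes is pruned because {obesity} -> diabetes is more general and equally confident. Shorter antecedents leave more of the remaining variables free to be treated as candidate confounders, which is the motivation the abstract gives for generalizing rules.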
publishDate 2023
dc.date.none.fl_str_mv 2023
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
peer-reviewed
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/10256/22711
http://hdl.handle.net/10256/22711
url http://hdl.handle.net/10256/22711
dc.language.none.fl_str_mv English
language_invalid_str_mv English
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.1016/j.compbiomed.2023.106636
info:eu-repo/semantics/altIdentifier/issn/0010-4825
info:eu-repo/semantics/altIdentifier/eissn/1879-0534
dc.rights.none.fl_str_mv Attribution-NonCommercial-NoDerivatives 4.0 International
http://creativecommons.org/licenses/by-nc-nd/4.0/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivatives 4.0 International
http://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Elsevier
publisher.none.fl_str_mv Elsevier
dc.source.none.fl_str_mv Computers in Biology and Medicine, 2023, vol. 155, art. no. 106636
Articles publicats (D-EEEiA)
reponame:Recercat. Dipósit de la Recerca de Catalunya
instname:Various* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
instname_str Various* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
reponame_str Recercat. Dipósit de la Recerca de Catalunya
collection Recercat. Dipósit de la Recerca de Catalunya
repository.name.fl_str_mv
repository.mail.fl_str_mv
_version_ 1869403760721133568
spelling This study was conducted with the support of the Secretary of Universities and Research of the Department of Business and Knowledge at the Generalitat de Catalunya, Spain (SGR 01125).
Open Access funding provided thanks to the CRUE-CSIC agreement with Elsevier.