Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models

[EN] This paper introduces the Group Linear Algorithm with Sparse Principal decomposition, an algorithm for supervised variable selection and clustering. Our approach extends the Sparse Group Lasso regularization to calculate clusters as part of the model fit. Therefore, unlike Sparse Group Lasso, o...


Bibliographic Details
Authors: Laria, Juan C.; Lillo, Rosa E.; Aguilera-Morillo, M. Carmen (ORCID: 0000-0003-1027-9773)
Resource type: article
Publication date: 2022
Country: Spain
Institution: Universitat Politècnica de València (UPV)
Repository: RiuNet. Repositorio Institucional de la Universitat Politècnica de València
Language: English
OAI Identifier: oai:riunet.upv.es:10251/197838
Online access: https://riunet.upv.es/handle/10251/197838
Access level: open access
Keywords: Regression
Classification
Feature clustering
Statistical computing
STATISTICS AND OPERATIONS RESEARCH
id ES_956c95cf8d25349abf22c1eee6c00a32
dc.title.none.fl_str_mv Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models
dc.creator.none.fl_str_mv Laria, Juan C.
Lillo, Rosa E.
Aguilera-Morillo, M. Carmen (ORCID: 0000-0003-1027-9773)
dc.contributor.none.fl_str_mv Departamento de Estadística e Investigación Operativa Aplicadas y Calidad
Escuela Técnica Superior de Ingeniería Industrial
Grupo de Ingeniería Estadística Multivariante GIEM
Agencia Estatal de Investigación
Repositorio Institucional de la Universitat Politècnica de València Riunet
dc.subject.none.fl_str_mv Regression
Classification
Feature clustering
Statistical computing
STATISTICS AND OPERATIONS RESEARCH
description [EN] This paper introduces the Group Linear Algorithm with Sparse Principal decomposition, an algorithm for supervised variable selection and clustering. Our approach extends the Sparse Group Lasso regularization to calculate clusters as part of the model fit. Therefore, unlike Sparse Group Lasso, our idea does not require prior specification of clusters between variables. To determine the clusters, we solve a particular case of sparse Singular Value Decomposition, with a regularization term that follows naturally from the Group Lasso penalty. Moreover, this paper proposes a unified implementation to deal with, but not limited to, linear regression, logistic regression, and proportional hazards models with right-censoring. Our methodology is evaluated using both biological and simulated data, and details of the implementation in R and hyperparameter search are discussed.
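For orientation, the Sparse Group Lasso penalty that the description says the method extends can be sketched as below. This is a minimal sketch of the standard penalty with groups fixed in advance, not the authors' algorithm, which estimates the clusters during the fit. The function name and the parameters `lam` and `alpha` are hypothetical names chosen for this illustration.

```python
import numpy as np


def sparse_group_lasso_penalty(beta, groups, lam, alpha):
    """Standard Sparse Group Lasso penalty with pre-specified groups.

    Computes lam * (alpha * ||beta||_1
                    + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2),
    where p_g is the number of coefficients in group g.
    All names here are illustrative assumptions, not the paper's API.
    """
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)

    # Lasso part: sum of absolute values of all coefficients.
    l1_term = np.abs(beta).sum()

    # Group Lasso part: size-weighted Euclidean norm of each group's block.
    group_term = 0.0
    for g in np.unique(groups):
        block = beta[groups == g]
        group_term += np.sqrt(block.size) * np.linalg.norm(block)

    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)


# Example: four coefficients split into two groups of two.
penalty = sparse_group_lasso_penalty(
    beta=[3.0, 4.0, 0.0, 0.0], groups=[0, 0, 1, 1], lam=1.0, alpha=0.5
)
```

Setting `alpha` to 1 reduces the penalty to the plain Lasso, and setting it to 0 gives the Group Lasso; the `groups` argument is exactly the prior specification that the described method is said to avoid.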
publishDate 2022
dc.date.none.fl_str_mv 2022
2022-02-01
dc.type.none.fl_str_mv journal article
http://purl.org/coar/resource_type/c_6501
VoR
http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.openaire.fl_str_mv info:eu-repo/semantics/article
format article
dc.identifier.none.fl_str_mv https://riunet.upv.es/handle/10251/197838
url https://riunet.upv.es/handle/10251/197838
dc.language.none.fl_str_mv English
eng
language eng
dc.relation.none.fl_str_mv Agencia Estatal de Investigación http://dx.doi.org/10.13039/501100011033 Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020 PID2019-104901RB-I00 NUEVAS ESTRATEGIAS EN REGRESION PENALIZADA CON APLICACIONES EN SALUD, DEMOGRAFIA Y ECONOMIA [New strategies in penalized regression with applications in health, demography and economics]
dc.rights.none.fl_str_mv open access
http://purl.org/coar/access_right/c_abf2
All rights reserved
http://rightsstatements.org/vocab/InC/1.0/
dc.rights.openaire.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.publisher.none.fl_str_mv Springer-Verlag
dc.source.none.fl_str_mv reponame:RiuNet. Repositorio Institucional de la Universitat Politècnica de València
instname:Universitat Politècnica de València (UPV)