Efficient deep ensembles by averaging neural networks in parameter space


Bibliographic Details
Author: Norris Mitchell, Philip
Resource type: master's thesis
Publication date: 2021
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/356936
Online access: https://hdl.handle.net/2117/356936
Access level: open access
Keywords: Artificial intelligence
Ensemble learning
Deep ensembles
Knowledge distillation
Permutation learning
Intel·ligència artificial
AMS classification::68 Computer science::68T Artificial intelligence
UPC subject areas::Computer science::Artificial intelligence
Description
Summary: Although deep ensembles provide large accuracy gains over individual models, they are rarely used in environments where computational resources are limited, because they require storing M models and performing M forward passes at prediction time. We propose a novel, computationally efficient alternative, which we name permAVG. Deep ensembles cannot simply be averaged in parameter space, because the individual models converge to distinct, possibly distant, local optima. permAVG exploits the symmetries of the loss landscape by learning permutations such that all M models can be permuted into the same local optimum and can thereafter safely be averaged.
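
The permutation symmetry mentioned in the summary can be illustrated with a toy example. The sketch below is not the thesis's permAVG procedure, which learns the permutations; here the permutation is known in advance, and the tiny two-layer network, its weights, and all function names are hypothetical choices made for illustration. It shows that reordering hidden units leaves the network's output unchanged, that naively averaging two such equivalent networks changes the output, and that undoing the permutation before averaging preserves it.

```python
import math

def forward(params, x):
    """Two-layer MLP (illustrative): y = W2 @ tanh(W1 @ x + b1)."""
    W1, b1, W2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]

def permute_hidden(params, perm):
    """Reorder hidden units so that new unit i is old unit perm[i]."""
    W1, b1, W2 = params
    return ([W1[j] for j in perm],                    # rows of W1
            [b1[j] for j in perm],                    # entries of b1
            [[row[j] for j in perm] for row in W2])   # columns of W2

def invert(perm):
    """Inverse permutation: invert(perm)[perm[i]] == i."""
    inv = [0] * len(perm)
    for i, j in enumerate(perm):
        inv[j] = i
    return inv

def average(p, q):
    """Element-wise average of two parameter sets of the same shape."""
    W1 = [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(p[0], q[0])]
    b1 = [(a + b) / 2 for a, b in zip(p[1], q[1])]
    W2 = [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(p[2], q[2])]
    return (W1, b1, W2)

# Hand-picked weights (assumption): 1 input, 2 hidden units, 1 output.
model_a = ([[1.0], [-2.0]], [0.0, 0.5], [[1.0, 3.0]])
perm = [1, 0]                              # swap the two hidden units
model_b = permute_hidden(model_a, perm)    # same function, different parameters

x = [1.0]
y_a = forward(model_a, x)
y_b = forward(model_b, x)

# Naive parameter averaging mixes unrelated hidden units.
y_naive = forward(average(model_a, model_b), x)

# Undo the permutation first, then average: the function is preserved.
aligned_b = permute_hidden(model_b, invert(perm))
y_aligned = forward(average(model_a, aligned_b), x)

print("model A:         ", y_a)
print("permuted copy B: ", y_b)
print("naive average:   ", y_naive)
print("aligned average: ", y_aligned)
```

In this toy case both networks compute exactly the same function, so the aligned average simply recovers it. For independently trained ensemble members the aligning permutation is not known, which is the part the summary says permAVG learns.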