Deep ensemble-based hard sample mining for food recognition

Deep neural networks represent a compelling technique to tackle complex real-world problems, but are over-parameterized and often suffer from over- or under-confident estimates. Deep ensembles have shown better parameter estimations and often provide reliable uncertainty estimates that contribute to...

Descripción completa

Detalles Bibliográficos
Autores: Nagarajan, Bhalaji, Bolaños Solà, Marc, Aguilar Torres, Eduardo, Radeva, Petia
Tipo de recurso: artículo
Estado:Versión publicada
Fecha de publicación:2023
País:España
Institución:Varias* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
Repositorio:Recercat. Dipósit de la Recerca de Catalunya
OAI Identifier:oai:recercat.cat:2445/219021
Acceso en línea:https://hdl.handle.net/2445/219021
Access Level:acceso abierto
Palabra clave:Aprenentatge automàtic
Reconeixement de formes (Informàtica)
Visió per ordinador
Machine learning
Pattern recognition systems
Computer vision
Descripción
Sumario:Deep neural networks represent a compelling technique to tackle complex real-world problems, but are over-parameterized and often suffer from over- or under-confident estimates. Deep ensembles have shown better parameter estimations and often provide reliable uncertainty estimates that contribute to the robustness of the results. In this work, we propose a new metric to identify samples that are hard to classify. Our metric is defined as coincidence score for deep ensembles which measures the agreement of its individual models. The main hypothesis we rely on is that deep learning algorithms learn the low-loss samples better compared to large-loss samples. In order to compensate for this, we use controlled over-sampling on the identified ”hard” samples using proper data augmentation schemes to enable the models to learn those samples better. We validate the proposed metric using two public food datasets on different backbone architectures and show the improvements compared to the conventional deep neural network training using different performance metrics.