Optimising Convolutional Neural Network Architectures for Fin Whale Pulse Detection in Spectrograms


Bibliographic Details
Authors: Román Ruiz, Marta; Rossi, Claudio
Resource type: article
Status: Published version
Publication date: 2026
Country: Spain
Institution: Consejo Superior de Investigaciones Científicas (CSIC)
Repository: DIGITAL.CSIC. Repositorio Institucional del CSIC
OAI Identifier: oai:dnet:digitalcsic_::c57e35e43eace3ae96caa545fa4f6214
Online access: http://hdl.handle.net/10261/429805
Access level: open access
Keywords: layer analysis; model optimisation; bioacoustics; spectrogram classification
Description
Summary: Deep neural networks are widely used for image classification across many fields, although selecting an appropriate architecture often remains a trial-and-error process. This work investigates a convolutional neural network architecture used to detect fin whale pulses in spectrograms, in order to better understand the causes of its underperformance. By examining the behaviour of its internal layers, we show that the early convolutional blocks capture the most informative acoustic features, while deeper layers provide limited additional benefit and, under the considered training conditions, may even degrade classification accuracy. Based on these observations, we derive a simplified architecture consisting of only the first two convolutional layers followed by a lightweight classifier. This network achieves near-optimal performance, improving accuracy from 87% to 98%, and exhibits substantially lower variability between repetitions than the original model.
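
The simplification described in the summary (keep the first two convolutional layers, then attach a lightweight classifier) can be illustrated with a rough parameter count. The sketch below is not the authors' model: the record gives no layer sizes, so the four channel widths, the 3x3 kernels, the single-channel input, the two output classes, and the choice of global average pooling plus one linear layer as the "lightweight classifier" are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvBlock:
    """One convolutional block (activation and pooling add no parameters)."""
    out_channels: int
    kernel_size: int = 3  # assumed square kernel


def conv_params(in_channels: int, block: ConvBlock) -> int:
    """Weights plus biases of a single 2-D convolution."""
    k = block.kernel_size
    return in_channels * block.out_channels * k * k + block.out_channels


def truncated_model_params(blocks, keep, in_channels=1, n_classes=2):
    """Parameter count when only the first `keep` blocks are retained,
    followed by global average pooling and one linear layer (assumed head)."""
    total = 0
    channels = in_channels
    for block in blocks[:keep]:
        total += conv_params(channels, block)
        channels = block.out_channels
    # Global average pooling is parameter-free, so the head is a single
    # linear layer mapping `channels` features to `n_classes` scores.
    total += channels * n_classes + n_classes
    return total


# HYPOTHETICAL channel widths, not taken from the paper.
ORIGINAL_BLOCKS = [ConvBlock(16), ConvBlock(32), ConvBlock(64), ConvBlock(128)]

full = truncated_model_params(ORIGINAL_BLOCKS, keep=4)
simplified = truncated_model_params(ORIGINAL_BLOCKS, keep=2)
print(f"all blocks: {full} parameters, first two blocks: {simplified} parameters")
```

Under these assumed widths, most of the parameters sit in the deeper blocks, so truncating after the second block leaves a much smaller network. Whether that smaller network also classifies better, as the summary reports, depends on the data and training conditions and cannot be read off a parameter count.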