Learning to exploit the prior network knowledge for weakly supervised semantic segmentation

Redondo Cabrera, Carolina; Baptista Ríos, Marcos; López Sastre, Roberto Javier|||0000-0002-2477-0152

Learning to exploit the prior network knowledge for weakly supervised semantic segmentation

Training a convolutional neural network for semantic segmentation typically requires collecting a large amount of accurate pixel-level annotations and is a hard and expensive task. In contrast, simple image tags are easier to gather. In this paper, we introduce a novel weakly supervised semantic seg...

Descripción completa

Detalles Bibliográficos
Autores:	Redondo Cabrera, Carolina, Baptista Ríos, Marcos, López Sastre, Roberto Javier\|\|\|0000-0002-2477-0152
Tipo de recurso:	artículo
Fecha de publicación:	2019
País:	España
Institución:	Universidad de Alcalá (UAH)
Repositorio:	e_Buah Biblioteca Digital Universidad de Alcalá
Idioma:	inglés
OAI Identifier:	oai:ebuah.uah.es:10017/63249
Acceso en línea:	http://hdl.handle.net/10017/63249 https://dx.doi.org/10.1109/TIP.2019.2901393
Access Level:	acceso abierto
Palabra clave:	Semantic segmentation Weakly supervised Deep learning Telecomunicaciones Telecommunication

Descripción
Sumario:	Training a convolutional neural network for semantic segmentation typically requires collecting a large amount of accurate pixel-level annotations and is a hard and expensive task. In contrast, simple image tags are easier to gather. In this paper, we introduce a novel weakly supervised semantic segmentation model which is able to learn from image labels and just image labels. Our model uses the prior knowledge of a network trained for image recognition, employing these image annotations as an attention mechanism to identify semantic regions in the images. We then present a methodology that builds accurate class-specific segmentation masks from these regions, where neither external objectness nor saliency algorithms are required. We describe how to incorporate this mask generation strategy into a fully end-to-end trainable process, where the network jointly learns to classify and segment images. Our experiments on PASCAL VOC 2012 dataset show that exploiting these generated class-specific masks in conjunction with our novel end-to-end learning process outperforms several recent weakly supervised semantic segmentation methods that use image tags only, and even some models that leverage additional supervision or training data.

Learning to exploit the prior network knowledge for weakly supervised semantic segmentation

Similares en LA Referencia