Motion-region annotation for complex videos via label propagation across occluders

Bibliographic Details
Authors: Muhammad Habib, Mahmood; Diez, Yago; Oliver i Malagelada, Arnau; Salvi, Joaquim; Lladó Bardera, Xavier
Resource type: article
Status: Published version
Publication date: 2022
Country: Spain
Institution: Various* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
Repository: Recercat. Dipósit de la Recerca de Catalunya
OAI Identifier: oai:recercat.cat:10256/22118
Online access: http://hdl.handle.net/10256/22118
Access level: open access
Keywords: Algorismes computacionals
Computer algorithms
Imatges -- Segmentació
Image segmentation
Imatges -- Moviment
Image motion
Visió artificial (Robòtica)
Robot vision
Description
Summary: Motion cues are pivotal in moving object analysis, which underlies motion segmentation and detection. These preprocessing tasks are building blocks for several applications such as recognition, matching and estimation. To devise a robust algorithm for motion analysis, it is imperative to have a comprehensive dataset to evaluate an algorithm’s performance. The main limitation in making this kind of dataset is the creation of ground-truth motion annotation, as each moving object might span multiple frames with changes in size, illumination and angle of view. Besides these optical changes, the object can be occluded by static or moving occluders. The challenge increases when the video is captured by a moving camera. In this paper, we tackle the task of providing ground-truth annotation of motion regions in videos captured from a moving camera. With minimal manual annotation of an object mask, we are able to propagate the label mask to all the frames. Object labels are also corrected for static and moving occluders by applying occluder mask tracking for a given depth ordering. A motion annotation dataset is also proposed to evaluate algorithm performance. The results show that our cascaded-naive approach provides successful results. All the resources of the annotation tool are publicly available at http://dixie.udg.edu/anntool/