Vanishing Mask Refinement in Semi-Supervised Video Segmentation
| Authors: | , , , , |
|---|---|
| Format: | article |
| Publication date: | 2024 |
| Country: | Spain |
| Resources: | Universidad de Alcalá (UAH) |
| Repository: | e_Buah Biblioteca Digital Universidad de Alcalá |
| Language: | English |
| OAI Identifier: | oai:ebuah.uah.es:10017/64659 |
| Online access: | http://hdl.handle.net/10017/64659 https://dx.doi.org/10.2139/ssrn.4876026 |
| Access level: | open access |
| Keywords: | Video Object Segmentation; Long-Term Videos; Deep Learning; Computer science |
| Abstract: | This paper presents a novel architecture, Video Object Segmentation Enhanced with Segment Anything Model, which aims to improve Semi-supervised Video Object Segmentation models by refining each output object mask with a foundation model. Video Object Segmentation is a major focus of computer vision, and its main challenges are changes in object appearance, occlusions, camera movement, and perspective changes. This study explores the different inputs accepted by the Segment Anything Model and uses extensive testing to establish the optimal configuration for the proposed model. Results on established video segmentation datasets show that the proposal improves the mask outputs of the base model on single-object, multi-object, and long-video datasets, and lays the groundwork for future work on combining these two architectures. |
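
The abstract states that each mask produced by the base model is refined with the Segment Anything Model and that several of its input types were tested. The Segment Anything Model accepts point, box, and mask prompts; the record does not say which configuration the authors selected. As a minimal sketch only, the following dependency-free Python shows one way a coarse binary mask could be turned into a box prompt and a point prompt. The function name `mask_to_prompts` and the choice of the foreground pixel nearest the centroid as the point are illustrative assumptions, not the paper's method.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]           # rows of 0/1 values, indexed as mask[y][x]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive
Point = Tuple[int, int]          # (x, y)


def mask_to_prompts(mask: Mask) -> Optional[Tuple[Box, Point]]:
    """Derive a box prompt and a point prompt from a binary mask.

    Hypothetical helper for illustration: returns None when the mask is
    empty (for example, when the object is absent from the frame).
    """
    # Collect the coordinates of every foreground pixel.
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pixels:
        return None

    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    box = (min(xs), min(ys), max(xs), max(ys))

    # The centroid of a non-convex mask can fall on background, so snap it
    # to the nearest foreground pixel to keep the point prompt on the object.
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    point = min(pixels, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    return box, point


if __name__ == "__main__":
    coarse = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
    ]
    print(mask_to_prompts(coarse))
```

In a refinement pipeline of the kind the abstract describes, prompts such as these would be computed per object and per frame from the base model's output and passed to the foundation model, whose returned mask replaces or corrects the coarse one.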