Advancing Egocentric Action Recognition for Perceptually-enabled Task Guidance

Bibliographic Details
Author: Manzano Rodríguez, Ana
Resource type: master's thesis
Publication date: 2023
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/420290
Online access: https://hdl.handle.net/2117/420290
Access level: open access
Keywords: Computer vision
Augmented reality
action recognition
computer vision
reconeixement d'accions
Visió per ordinador
Realitat augmentada
Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial
Description
Summary: This project aims to create a task guidance assistant that uses the HoloLens headset to guide the user through augmented reality. The primary focus of this thesis is the integration of an action recognition framework for egocentric videos, which is crucial for task prediction within the system. Development starts in a kitchen environment, with the intention of using transfer learning for military scenarios in the future. Epic-Kitchens serves as the initial reference dataset, followed by the creation of a customized dataset. Various state-of-the-art action recognition models are considered, with Omnivore being the final choice. Initial results show 14.23% Top-5 action recognition accuracy on the created dataset. Through classifier modifications and the application of diverse video post-processing techniques, this accuracy improves to 83.76%.
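
For readers unfamiliar with the metric quoted in the summary, the following is a minimal sketch of how a Top-5 (more generally, Top-k) accuracy can be computed. It is not the thesis's evaluation code: the function name `top_k_accuracy`, the toy scores, and the toy labels are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of clips whose true label is among the k highest-scoring classes.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one clip is required")

    hits = 0
    for clip_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score for this clip.
        ranked = sorted(
            range(len(clip_scores)),
            key=lambda c: clip_scores[c],
            reverse=True,
        )
        # A clip counts as correct if its true label is in the first k entries.
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data (assumed values): three clips, three action classes.
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_labels = [2, 2, 0]

top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
top3 = top_k_accuracy(toy_scores, toy_labels, k=3)
```

With k equal to the number of classes, every clip counts as correct, which is why Top-k figures are only meaningful when k is small relative to the size of the action vocabulary.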