Human Emotion Detection Through First Person View


Bibliographic Details
Author: Umbert Bosch, Miquel
Resource type: master's thesis
Publication date: 2025
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/431674
Online access: https://hdl.handle.net/2117/431674
Access level: open access
Keywords: Computer vision
Deep learning (Machine learning)
Emotions
Visió per computador
aprenentatge profund
models per visió
transformers
BoLD
Ego4D
deep learning
vision models
time series analysis
Visió per ordinador
Aprenentatge profund (Aprenentatge automàtic)
Emocions
Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic
Description
Summary: Human emotion detection is a rapidly evolving field with critical applications in affective computing, human-computer interaction, and social robotics. While traditional approaches rely heavily on facial expressions and speech analysis, recent work has shown that body posture and motion dynamics can provide significant insight into emotional states. In this study, we explore emotion recognition from a first-person view (FPV) using body pose estimation. FPV introduces unique challenges because its field of view is dynamic and limited compared with third-person (exo-view) data. We train a vision transformer model on BoLD (Body Language Dataset) and evaluate its performance on the Ego4D dataset, particularly on the social interactions subset. The research investigates how pose estimation models can be adapted to FPV data, the key differences between FPV and exo-view body posture dynamics, and the role of context (environment, social roles) in emotion recognition. By leveraging skeletal trajectories extracted from FPV videos, we aim to identify crucial markers for emotional differentiation and to assess the feasibility of posture-based emotion recognition as an alternative to facial and speech-based models. Our findings contribute to the growing field of affective computing, offering novel insights into the intersection of computer vision, human behavior analysis, and deep learning.