Deep regression of social signals in Dyadic Scenarios
| Author: | |
|---|---|
| Resource type: | master's thesis |
| Publication date: | 2020 |
| Country: | Spain |
| Institution: | Universitat Politècnica de Catalunya (UPC) |
| Repository: | UPCommons. Portal del coneixement obert de la UPC |
| Language: | English |
| OAI Identifier: | oai:upcommons.upc.edu:2117/336189 |
| Online access: | https://hdl.handle.net/2117/336189 |
| Access level: | open access |
| Keywords: | Neural networks (Computer science); Machine learning; emotion recognition; recurrent neural networks; feature extraction; multi-modal database; dyadic scenario; Xarxes neuronals (Informàtica); Aprenentatge automàtic; Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial |
| Summary: | The purpose of this project is to design a general system for emotion recognition through social signals in dyadic scenarios, applying deep learning methods to raw audio, video, and text transcriptions from publicly available database records. The automatic emotion recognition problem has attracted increasing attention in the scientific community, given the many applications of emotion detection and its role in designing more accurate and complex empathic machines. This project proposes alternatives for the utterance representation of multi-modal data generated from text, audio, and video, in order to improve the state-of-the-art system for emotion recognition based on deep learning networks. The proposed framework is based on the IEMOCAP database, but it is general in scope and can be applied to any multi-modal database. The system outperforms the state-of-the-art method and delivers an informative analysis of utterance representation quality. Finally, the conclusions of this work are presented along with potential future lines of work related to emotion recognition systems and emotion representations. |