Speaker diarization and speech recognition in the semi-automatization of audio description

Delgado Flores, Héctor; Matamala, Anna (ORCID: 0000-0002-1607-9011); Serrano, Javier (ORCID: 0000-0003-1235-2145)

Bibliographic Details
Authors:	Delgado Flores, Héctor; Matamala, Anna (ORCID: 0000-0002-1607-9011); Serrano, Javier (ORCID: 0000-0003-1235-2145)
Resource type:	article
Publication date:	2015
Country:	Spain
Institution:	Universitat Autònoma de Barcelona
Repository:	Dipòsit Digital de Documents de la UAB
Language:	English
OAI Identifier:	oai:ddd.uab.cat:144880
Online access:	https://ddd.uab.cat/record/144880 https://dx.doi.org/urn:doi:10.5007/2175-7968.2015v35n2p308
Access Level:	open access
Keywords:	Audio description; Accessibility; Speaker diarization; Speech recognition; Technology; Audiodescripción; Accesibilidad; Diarización; Reconocimiento de habla; Tecnología

Description
Summary:	This article presents an overview of the technological components used in the process of audio description, and suggests a new scenario in which speech recognition, machine translation, and text-to-speech, with the corresponding human revision, could be used to increase audio description provision. The article focuses on a process in which both speaker diarization and speech recognition are used to obtain a semi-automatic transcription of the audio description track. The technical process is presented and experimental results are summarized.
