Speaker diarization and speech recognition in the semi-automatization of audio description

Delgado Flores, Héctor; Matamala, Anna (ORCID: 0000-0002-1607-9011); Serrano, Javier (ORCID: 0000-0003-1235-2145)

Bibliographic details
Authors:	Delgado Flores, Héctor; Matamala, Anna (ORCID: 0000-0002-1607-9011); Serrano, Javier (ORCID: 0000-0003-1235-2145)
Document type:	article
Publication date:	2015
Country:	Spain
Resources:	Universitat Autònoma de Barcelona
Repository:	Dipòsit Digital de Documents de la UAB
Language:	English
OAI Identifier:	oai:ddd.uab.cat:144880
Online access:	https://ddd.uab.cat/record/144880 https://dx.doi.org/urn:doi:10.5007/2175-7968.2015v35n2p308
Access level:	Open access
Keywords:	Audio description; Accessibility; Speaker diarization; Speech recognition; Technology; Audiodescripción; Accesibilidad; Diarización; Reconocimiento de habla; Tecnología

Description
Abstract:	This article presents an overview of the technological components used in the process of audio description, and suggests a new scenario in which speech recognition, machine translation, and text-to-speech, with the corresponding human revision, could be used to increase audio description provision. The article focuses on a process in which both speaker diarization and speech recognition are used in order to obtain a semi-automatic transcription of the audio description track. The technical process is presented and experimental results are summarized.