Speaker diarization and speech recognition in the semi-automatization of audio description
| Authors: | , , |
|---|---|
| Resource type: | article |
| Publication date: | 2015 |
| Country: | Spain |
| Institution: | Universitat Autònoma de Barcelona |
| Repository: | Dipòsit Digital de Documents de la UAB |
| Language: | English |
| OAI Identifier: | oai:ddd.uab.cat:144880 |
| Online access: | https://ddd.uab.cat/record/144880 https://dx.doi.org/urn:doi:10.5007/2175-7968.2015v35n2p308 |
| Access Level: | open access |
| Keywords: | Audio description; Accessibility; Speaker diarization; Speech recognition; Technology; Audiodescripción; Accesibilidad; Diarización; Reconocimiento de habla; Tecnología |
| Summary: | This article presents an overview of the technological components used in the audio description process and proposes a new scenario in which speech recognition, machine translation, and text-to-speech, with corresponding human revision, could be used to increase audio description provision. The article focuses on a process in which speaker diarization and speech recognition are combined to obtain a semi-automatic transcription of the audio description track. The technical process is presented and experimental results are summarized. |