Speak2Subs: Evaluating State-of-the-Art Speech Recognition Models and Compliant Subtitle Generation
With recent advances in large language models, the evolution of speech-to-text tasks has been exponential. While state-of-the-art automatic speech recognition (ASR) models have taken a big step in speech transcription, creating quality subtitles still requires human intervention. This project has t...
| Author: | Fresneda García, Julio |
|---|---|
| Resource type: | master's thesis |
| Publication date: | 2024 |
| Country: | Spain |
| Institution: | Universidad Nacional de Educación a Distancia |
| Repository: | e-spacio. Repositorio Institucional de la UNED |
| Language: | English |
| OAI Identifier: | oai:e-spacio.uned.es:20.500.14468/22596 |
| Online access: | https://hdl.handle.net/20.500.14468/22596 |
| Access Level: | open access |
| Keywords: | 1203.04 Artificial intelligence; ASR; LLM; Speech-To-Text; Subtitle |
| id | ES_d8a7b5dd2b1cb753b8ebb5bbe94ece93 |
|---|---|
| oai_identifier_str | oai:e-spacio.uned.es:20.500.14468/22596 |
| network_acronym_str | ES |
| network_name_str | Spain |
| dc.title.none.fl_str_mv | Speak2Subs: Evaluating State-of-the-Art Speech Recognition Models and Compliant Subtitle Generation |
| dc.creator.none.fl_str_mv | Fresneda García, Julio |
| author_role | author |
| dc.contributor.none.fl_str_mv | e-Spacio UNED |
| dc.subject.none.fl_str_mv | 1203.04 Artificial intelligence; ASR; LLM; Speech-To-Text; Subtitle |
| description | With recent advances in large language models, the evolution of speech-to-text tasks has been exponential. While state-of-the-art automatic speech recognition (ASR) models have taken a big step in speech transcription, creating quality subtitles still requires human intervention. This project has two main aspects: evaluating cutting-edge ASR models for speech-to-text, and developing a package that uses these ASR models to generate high-quality and compliant subtitles. ASR models do not inherently provide results suitable for subtitles. Therefore, one of the primary objectives of this package is to utilize and enhance the output generated by ASR models to create subtitles of a quality that requires minimal human modification. This enhancement is necessary because ASR models alone are incapable of producing subtitles that meet the required standards of quality. Speak2Subs has achieved this goal, being a tool that produces high-quality subtitles with minimal human interaction. |
| publishDate | 2024 |
| dc.date.none.fl_str_mv | 2024 2024-06-11 2024 2024-02-01 2024 2024-02-01 |
| dc.type.none.fl_str_mv | master thesis http://purl.org/coar/resource_type/c_bdcc |
| dc.type.openaire.fl_str_mv | info:eu-repo/semantics/masterThesis |
| format | masterThesis |
| dc.format.none.fl_str_mv | application/pdf |
| dc.identifier.none.fl_str_mv | https://hdl.handle.net/20.500.14468/22596 |
| dc.language.none.fl_str_mv | English (eng) |
| dc.rights.none.fl_str_mv | open access http://purl.org/coar/access_right/c_abf2 info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es |
| eu_rights_str_mv | openAccess |
| dc.publisher.none.fl_str_mv | Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. |
| dc.source.none.fl_str_mv | reponame:e-spacio. Repositorio Institucional de la UNED instname:Universidad Nacional de Educación a Distancia |