Basque and Spanish Multilingual TTS Model for Speech-to-Speech Translation

De Zuazo Oteiza, Xabier

Bibliographic Details
Author:	De Zuazo Oteiza, Xabier
Resource type:	master's thesis
Publication date:	2023
Country:	Spain
Institution:	Universidad del País Vasco
Repository:	Addi. Archivo Digital para la Docencia y la Investigación
OAI Identifier:	oai:addi.ehu.eus:10810/61815
Online access:	http://hdl.handle.net/10810/61815
Access Level:	open access
Keywords:	multilingual; multi-speaker; text-to-speech; speech-to-text; machine translation; speech-to-speech translation; cross-lingual; zero-shot voice conversion; Basque; Spanish

Description
Summary:	[EN] Multiple Text-to-Speech models that use deep neural networks to synthesize audio from text have emerged recently. In this work, a state-of-the-art multilingual, multi-speaker Text-to-Speech model was trained on Basque, Spanish, Catalan, and Galician. The research consisted of gathering the datasets, pre-processing their audio and text data, training the model on the languages in successive steps, and evaluating the results at each step. Training used a transfer learning approach, starting from a model already trained on three languages: English, Portuguese, and French. The final model therefore supports a total of seven languages. These models also support zero-shot voice conversion, using an input audio file as a reference. Finally, a prototype Speech-to-Speech Translation application was built by combining the models trained here with other models from the community. Along the way, some Deep Speech Speech-to-Text models were also trained for Basque and Galician.
