Linguistic-family-specific Encoders and Decoders for Multilingual Spoken Machine Translation
| Author: | |
|---|---|
| Resource type: | master's thesis |
| Publication date: | 2022 |
| Country: | Spain |
| Institution: | Universitat Politècnica de Catalunya (UPC) |
| Repository: | UPCommons. Portal del coneixement obert de la UPC |
| Language: | English |
| OAI Identifier: | oai:upcommons.upc.edu:2117/383299 |
| Online access: | https://hdl.handle.net/2117/383299 |
| Access Level: | open access |
| Keywords: | Natural language processing (Computer science); Decoders (Electronics); Machine translating; multilingual machine translation; spoken language translation; natural language processing; neural machine translation; Tractament del llenguatge natural (Informàtica); Descodificadors (Electrònica); Traducció automàtica; Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic |
| Summary: | This project provides a spoken language translation (SLT) system trained on the UN Parallel Corpus (UNPC) and MuST-C, with the aim of studying the correlation between languages of different linguistic families and translation performance. The SLT system consists of a text-to-text Neural Machine Translation model, whose dataset includes six languages from five linguistic families, and an Automatic Speech Recognition model, whose dataset contains four languages from four linguistic families. The combined SLT system is an end-to-end system, which is a relatively new task, and the idea of this project is to analyze how different linguistic families perform when trained under the same conditions. Apart from measuring performance with BLEU scores, this project also carries out fine-tuning and zero-shot translation tasks. In general, the BLEU scores obtained are good and similar to those of the original baseline models reported in the UNPC and MuST-C papers. The fine-tuning and zero-shot translation experiments also obtained reasonable results, supporting the hypothesized positive correlation between the closeness of languages and translation performance. |