Data augmentation with translation memories for desktop machine translation fine-tuning in 3 language pairs

Bibliographic Details
Authors: Dogru, Gokhan; Moorkens, Joss
Resource type: article
Status: Published version
Publication date: 2024
Country: Spain
Institution: Universitat Pompeu Fabra
Repository: Repositorio Digital de la UPF
OAI Identifier: oai:repositori.upf.edu:10230/69349
Online access: http://hdl.handle.net/10230/69349
http://dx.doi.org/10.26034/cm.jostrans.2024.4716
Access level: open access
Keywords: Machine translation fine-tuning
Domain adaptation
Desktop machine translation
Localization
Parallel corpora
Professional translators
Machine translation evaluation
Description
Summary: This study investigates the effect of data augmentation through translation memories on desktop machine translation (MT) fine-tuning in OPUS-CAT. It also assesses the usefulness of desktop MT for professional translators. Engines in three language pairs (English → Turkish, English → Spanish, and English → Catalan) are fine-tuned with corpora of two different sizes. The translation quality of each engine is measured through automatic evaluation metrics (BLEU, chrF2, TER and COMET) and human evaluation metrics (ranking, adequacy and fluency). Overall evaluation results indicate promising quality improvements in all three language pairs. They imply that desktop MT applications such as OPUS-CAT, with MT engines fine-tuned on custom data on a translator’s desktop, can potentially provide high-quality translations in addition to advantages such as privacy, confidentiality and low use of computing power.