Data augmentation with translation memories for desktop machine translation fine-tuning in 3 language pairs
This study aims to investigate the effect of data augmentation through translation memories for desktop machine translation (MT) fine-tuning in OPUS-CAT. It also focuses on assessing the usefulness of desktop MT for professional translators. Engines in three language pairs (English → Turkish, Englis...
| Authors: | Dogru, Gokhan; Moorkens, Joss |
|---|---|
| Resource type: | article |
| Status: | Published version |
| Publication date: | 2024 |
| Country: | Spain |
| Institution: | Universitat Pompeu Fabra |
| Repository: | Repositorio Digital de la UPF |
| OAI Identifier: | oai:repositori.upf.edu:10230/69349 |
| Online access: | http://hdl.handle.net/10230/69349 http://dx.doi.org/10.26034/cm.jostrans.2024.4716 |
| Access level: | open access |
| Keywords: | Machine translation fine-tuning; Domain adaptation; Desktop machine translation; Localization; Parallel corpora; Professional translators; Machine translation evaluation |
| id |
ES_9b4f3d862ed2f0dfb717c9366393f41b |
|---|---|
| oai_identifier_str |
oai:repositori.upf.edu:10230/69349 |
| network_acronym_str |
ES |
| network_name_str |
Spain |
| dc.title.none.fl_str_mv |
Data augmentation with translation memories for desktop machine translation fine-tuning in 3 language pairs |
| dc.creator.none.fl_str_mv |
Dogru, Gokhan; Moorkens, Joss |
| author |
Dogru, Gokhan |
| author_facet |
Dogru, Gokhan; Moorkens, Joss |
| author_role |
author |
| author2 |
Moorkens, Joss |
| author2_role |
author |
| dc.subject.none.fl_str_mv |
Machine translation fine-tuning; Domain adaptation; Desktop machine translation; Localization; Parallel corpora; Professional translators; Machine translation evaluation |
| description |
This study aims to investigate the effect of data augmentation through translation memories for desktop machine translation (MT) fine-tuning in OPUS-CAT. It also focuses on assessing the usefulness of desktop MT for professional translators. Engines in three language pairs (English → Turkish, English → Spanish, and English → Catalan) are fine-tuned with corpora of two different sizes. The translation quality of each engine is measured through automatic evaluation metrics (BLEU, chrF2, TER and COMET) and human evaluation metrics (ranking, adequacy and fluency). Overall evaluation results indicate promising quality improvements in all three language pairs and imply that the use of desktop MT applications such as OPUS-CAT and fine-tuning MT engines with custom data in a translator’s desktop can potentially provide high-quality translations aside from their advantages such as privacy, confidentiality and low use of computation power. |
| publishDate |
2024 |
| dc.date.none.fl_str_mv |
2024 2025 2025 |
| dc.type.none.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion |
| format |
article |
| status_str |
publishedVersion |
| dc.identifier.none.fl_str_mv |
http://hdl.handle.net/10230/69349 http://dx.doi.org/10.26034/cm.jostrans.2024.4716 |
| dc.language.none.fl_str_mv |
English |
| language_invalid_str_mv |
English |
| dc.relation.none.fl_str_mv |
The Journal of Specialised Translation. 2024;41:149–78 |
| dc.rights.none.fl_str_mv |
This work is licensed under a Creative Commons Attribution 4.0 International License. http://creativecommons.org/licenses/by/4.0/ info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv |
This work is licensed under a Creative Commons Attribution 4.0 International License. http://creativecommons.org/licenses/by/4.0/ |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Jostrans (Journal of Specialised Translation) |
| publisher.none.fl_str_mv |
Jostrans (Journal of Specialised Translation) |
| dc.source.none.fl_str_mv |
reponame:Repositorio Digital de la UPF instname:Universitat Pompeu Fabra |
| instname_str |
Universitat Pompeu Fabra |
| reponame_str |
Repositorio Digital de la UPF |
| collection |
Repositorio Digital de la UPF |