Corpus PaGeS: A multifunctional resource for language learning, translation and cross-linguistic research
| Authors: | , , , , |
|---|---|
| Resource type: | book chapter |
| Publication date: | 2019 |
| Country: | Spain |
| Institution: | Universidad de Santiago de Compostela (USC) |
| Repository: | Minerva. Repositorio Institucional de la Universidad de Santiago de Compostela |
| Language: | English |
| OAI Identifier: | oai:minerva.usc.gal:10347/39334 |
| Online access: | https://hdl.handle.net/10347/39334 |
| Access Level: | open access |
| Keywords: | Parallel corpora; Corpus alignment; Corpus visualization; Spanish/German; 5701 Applied linguistics |
| Summary: | This chapter presents the bilingual parallel corpus PaGeS, compiled by the research group SpatiAlEs from the University of Santiago de Compostela. PaGeS currently amounts to nearly 20 million tokens and consists of texts originally written in German and in Spanish and their corresponding translations into the other language, as well as a small portion of German and Spanish translations from third languages. The present contribution introduces the main characteristics of the PaGeS corpus, focusing on its design and compilation. It first explains the criteria for the selection of the texts and the details of text pre-processing, automatic alignment and manual review. It then addresses the search and display features, describing the server architecture and indexing process. Finally, the intended development of the PaGeS corpus is briefly discussed. |