Corpus tools for parallel corpora of theatre plays: an introduction to TAligner and ACM-theatre

Bibliographic Details
Authors: Andaluz-Pinedo, Olaia; Sanjurjo González, Hugo
Resource type: article
Publication date: 2022
Country: Spain
Institution: Universidad de Cantabria (UC)
Repository: UCrea Repositorio Abierto de la Universidad de Cantabria
Language: English
OAI Identifier: oai:repositorio.unican.es:10902/32114
Online access: https://hdl.handle.net/10902/32114
Access level: open access
Keywords: Corpus building
Corpus analysis
Software
Parallel corpora
Theatre translations
Description
Summary: Software tools are of vital importance in corpus-based research, but they can also restrict the types of corpora supported and the range of analyses that can be performed. For example, corpus analysis tools, as general-purpose software, do not include specific features for processing corpora of theatre plays. The situation is even worse for parallel corpora of theatrical texts, since there is currently a lack of software that allows for both the alignment and the analysis of such corpora. In this contribution, we first outline the peculiarities of theatre texts and suggest three software features to address them: annotation of the structural units of plays, alignment at the utterance level, and concordances and statistics using the annotated units. Second, we present the specific functionalities of TAligner and ACM for building and analysing parallel corpora of play texts, showing how these tools open up new avenues of research.