Annotating Expressions of Engagement in online book reviews: A contrastive (English-Spanish) corpus study for computational processing

Bibliographic Details
Author: Mora, Natalia
Resource type: master's thesis
Publication date: 2011
Country: Spain
Institution: Universidad Complutense de Madrid (UCM)
Repository: Docta Complutense
Language: English
OAI Identifier: oai:docta.ucm.es:20.500.14352/46377
Online access: https://hdl.handle.net/20.500.14352/46377
Access level: open access
Keywords: Online book reviews
Expressions of Engagement
Computational linguistics
Natural Language Processing
Engagement in English and Spanish
English philology
Linguistics
5505.10 Philology
57 Linguistics
Description
Summary: This dissertation studies the expression of Engagement and alternative points of view in English and Spanish online book reviews, following the Appraisal model designed by Martin and White (2005). The study has three main aims: 1) to test two main aspects of the linguistic category of Engagement empirically, namely the identification of the spans realising Engagement and the classification of Engagement into different subtypes; 2) to extract relevant contrastive features of the use of Engagement in English and Spanish online book reviews; 3) to create a bilingual (comparable) machine-readable corpus annotated with Engagement features in English and Spanish, which can serve as a training corpus for machine learning algorithms and be offered to the scientific community for further research. Following standard methodologies in the field of Natural Language Processing, two studies are carried out to measure inter-annotator agreement on an initial set of 10 reviews. A larger set of 28 reviews (14 English, 14 Spanish) is then annotated by a single human coder in order to extract relevant results on contrastive aspects and to provide publicly available machine-readable texts annotated with Engagement categories. The findings reveal disagreement mainly on span length and on the annotation of some specific categories, namely Pronounce and Counter. In addition, differences in the frequency of use of Engagement types were found between the two languages, although the expressions employed were formally similar.
Finally, the results of the annotation of the larger data set showed that more expressions than initially expected can be annotated context-independently. For some other expressions, however, register and collocations were seen to have a decisive influence on interpretation, in the same way that genre influences frequency of use: resources aimed at emphasising the reviewer's personal opinion were more frequent than those that acknowledged and evaluated external sources.