TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus

Bibliographic Details
Authors: Álvarez Mellado, Elena; Díez Platas, María Luisa; Ruiz Fabo, Pablo; Bermúdez Sabel, Helena; Ros Muñoz, Salvador; González-Blanco García, Elena
Resource type: article
Publication date: 2021
Country: Spain
Institution: Universidad Nacional de Educación a Distancia
Repository: e-spacio. Repositorio Institucional de la UNED
Language: English
OAI Identifier: oai:e-spacio.uned.es:20.500.14468/29391
Online access: https://hdl.handle.net/20.500.14468/29391
Access level: open access
Keywords: 1203.17 Computer science
5701 Applied linguistics
Named-entity annotation
Annotation scheme
Historical NER
Medieval named entities
Medieval Spanish corpus
Description
Summary: Medieval documents are a rich source of historical data. Performing named-entity recognition (NER) on this genre of texts can provide us with valuable historical evidence. However, traditional NER categories and schemes are usually designed with modern documents in mind (i.e. journalistic text) and the general-domain NER annotation schemes fail to capture the nature of medieval entities. In this paper we explore the challenges of performing named-entity annotation on a corpus of Spanish medieval documents: we discuss the mismatches that arise when applying traditional NER categories to a corpus of Spanish medieval documents and we propose a novel humanist-friendly TEI-compliant annotation scheme and guidelines intended to capture the particular nature of medieval entities.