Etiquetado gramatical y lematización en el Corpus Histórico Judeoespañol (CORHIJE): problemas, soluciones y resoluciones
| Authors: | , |
|---|---|
| Resource type: | article |
| Status: | Published version |
| Publication date: | 2017 |
| Country: | Spain |
| Institution: | Consejo Superior de Investigaciones Científicas (CSIC) |
| Repository: | DIGITAL.CSIC. Repositorio Institucional del CSIC |
| OAI Identifier: | oai:digital.csic.es:10261/193832 |
| Online access: | http://hdl.handle.net/10261/193832 |
| Access Level: | open access |
| Keywords: | Linguistic Corpora; Digital Corpus Design; Judeo-Spanish; Diachrony; Corpus lingüísticos; Diseño de corpus electrónicos; Judeoespañol; Diacronía |
| Summary: | [EN] After a brief review of the most salient features of the Corpus Histórico Judeoespañol (CORHIJE), which was already presented at the 3rd edition of the Congreso de Corpus Diacrónicos en lenguas Iberorrománicas (CODILI, Zurich 2014), this paper describes the ongoing process of lemmatization and grammatical annotation of the corpus. We focus on the challenges encountered during annotation and the solutions applied to them. In some cases these led us to adopt relatively arbitrary resolutions, guided by the descriptive and analytical goals we were pursuing: hence the problems, solutions, and resolutions of our title. |