Corpus for Complex Word Identification in Medical Spanish Texts (CWI-Med-Sp) [DATASET]
| Authors: | |
|---|---|
| Resource type: | dataset |
| Publication date: | 2024 |
| Country: | Spain |
| Institution: | Consejo Superior de Investigaciones Científicas (CSIC) |
| Repository: | DIGITAL.CSIC. Repositorio Institucional del CSIC |
| OAI Identifier: | oai:digital.csic.es:10261/373675 |
| Online access: | http://hdl.handle.net/10261/373675; https://doi.org/10.20350/digitalCSIC/16706 |
| Access level: | open access |
| Keywords: | Patient information documents; Annotated corpus; Medical text simplification; Biomedical natural language processing; Consent forms; Clinical trials; Linguistics; Medical sciences; Linguistic research |
| Summary: | [Description of methods used for collection/generation of data] The corpus statistics and methods are explained in the following article: Federico Ortega-Riba, Leonardo Campillos-Llanos, Doaa Samy (2025) "Lexical Simplification in Spanish Texts For Patients: The Complex Word Identification Task". (Under review). [Methods for processing the data] Manual annotation of complex words (CW) according to the criteria defined in the guidelines described in the companion article. |