On the modification of binarization algorithms to retain grayscale information for handwritten text recognition
| Authors: | , , |
|---|---|
| Resource type: | book chapter |
| Publication date: | 2015 |
| Country: | Spain |
| Institution: | Universitat Politècnica de València (UPV) |
| Repository: | RiuNet. Repositorio Institucional de la Universitat Politècnica de València |
| Language: | English |
| OAI Identifier: | oai:riunet.upv.es:10251/64376 |
| Online access: | https://riunet.upv.es/handle/10251/64376 |
| Access level: | open access |
| Keywords: | Handwritten text recognition; Pre-processing of handwritten historical documents; Background removal; LENGUAJES Y SISTEMAS INFORMATICOS |
| Summary: | [EN] The number of digitized legacy documents has risen in recent years, mainly because a growing number of online digital libraries publish this kind of document. The vast majority still await transcription, which would give historians and other researchers new ways of indexing, consulting and querying them. However, the accuracy of state-of-the-art Handwritten Text Recognition techniques decreases dramatically when they are applied to these historical documents, mainly because of typical paper degradation problems. Robust pre-processing is therefore an important step in supporting the subsequent recognition stages. This paper proposes taking existing binarization techniques and, in order to retain their advantages, modifying them so that some of the original grayscale information is preserved and can be considered by the subsequent recognizer. Results are reported on the publicly available ESPOSALLES database. |
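
The summary does not say how the binarization techniques are modified, so the following is only a rough illustration of the general idea, not the chapter's method. It assumes global Otsu thresholding as the "existing binarization technique" and keeps the original gray values of pixels classified as ink, while pixels classified as background are set to white. The function names `otsu_threshold` and `binarize_keep_gray` are hypothetical.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the Otsu threshold of an 8-bit grayscale image (uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                # probability of the dark class up to each level
    mu = np.cumsum(prob * np.arange(256))  # cumulative first moment up to each level
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    # Between-class variance; levels where one class is empty get 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))


def binarize_keep_gray(gray: np.ndarray) -> np.ndarray:
    """Whiten background pixels but leave ink pixels at their original gray level.

    Assumption for illustration: a pixel is background when it is brighter
    than the Otsu threshold. A plain binarization would also force the ink
    pixels to 0; here they are left untouched.
    """
    t = otsu_threshold(gray)
    out = gray.copy()
    out[gray > t] = 255
    return out
```

With this sketch, a recognizer would receive an image whose background is uniformly white, as after ordinary binarization, while stroke pixels still carry their original intensity variations.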