On the modification of binarization algorithms to retain grayscale information for handwritten text recognition

Bibliographic Details
Authors: Villegas, Mauricio; Romero Gómez, Verónica; Sánchez Peiró, Joan Andreu (ORCID: 0000-0003-0423-2020)
Resource type: book chapter
Publication date: 2015
Country: Spain
Institution: Universitat Politècnica de València (UPV)
Repository: RiuNet. Repositorio Institucional de la Universitat Politècnica de València
Language: English
OAI Identifier: oai:riunet.upv.es:10251/64376
Online access: https://riunet.upv.es/handle/10251/64376
Access level: open access
Keywords: Handwritten text recognition
Pre-processing of handwritten historical documents
Background removal
LENGUAJES Y SISTEMAS INFORMATICOS
Description
Summary: The number of digitized legacy documents has been rising in recent years, mainly due to the increasing number of online digital libraries publishing this kind of document. The vast majority of them are still waiting to be transcribed, which would give historians and other researchers new ways of indexing, consulting and querying them. However, the accuracy of state-of-the-art Handwritten Text Recognition techniques decreases dramatically when they are applied to these historical documents, mainly because of typical paper degradation problems. Robust pre-processing is therefore an important step in supporting the subsequent recognition steps. This paper proposes to take existing binarization techniques, in order to retain their advantages, and to modify them so that some of the original grayscale information is preserved and can be considered by the subsequent recognizer. Results are reported on the publicly available ESPOSALLES database.
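
The summary does not say how the binarization techniques are modified, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes Sauvola's local threshold as the base binarizer and replaces the hard black/white decision with a linear ramp around the threshold, so that pixels near the decision boundary keep an intermediate gray value. The function names and the `ramp` parameter are hypothetical.

```python
from math import sqrt


def local_stats(img, y, x, radius):
    """Mean and standard deviation of the window centred on (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, sqrt(var)


def sauvola_threshold(mean, std, k=0.5, dyn_range=128.0):
    """Sauvola's local threshold: T = m * (1 + k * (s / R - 1))."""
    return mean * (1.0 + k * (std / dyn_range - 1.0))


def soft_binarize(img, radius=1, k=0.5, ramp=20.0):
    """Hypothetical 'soft' binarization of a grayscale image (list of rows, values 0..255).

    A hard binarizer would output 0 below the threshold and 255 above it.
    Here (an assumption made for illustration) the output rises linearly
    from 0 at T - ramp to 255 at T + ramp, so some grayscale is retained.
    """
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            mean, std = local_stats(img, y, x, radius)
            t = sauvola_threshold(mean, std, k)
            v = 255.0 * (img[y][x] - (t - ramp)) / (2.0 * ramp)
            row.append(int(round(min(255.0, max(0.0, v)))))
        out.append(row)
    return out


# A dark stroke pixel on a light background is mapped to black,
# while a faint pixel close to the threshold keeps a gray level.
strong = soft_binarize([[200, 200, 200], [200, 50, 200], [200, 200, 200]])
faint = soft_binarize([[200, 200, 200], [200, 130, 200], [200, 200, 200]])
```

With `ramp` close to zero the mapping approaches an ordinary hard Sauvola binarization; larger values pass more of the original gray levels on to the recognizer.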