Assessing reproducibility in screenshot-based task mining: A decision discovery perspective

Bibliographic Details
Authors: Martínez Rojas, Antonio; Rodríguez Ruíz, Antonio; Jiménez Ramírez, Andrés; González Enríquez, José; Reijers, H.A; Nour Eldin, Ali; Sedmidubsky, Jan; Kraus, Alexander
Resource type: article
Status: Published version
Publication date: 2026
Country: Spain
Institution: Universidad de Sevilla (US)
Repository: idUS. Depósito de Investigación de la Universidad de Sevilla
OAI Identifier: oai:dnet:idus________::f589b5ceb5eb7e2d93cb101e1b57fc9b
Online access: https://hdl.handle.net/11441/186459
https://doi.org/10.1016/j.is.2026.102745
Access Level: open access
Keywords: Robotic process automation
UI log
User behavior mining
Task mining
Decision model Discovery
Reproducible Paper
Description
Summary: This reproducible paper serves as a companion to prior work introducing a novel Task Mining framework that incorporates UI screenshots as supplementary data to enhance the interpretability of human decision-making in Robotic Process Automation (RPA). The framework enriches the traditional User Interface (UI) log, typically composed of timestamped events such as mouse clicks and keystrokes, with image data, allowing a more detailed process model to be discovered, particularly with respect to decision-making rules. The aim of this reproducibility paper is to provide a detailed, step-by-step protocol for replicating the framework’s core methodology: data processing, extraction of features from screenshots, and construction of decision trees based on enriched UI logs. The protocol is used to reproduce the results of the original paper on a set of experiments designed to validate the accuracy and efficacy of the framework across varying UI log sizes and interface complexities. Finally, we argue that the results reported in our primary work can be considered weakly reproducible.