Empowering conformance checking using Big Data through horizontal decomposition

Bibliographic Details
Authors: Valencia Parra, Álvaro; Varela Vaca, Ángel Jesús; Gómez López, María Teresa; Carmona Vargas, Josep (ORCID: 0000-0001-9656-254X); Bergenthum, Robin
Resource type: article
Publication date: 2021
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/345143
Online access: https://hdl.handle.net/2117/345143
https://dx.doi.org/10.1016/j.is.2021.101731
Access level: open access
Keywords: Big data
Process mining
Data mining
Conformance checking
Decompositional techniques
MapReduce
Dades massives
Mineria de dades
Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació
Description
Summary: Conformance checking unleashes the full power of process mining: techniques from this discipline enable the analysis of the quality of a process model through the discovery of event data, the identification of potential deviations, and the projection of real traces onto process models. In this way, the insights gained from the available event data can be transferred to a richer conceptual level, amenable to human interpretation. Unfortunately, most of these functionalities are grounded in an extremely difficult fundamental problem: given an observed trace and a process model, find the model trace that most closely resembles the observed trace. This paper presents an architecture that supports the creation and distribution of alignment subproblems based on an innovative horizontal acyclic model decomposition, independent of the conformance checking algorithm applied to solve them. This is supported by a Big Data infrastructure that facilitates the customised distribution of large amounts of data. Experiments are provided that testify to the enormous potential of the proposed architecture, thereby opening the door to further research in several directions.