Asynchronous runtime with distributed manager for task-based programming models

Bibliographic Details
Authors: Bosch Pons, Jaume (ORCID 0000-0002-4040-3416); Álvarez Martínez, Carlos (ORCID 0000-0003-0536-5183); Jiménez González, Daniel (ORCID 0000-0001-6064-7883); Martorell Bofill, Xavier (ORCID 0000-0002-0417-3430); Ayguadé Parra, Eduard (ORCID 0000-0002-5146-103X)
Resource type: article
Publication date: 2020
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier:oai:upcommons.upc.edu:2117/330058
Online access: https://hdl.handle.net/2117/330058
https://dx.doi.org/10.1016/j.parco.2020.102664
Access level: open access
Keywords: Parallel programming (Computer science)
Application program interfaces (Computer software)
OmpSs
OpenMP
Task-based
Task-graph
Dependence manager
Runtime
Programació en paral·lel (Informàtica)
Interfícies de programació d'aplicacions (Programari)
Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
Description
Summary: Parallel task-based programming models, such as OpenMP, let application developers easily create parallel versions of their sequential codes. The OpenMP 4.0 standard introduced the ability to describe a set of data dependences per task, which the runtime uses to order task execution. This order is computed using shared graphs, which all threads update under exclusive access, using synchronization mechanisms (locks) to keep dependence management correct. Contention on these structures becomes critical in many-core systems, because several threads may waste computing resources while waiting for their turn. This paper proposes asynchronous management of runtime structures, such as task dependence graphs, suitable for the runtimes of task-based programming models. In this organization, threads request actions from the runtime instead of performing them directly. The requests are then handled by a distributed runtime manager (DDAST) that does not require dedicated resources; instead, the manager uses idle threads to modify the runtime structures. The paper also presents an implementation, analysis, and performance evaluation of this runtime organization. The performance results show that the proposed asynchronous organization achieves higher speedups than the original runtime across different benchmarks and many-core architectures.
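The organization described in the summary can be illustrated with a small sketch. It is not the paper's DDAST implementation, which this record does not detail; it is a toy Python model written under stated assumptions. The names `AsyncDependenceManager`, `submit`, `notify_done`, `drain`, and `run_workers` are hypothetical. Threads never update the dependence graph directly: they enqueue requests, and any thread that finds no ready task tries, without blocking, to act as the manager and apply the queued requests.

```python
import queue
import threading
from collections import defaultdict


class AsyncDependenceManager:
    """Toy model (hypothetical API): threads enqueue requests; idle threads apply them."""

    def __init__(self):
        self.requests = queue.SimpleQueue()  # pending "new"/"done" requests
        self.ready = queue.SimpleQueue()     # tasks whose predecessors finished
        self.graph_lock = threading.Lock()   # only ever taken with a try-lock
        self.last_writer = {}                # datum -> last task writing it
        self.readers = defaultdict(list)     # datum -> readers since that write
        self.successors = defaultdict(list)  # task -> tasks waiting on it
        self.pending = {}                    # task -> unfinished predecessors
        self.finished = set()

    def submit(self, task, reads, writes):
        self.requests.put(("new", task, tuple(reads), tuple(writes)))

    def notify_done(self, task):
        self.requests.put(("done", task))

    def drain(self):
        """Apply queued requests if no other thread is already doing so."""
        if not self.graph_lock.acquire(blocking=False):
            return 0  # another thread is acting as manager; do not wait
        try:
            handled = 0
            while True:
                try:
                    request = self.requests.get_nowait()
                except queue.Empty:
                    return handled
                self._apply(request)
                handled += 1
        finally:
            self.graph_lock.release()

    def _apply(self, request):
        if request[0] == "new":
            _, task, reads, writes = request
            preds = set()
            for datum in reads + writes:  # read-after-write, write-after-write
                preds.add(self.last_writer.get(datum))
            for datum in writes:          # write-after-read
                preds.update(self.readers[datum])
                self.last_writer[datum] = task
                self.readers[datum] = []
            for datum in reads:
                self.readers[datum].append(task)
            preds = {p for p in preds
                     if p is not None and p != task and p not in self.finished}
            self.pending[task] = len(preds)
            for pred in preds:
                self.successors[pred].append(task)
            if not preds:
                self.ready.put(task)
        else:
            _, task = request
            self.finished.add(task)
            for succ in self.successors.pop(task, []):
                self.pending[succ] -= 1
                if self.pending[succ] == 0:
                    self.ready.put(succ)


def run_workers(manager, bodies, num_threads=4):
    """Run every task in `bodies` (all must be submitted); return completion order."""
    order, order_lock = [], threading.Lock()

    def worker():
        while True:
            with order_lock:
                if len(order) == len(bodies):
                    return
            try:
                task = manager.ready.get_nowait()
            except queue.Empty:
                manager.drain()  # idle thread: act as the manager instead
                continue
            bodies[task]()
            with order_lock:
                order.append(task)
            manager.notify_done(task)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


if __name__ == "__main__":
    m = AsyncDependenceManager()
    m.submit("A", reads=[], writes=["x"])
    m.submit("B", reads=["x"], writes=["y"])
    m.submit("C", reads=["x"], writes=["z"])
    print(run_workers(m, {t: (lambda: None) for t in "ABC"}))
```

The try-lock in `drain` is the design choice worth noting: a thread that cannot become the manager returns immediately and looks for work again, so no thread blocks on the graph. The busy-wait loop in `run_workers` is a simplification for brevity; a real runtime would use a more careful idle policy.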