The MPI/OmpSs parallel programming model

Marjanović, Vladimir

Bibliographic Details
Author:	Marjanović, Vladimir
Resource type:	doctoral thesis
Status:	Published version
Publication date:	2016
Country:	Spain
Institution:	CBUC, CESCA
Repository:	TDR. Tesis Doctorales en Red
OAI Identifier:	oai:www.tdx.cat:10803/398135
Online access:	http://hdl.handle.net/10803/398135 https://dx.doi.org/10.5821/dissertation-2117-98109
Access level:	open access
Keyword:	Àrees temàtiques de la UPC::Informàtica 004

Description
Summary:	Supercomputing systems have already reached millions of cores in a single machine, connected by a complex interconnection network. Reducing communication time between processes becomes the most important issue in achieving the highest possible performance. The Message Passing Interface (MPI), which is the most widely used programming model for large distributed-memory systems, supports asynchronous communication primitives for overlapping communication and computation. However, these primitives are difficult to use and increase code complexity, which requires more development effort and makes programs less readable. This thesis presents a new programming model that allows the programmer to easily introduce the asynchrony necessary to overlap communication and computation. The proposed programming model is based on MPI and a task-based shared-memory framework, namely OmpSs. The thesis further describes implementation details that allow efficient interoperation of the OmpSs runtime and MPI. The thesis demonstrates the hybrid use of MPI/OmpSs with several applications, of which the HPL benchmark is the most important case study. The hybrid MPI/OmpSs versions significantly improve the performance of the applications compared with their pure MPI counterparts. For HPL, we get close to the asymptotic performance at relatively small problem sizes and still get significant benefits at large problem sizes. In addition, the hybrid MPI/OmpSs approach substantially reduces code complexity and is less sensitive to network bandwidth and operating system noise than the pure MPI versions. The thesis also analyzes and compares current techniques for overlapping computation and collective communication, including approaches that use point-to-point communications and approaches that use additional communication threads.
The thesis stresses the importance of understanding the characteristics of a computational kernel that runs concurrently with communication. Experimental evaluation is done using the Communication Computation Concurrent (CCUBE) synthetic benchmark, developed in this thesis, as well as HPL.
