Sparse Linear System Solvers on GPUs: Parallel Preconditioning, Workload Balancing, and Communication Reduction
| Author: | |
|---|---|
| Resource type: | doctoral thesis |
| Status: | Published version |
| Publication date: | 2019 |
| Country: | Spain |
| Institution: | CBUC, CESCA |
| Repository: | TDR. Tesis Doctorales en Red |
| OAI Identifier: | oai:www.tdx.cat:10803/667096 |
| Online access: | http://hdl.handle.net/10803/667096 http://dx.doi.org/10.6035/14101.2019.709084 |
| Access level: | open access |
| Keywords: | High Performance Computing; Graphics Processing Units; Adaptive Precision; Krylov Methods; Sparse Matrix-Vector Product; Preconditioning; Information and Communication Technologies (ICT); 004 |
| Summary: | With the breakdown of Dennard scaling in the mid-2000s and the end of Moore's law on the horizon, the high-performance computing community is turning its attention towards unconventional accelerator hardware to ensure the continued growth of computational capacity. This dissertation presents several contributions related to the iterative solution of sparse linear systems on the most widely used general-purpose accelerator, the Graphics Processing Unit (GPU). Specifically, it accelerates the major building blocks of Krylov solvers and describes their realization as part of a software library of reusable components. The first part of the dissertation focuses on the sparse matrix-vector product and on effective load balancing in the presence of irregular sparsity patterns. The second part describes the design of high-performance preconditioners. Finally, the third part demonstrates the potential of adaptive precision techniques for constructing preconditioners with a lower memory footprint and accuracy comparable to that of their full-precision equivalents. |