Towards resilient EU HPC systems: A blueprint
| Authors: | |
|---|---|
| Resource type: | technical report |
| Publication date: | 2020 |
| Country: | Spain |
| Institution: | Universitat Politècnica de Catalunya (UPC) |
| Repository: | UPCommons. Portal del coneixement obert de la UPC |
| Language: | English |
| OAI Identifier: | oai:upcommons.upc.edu:2117/330695 |
| Online access: | https://hdl.handle.net/2117/330695 |
| Access level: | open access |
| Keywords: | High performance computing -- Europe; Càlcul intensiu (Informàtica) -- Europa; Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors |
| Summary: | This document aims to spearhead a Europe-wide discussion on HPC system resilience and to help the European HPC community define best practices for resilience. We analyse a wide range of state-of-the-art resilience mechanisms and recommend the most effective approaches to employ in large-scale HPC systems. Our guidelines will be useful in the allocation of available resources, as well as guiding researchers and research funding towards the enhancement of resilience approaches with the highest priority and utility. Although our work is focused on the needs of next generation HPC systems in Europe, the principles and evaluations are applicable globally. |