Some Issues to Consider in the Management of Energy Consumption in HPC Systems with Fault Tolerance

Inquiring about different ways to reduce energy consumption during the execution of large-scale applications is essential to maintain and increase the enormous computing power achieved in HPC systems. Fault tolerance methods can have an impact on power consumption. In particular, rollback-recovery...


Bibliographic Details
Main authors: Morán, Marina; Balladini, Javier; Rexachs del Rosario, Dolores; Rucci, Enzo
Format: Conference object
Language: English
Published: 2022
Subjects:
HPC
Online access: http://sedici.unlp.edu.ar/handle/10915/140642
id I19-R120-10915-140642
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Energy consumption
Fault tolerance
Uncoordinated checkpoints
HPC
description Inquiring about different ways to reduce energy consumption during the execution of large-scale applications is essential to maintain and increase the enormous computing power achieved in HPC systems. Fault tolerance methods can have an impact on power consumption. In particular, rollback-recovery methods using uncoordinated checkpoints prevent all processes from re-executing in the event of a failure. In this context, it is possible to take actions on the nodes of the processes that do not re-execute to reduce energy consumption. In this work, we describe some issues to consider when we extend the application of energy-saving strategies beyond the nodes that communicate directly with the failed one.
format Conference object
author Morán, Marina
Balladini, Javier
Rexachs del Rosario, Dolores
Rucci, Enzo
title Some Issues to Consider in the Management of Energy Consumption in HPC Systems with Fault Tolerance
publishDate 2022
url http://sedici.unlp.edu.ar/handle/10915/140642