Black box meta-learning intrinsic rewards for sparse-reward environments

Despite the successes and progress of deep reinforcement learning over the last decade, several challenges remain that hinder its broader application. Some fundamental aspects to improve include data efficiency, generalization capability, and ability to learn in sparse-reward environments, which oft...

Bibliographic Details
Main authors: Pappalardo, Octavio; Santos, Juan M.; Ramele, Rodrigo
Format: Conference object
Language: English
Published: 2024
Subjects: Computer Science
Online access: http://sedici.unlp.edu.ar/handle/10915/176190
Contributed by:
id I19-R120-10915-176190
record_format dspace
spelling I19-R120-10915-176190; 2025-02-06T20:05:46Z; http://sedici.unlp.edu.ar/handle/10915/176190; Black box meta-learning intrinsic rewards for sparse-reward environments; Pappalardo, Octavio; Santos, Juan M.; Ramele, Rodrigo; 2024-10; 2024; 2025-02-06T12:26:25Z; en; Computer Science; Red de Universidades con Carreras en Informática; Conference object; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); application/pdf; 24-33
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
description Despite the successes and progress of deep reinforcement learning over the last decade, several challenges remain that hinder its broader application. Some fundamental aspects to improve include data efficiency, generalization capability, and ability to learn in sparse-reward environments, which often require human-designed dense rewards. Meta-learning has emerged as a promising approach to address these issues by optimizing components of the learning algorithm to meet desired characteristics. Additionally, a different line of work has extensively studied the use of intrinsic rewards to enhance the exploration capabilities of algorithms. This work investigates how meta-learning can improve the training signal received by RL agents. The focus is on meta-learning intrinsic rewards under a framework that doesn’t rely on the use of meta-gradients. We analyze and compare this approach to the use of extrinsic rewards and a meta-learned advantage function. The developed algorithms are evaluated on distributions of continuous control tasks with both parametric and non-parametric variations, and with only sparse rewards accessible for the evaluation tasks.
format Conference object
author Pappalardo, Octavio
Santos, Juan M.
Ramele, Rodrigo
title Black box meta-learning intrinsic rewards for sparse-reward environments
publishDate 2024
url http://sedici.unlp.edu.ar/handle/10915/176190