Contribution to the study and the design of reinforcement functions

We have studied the Reinforcement Function Design Process in two steps. For the first one we have considered the translation of a natural language description into an instance of our proposed Reinforcement Function General Expression. For the second step, we have gone deeply into the tuning of the p...

Bibliographic Details
Main author:	Santos, Juan Miguel
Format:	Article
Language:	English
Published:	2002
Subjects:	Computer Science; Reinforcement function; Reinforcement learning; robot learning; autonomous robot; behavior-based approach
Online access:	http://sedici.unlp.edu.ar/handle/10915/135299 https://publicaciones.sadio.org.ar/index.php/EJS/article/view/111
Contributed by:	SEDICI (UNLP), Universidad Nacional de La Plata

id	I19-R120-10915-135299
record_format	dspace
institution	Universidad Nacional de La Plata
institution_str	I-19
repository_str	R-120
collection	SEDICI (UNLP)
language	English
topic	Computer Science; Reinforcement function; Reinforcement learning; robot learning; autonomous robot; behavior-based approach
description	We have studied the Reinforcement Function Design Process in two steps. For the first one we have considered the translation of a natural language description into an instance of our proposed Reinforcement Function General Expression. For the second step, we have gone deeply into the tuning of the parameters in this expression. This allowed us to obtain optimal definitions of the reinforcement function (relative to exploration). Since the General Expression is based on constraints, we have identified them according to the type of state variable estimator on which they act, in particular: position and velocity. Using a particular, but representative, Reinforcement Function (RF) expression, we study the relation between the Sum of each reinforcement type and the RF parameters during the exploration phase of the learning. For linear relations, we propose an analytic method to obtain the RF parameter values (no experimentation required). For non-linear, but monotonic, relations, we propose the Update Parameter Algorithm (UPA) and show that UPA can efficiently adjust the proportion of negative and positive reinforcements received during the exploratory phase of the learning. Additionally, we study the feasibility and consequences of adapting the RF during the learning process so as to improve the learning convergence of the system. Dynamic-UPA allows the whole learning process to maintain a desired ratio of positive and negative rewards. Thus, we introduce an approach to address the exploration-exploitation dilemma, a necessary step for efficient Reinforcement Learning. We show, with several experiments involving robots (mobile and arm), the performance of the proposed design methods. Finally, we emphasize the main conclusions and present some future directions of research.
format	Article
author	Santos, Juan Miguel
title	Contribution to the study and the design of reinforcement functions
publishDate	2002
url	http://sedici.unlp.edu.ar/handle/10915/135299 https://publicaciones.sadio.org.ar/index.php/EJS/article/view/111
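
The description above says that, when the relation between an RF parameter and the share of positive reinforcements is non-linear but monotonic, the parameter can be adjusted until a desired proportion is reached. This record does not give the details of UPA, so the following is only a minimal sketch of that general idea using plain bisection, not the author's algorithm. The reinforcement function, the threshold `theta`, the synthetic `readings`, and all function names are assumptions made for illustration.

```python
import random


def reinforcement(distance, theta):
    """Hypothetical RF: +1 if the distance estimate meets the constraint, else -1."""
    return 1 if distance >= theta else -1


def positive_ratio(theta, readings):
    """Share of exploration readings that receive a positive reinforcement."""
    positives = sum(1 for d in readings if reinforcement(d, theta) > 0)
    return positives / len(readings)


def tune_threshold(readings, target, lo, hi, tol=1e-3, max_iter=60):
    """Bisection on theta; relies on positive_ratio being non-increasing in theta."""
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if positive_ratio(mid, readings) > target:
            lo = mid  # too many positives: make the constraint stricter
        else:
            hi = mid  # too few positives: make the constraint looser
        if hi - lo < tol:
            break
    return (lo + hi) / 2


rng = random.Random(0)
# Synthetic stand-in for estimator readings gathered during random exploration.
readings = [rng.random() ** 2 for _ in range(5000)]
theta = tune_threshold(readings, target=0.3, lo=0.0, hi=1.0)
ratio = positive_ratio(theta, readings)
print(f"theta={theta:.3f}, positive ratio={ratio:.3f}")
```

Bisection is used here only because it needs nothing beyond monotonicity. A linear relation would not need any search, which matches the description's remark that the linear case can be solved analytically.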
