Continuity of optimal values and solutions for control of Markov chains with constraints

We consider in this paper constrained Markov decision processes. This type of control model has many applications in telecommunications and other fields. We address the issue of the convergence of the value and optimal policies of the problem with discounted costs, to the ones for the problem with e...

Bibliographic Details
Main authors:	Tidball, M.M.; Lombardi, A.; Pourtallier, O.; Altman, E.
Format:	JOUR
Subjects:	Control system analysis; Convergence of numerical methods; Decision theory; Markov processes; Optimization; Robustness (control systems); Sensitivity analysis; Constrained Markov decision processes (CMDPs); Ergodic structures; Optimal control systems
Online access:	http://hdl.handle.net/20.500.12110/paper_03630129_v38_n4_p1204_Tidball
Contributed by:	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA), Universidad de Buenos Aires

id	todo:paper_03630129_v38_n4_p1204_Tidball
record_format	dspace
spelling	todo:paper_03630129_v38_n4_p1204_Tidball 2023-10-03T15:27:28Z Fil:Lombardi, A. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina. JOUR info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/2.5/ar
institution	Universidad de Buenos Aires
institution_str	I-28
repository_str	R-134
collection	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic	Control system analysis; Convergence of numerical methods; Decision theory; Markov processes; Optimization; Robustness (control systems); Sensitivity analysis; Constrained Markov decision processes (CMDPs); Ergodic structures; Optimal control systems
description	We consider in this paper constrained Markov decision processes. This type of control model has many applications in telecommunications and other fields. We address the issue of the convergence of the value and optimal policies of the problem with discounted costs to the ones for the problem with expected average cost. We consider the general multichain ergodic structure. We present two stability results in this paper. We establish the continuity of optimal values and solutions, as well as some type of robustness of some suboptimal solutions, in the discount factor. Our proof relies on some general theory on continuity of values and solutions in convex optimization, which in turn rests on well-known notions of Γ-convergence.
format	JOUR
author	Tidball, M.M.; Lombardi, A.; Pourtallier, O.; Altman, E.
title	Continuity of optimal values and solutions for control of Markov chains with constraints
url	http://hdl.handle.net/20.500.12110/paper_03630129_v38_n4_p1204_Tidball