Reference : Voronoi model learning for batch mode reinforcement learning
Reports : External report
Engineering, computing & technology : Computer science
http://hdl.handle.net/2268/103539
Voronoi model learning for batch mode reinforcement learning
English
Fonteneau, Raphaël [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Ernst, Damien [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids]
2010
University of Liège
[en] Batch mode reinforcement learning
[en] We consider deterministic optimal control problems with continuous state spaces in which the only available information on the system dynamics and the reward function is a set of system transitions. Each system transition consists of a state, the action taken in that state, the immediate reward observed, and the next state reached. In this context, we propose a new model-learning-type reinforcement learning (RL) algorithm for the batch-mode, finite-time, deterministic setting. The algorithm, named Voronoi reinforcement learning (VRL), uses a sample of system transitions to approximate the system dynamics and the reward function of the optimal control problem with piecewise constant functions on a Voronoi-like partition of the state-action space.
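The piecewise constant approximation described in the abstract can be illustrated with a nearest-neighbour lookup: a query state-action pair is assigned the reward and next state of the closest sampled transition, so the model is constant on each cell of the induced partition. The following Python sketch is only an illustration under stated assumptions, not the report's implementation: the Euclidean distance on the concatenated state-action vector, the function names `nearest_transition` and `voronoi_model`, and the toy sample are all hypothetical.

```python
import math


def nearest_transition(transitions, x, u):
    """Return the sampled transition whose (state, action) pair is closest to (x, u).

    Assumption: closeness is the Euclidean distance on the concatenated
    state-action vector; the report may use a different metric.
    """
    def dist(t):
        xs, us, _, _ = t
        return math.dist(tuple(xs) + tuple(us), tuple(x) + tuple(u))

    return min(transitions, key=dist)


def voronoi_model(transitions):
    """Build a piecewise constant model (reward, next state) from a batch of transitions."""
    def model(x, u):
        _, _, r, y = nearest_transition(transitions, x, u)
        return r, y

    return model


# Hypothetical toy batch: each tuple is (state, action, reward, next state).
sample = [
    ((0.0,), (1.0,), 0.0, (1.0,)),
    ((1.0,), (1.0,), 1.0, (2.0,)),
    ((1.0,), (-1.0,), 0.0, (0.0,)),
]

model = voronoi_model(sample)
reward, next_state = model((0.9,), (1.0,))  # falls in the cell of the second transition
```

A linear scan over the batch keeps the sketch short; for large samples a spatial index would be the usual choice.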
Fonds de la Recherche Scientifique (Communauté française de Belgique) - F.R.S.-FNRS

File(s) associated to this reference

Fulltext file(s):

File : technical_report.pdf
Version : Author postprint
Size : 199.45 kB
Access : Open access


All documents in ORBi are protected by a user license.