Scientific congresses and symposiums : Unpublished conference
Engineering, computing & technology : Computer science
http://hdl.handle.net/2268/103489
Relaxation schemes for min max generalization in deterministic batch mode reinforcement learning
English
Fonteneau, Raphaël [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Ernst, Damien [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids]
Boigelot, Bernard [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique]
Louveaux, Quentin [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Système et modélisation : Optimisation discrète]
Dec-2011
No
International
4th International NIPS Workshop on Optimization for Machine Learning (OPT 2011)
December 16th, 2011
Sierra Nevada
Spain
[en] Batch mode reinforcement learning ; Min max generalization ; Non-convex optimization
[en] We study the min max optimization problem introduced in [Fonteneau, 2011] for computing policies for batch mode reinforcement learning in a deterministic setting. This problem is NP-hard. We focus on the two-stage case for which we provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, leads to a conic quadratic programming problem. Both relaxation schemes are shown to provide better results than those given in [Fonteneau, 2011].
Fonds de la Recherche Scientifique (Communauté française de Belgique) - F.R.S.-FNRS

File(s) associated to this reference

Fulltext file(s):

File | Commentary | Version | Size | Access
nips2011.pdf | | Author postprint | 189.14 kB | Open access

