|Reference : Machine Learning Solution Methods for Multistage Stochastic Programming|
|Dissertations and theses : Doctoral thesis|
|Engineering, computing & technology : Computer science|
|Machine Learning Solution Methods for Multistage Stochastic Programming|
|Defourny, Boris [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Dép. d'électric., électron. et informat. (Inst.Montefiore) >]|
|University of Liège|
|Doctorat en sciences de l'ingénieur|
|[en] Sequential decision making under uncertainty ; scenario-tree methods ; supervised learning|
|[en] This thesis investigates the following question: Can supervised learning techniques be successfully used for finding better solutions to multistage stochastic programs? A similar question had already been posed in the context of reinforcement learning, and had led to algorithmic and conceptual advances in the field of approximate value function methods over the years. This thesis identifies several ways to exploit the combination "multistage stochastic programming/supervised learning" for sequential decision making under uncertainty.
Multistage stochastic programming is essentially the extension of stochastic programming to several recourse stages. After an introduction to multistage stochastic programming and a summary of existing approximation approaches based on scenario trees, this thesis mainly focusses on the use of supervised learning for building decision policies from scenario-tree approximations.
Two ways of exploiting learned policies in the context of the practical issues posed by the multistage stochastic programming framework are explored: the fast evaluation of performance guarantees for a given approximation, and the selection of good scenario trees. The computational efficiency of the approach allows novel investigations relative to the construction of scenario trees, from which novel insights, solution approaches and algorithms are derived. For instance, we generate and select scenario trees with random branching structures for problems over large planning horizons. Our experiments on the empirical performances of learned policies, compared to golden-standard policies, suggest that the combination of stochastic programming and machine learning techniques could also constitute a method per se for sequential decision making under uncertainty, inasmuch as learned policies are simple to use, and come with performance guarantees that can actually be quite good.
Finally, limitations of approaches that build an explicit model to represent an optimal solution mapping are studied in a simple parametric programming setting, and various insights regarding this issue are obtained.
|Systems and Modeling|
|Researchers ; Professionals ; Students|
|File(s) associated to this reference|
All documents in ORBi are protected by a user license.