References of "François-Lavet, Vincent"
Contributions to deep reinforcement learning and its applications in smartgrids
François-Lavet, Vincent ULiege

Doctoral thesis (2017)

Reinforcement learning and its extension with deep learning have led to a field of research called deep reinforcement learning. Applications of that research have recently shown that it is possible to solve complex decision-making tasks that were previously believed extremely difficult for a computer. Yet, deep reinforcement learning requires caution and an understanding of its inner mechanisms in order to be applied successfully in different settings. As an introduction, we provide a general overview of the field of deep reinforcement learning. In the first part of this thesis, we provide an analysis of reinforcement learning in the particular setting of a limited amount of data and in the general context of partial observability. In this setting, we focus on the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data), and theoretically show that while potentially increasing the asymptotic bias, a smaller state representation decreases the risk of overfitting. An original theoretical contribution relies on expressing the quality of a state representation by bounding $L_1$ error terms of the associated belief states. We also discuss and empirically illustrate the role of other parameters to optimize the bias-overfitting tradeoff: the function approximator (in particular deep learning) and the discount factor. In addition, we investigate the specific case of the discount factor in the deep reinforcement learning setting, where additional data can be gathered through learning. In the second part of this thesis, we focus on a smartgrids application that falls in the context of a partially observable problem and where a limited amount of data is available (as studied in the first part of the thesis). We consider the case of microgrids featuring photovoltaic panels (PV) associated with both long-term (hydrogen) and short-term (batteries) storage devices. We propose a novel formalization of the problem of building and operating microgrids interacting with their surrounding environment. Under the deterministic assumption, we show how to optimally operate and size microgrids using linear programming techniques. We then show how to use deep reinforcement learning to solve the operation of microgrids under uncertainty where, at every time-step, the uncertainty comes from the lack of knowledge about future electricity consumption and weather-dependent PV production.

Peer Reviewed
On overfitting and asymptotic bias in batch reinforcement learning with partial observability
François-Lavet, Vincent ULiege; Ernst, Damien ULiege; Fonteneau, Raphaël ULiege

E-print/Working paper (2017)

This paper stands in the context of reinforcement learning with partial observability and limited data. In this setting, we focus on the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data), and theoretically show that while potentially increasing the asymptotic bias, a smaller state representation decreases the risk of overfitting. Our analysis relies on expressing the quality of a state representation by bounding $L_1$ error terms of the associated belief states. Theoretical results are empirically illustrated when the state representation is a truncated history of observations. Finally, we also discuss and empirically illustrate how using function approximators and adapting the discount factor may enhance the tradeoff between asymptotic bias and overfitting.

Peer Reviewed
Approximate Bayes Optimal Policy Search using Neural Networks
Castronovo, Michaël ULiege; François-Lavet, Vincent ULiege; Fonteneau, Raphaël ULiege et al

in Proceedings of the 9th International Conference on Agents and Artificial Intelligence (ICAART 2017) (2017, February)

Bayesian Reinforcement Learning (BRL) agents aim to maximise the expected collected rewards obtained when interacting with an unknown Markov Decision Process (MDP) while using some prior knowledge. State-of-the-art BRL agents rely on frequent updates of the belief on the MDP, as new observations of the environment are made. This offers theoretical guarantees of convergence to an optimum, but is computationally intractable, even on small-scale problems. In this paper, we present a method that circumvents this issue by training a parametric policy able to recommend an action directly from raw observations. Artificial Neural Networks (ANNs) are used to represent this policy, and are trained on the trajectories sampled from the prior. The trained model is then used online, and is able to act on the real MDP at a very low computational cost. Our new algorithm shows strong empirical performance on a wide range of test problems, and is robust to inaccuracies of the prior distribution.

Peer Reviewed
Deep Reinforcement Learning Solutions for Energy Microgrids Management
François-Lavet, Vincent ULiege; Taralla, David; Ernst, Damien ULiege et al

in European Workshop on Reinforcement Learning (EWRL 2016) (2016, December)

This paper addresses the problem of efficiently operating the storage devices in an electricity microgrid featuring photovoltaic (PV) panels with both short- and long-term storage capacities. The problem of optimally activating the storage devices is formulated as a sequential decision-making problem under uncertainty where, at every time-step, the uncertainty comes from the lack of knowledge about future electricity consumption and weather-dependent PV production. This paper proposes to address this problem using deep reinforcement learning. To this end, a specific deep learning architecture has been designed in order to extract knowledge from past consumption and production time series as well as any available forecasts. The approach is empirically illustrated in the case of a residential customer located in Belgium.

Peer Reviewed
Towards the Minimization of the Levelized Energy Costs of Microgrids using both Long-term and Short-term Storage Devices
François-Lavet, Vincent ULiege; Gemine, Quentin ULiege; Ernst, Damien ULiege et al

in Smart Grid: Networking, Data Management, and Business Models (2016)

This chapter falls within the context of the optimization of the levelized energy cost (LEC) of microgrids featuring photovoltaic panels (PV) associated with both long-term (hydrogen) and short-term (batteries) storage devices. First, we propose a novel formalization of the problem of building and operating microgrids interacting with their surrounding environment. Then we show how to optimally operate a microgrid using linear programming techniques in the context where the consumption and the production are known. It appears that this optimization technique can also be used to address the problem of optimal sizing of the microgrid, for which we propose a robust approach. These contributions are illustrated in two different settings corresponding to Belgian and Spanish data.

Peer Reviewed
Imitative Learning for Online Planning in Microgrids
Aittahar, Samy ULiege; François-Lavet, Vincent ULiege; Lodeweyckx, Stefan et al

in Woon, Wei Lee; Zeyar, Aung; Stuart, Madnick (Eds.) Data Analytics for Renewable Energy Integration (2015, December 15)

This paper aims to design an algorithm dedicated to operational planning for microgrids in the challenging case where the scenarios of production and consumption are not known in advance. Using expert knowledge obtained from solving a family of linear programs, we build a learning set for training a decision-making agent. The empirical performance of the resulting agent, in terms of Levelized Energy Cost (LEC), is compared to the expert performance obtained in the case where the scenarios are known in advance. Preliminary results are promising.

Peer Reviewed
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies
François-Lavet, Vincent ULiege; Fonteneau, Raphaël ULiege; Ernst, Damien ULiege

in NIPS 2015 Workshop on Deep Reinforcement Learning (2015, December)

Using deep neural nets as function approximators for reinforcement learning tasks has recently been shown to be very powerful for solving problems approaching real-world complexity. Using these results as a benchmark, we discuss the role that the discount factor may play in the quality of the learning process of a deep Q-network (DQN). When the discount factor progressively increases up to its final value, we empirically show that it is possible to significantly reduce the number of learning steps. When used in conjunction with a varying learning rate, we empirically show that it outperforms original DQN on several experiments. We relate this phenomenon to the instabilities of neural networks when they are used in an approximate Dynamic Programming setting. We also describe the possibility of falling into a local optimum during the learning process, thus connecting our discussion with the exploration/exploitation dilemma.
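
As a rough illustration of a discount factor that "progressively increases up to its final value", the following minimal sketch generates one value per training epoch. The update rule (geometrically shrinking the gap 1 - gamma), the function name `discount_schedule`, and all numeric values are assumptions made for illustration; the abstract does not specify them.

```python
def discount_schedule(gamma_start, gamma_final, shrink, n_epochs):
    """Return one discount factor per epoch, rising from gamma_start to gamma_final.

    Hypothetical rule: the gap (1 - gamma) is multiplied by `shrink` (< 1)
    after every epoch, and the result is capped at gamma_final.
    """
    gammas = []
    gamma = gamma_start
    for _ in range(n_epochs):
        gammas.append(gamma)
        # Shrink the distance to 1, then cap at the final value.
        gamma = min(gamma_final, 1.0 - shrink * (1.0 - gamma))
    return gammas


# Illustrative values only: start at 0.9 and stop increasing at 0.99.
schedule = discount_schedule(gamma_start=0.9, gamma_final=0.99, shrink=0.98, n_epochs=200)
```

In a DQN training loop, the value for the current epoch would be the one used when forming the bootstrapped targets.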

Peer Reviewed
Electricity storage with liquid fuels in a zone powered by 100% variable renewables
Léonard, Grégoire ULiege; François-Lavet, Vincent ULiege; Ernst, Damien ULiege et al

in Proceedings of the 12th International Conference on the European Energy Market - EEM15 (2015)

In this work, an electricity zone with 100% renewables is simulated to determine the optimal sizing of generation and storage capacities in such a zone. Using actual wind output data, the model evaluates the economic viability of a power-to-fuel storage technology that combines water electrolysis, CO2 capture and methanol synthesis. The main advantage of using methanol as an energy carrier is that liquid fuels are suitable for (long-term) energy storage thanks to their high energy density. The levelized electricity cost projection by 2050 equals 83.4 €/MWh in the base case configuration. The effects of storage round-trip efficiency and the storage unit lifetime are quantified and their impacts on the electricity cost are discussed. Additional benefits of using methanol as a fuel substitute may be taken into account in further work.

Peer Reviewed
Using approximate dynamic programming for estimating the revenues of a hydrogen-based high-capacity storage device
François-Lavet, Vincent ULiege; Fonteneau, Raphaël ULiege; Ernst, Damien ULiege

in IEEE Symposium Series on Computational Intelligence (2014)

This paper proposes a methodology to estimate the maximum revenue that can be generated by a company that operates a high-capacity storage device to buy or sell electricity on the day-ahead electricity market. The methodology exploits the Dynamic Programming (DP) principle and is specified for hydrogen-based storage devices that use electrolysis to produce hydrogen and fuel cells to generate electricity from hydrogen. Experimental results are generated using historical data of energy prices on the Belgian market. They show how the storage capacity and other parameters of the storage device influence the optimal revenue. The main conclusion drawn from the experiments is that it may be advisable to invest in large storage tanks to exploit the inter-seasonal price fluctuations of electricity.
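
The Dynamic Programming principle mentioned above can be sketched on a toy version of the problem. The model below is an assumption made for illustration and is not the paper's formulation: storage levels are integers, one unit can be bought or sold per time step, prices are known in advance, and `eta_charge` and `eta_discharge` are hypothetical conversion efficiencies standing in for electrolysis and fuel cells.

```python
def max_storage_revenue(prices, capacity, eta_charge=1.0, eta_discharge=1.0):
    """Maximum revenue from trading with a storage device, by backward induction.

    Toy model: the state is the integer storage level in 0..capacity, and at
    each time step the operator buys one unit, sells one unit, or stays idle.
    The device starts empty and leftover energy has no value.
    """
    n_levels = capacity + 1
    value = [0.0] * n_levels  # value-to-go after the last time step
    for price in reversed(prices):
        new_value = [0.0] * n_levels
        for level in range(n_levels):
            best = value[level]  # stay idle
            if level < capacity:
                # Storing one unit requires buying 1 / eta_charge units.
                best = max(best, -price / eta_charge + value[level + 1])
            if level > 0:
                # Releasing one stored unit yields eta_discharge units to sell.
                best = max(best, price * eta_discharge + value[level - 1])
            new_value[level] = best
        value = new_value
    return value[0]


# Buy at 10, sell at 50, buy at 20, sell at 60.
print(max_storage_revenue([10, 50, 20, 60], capacity=1))  # prints 80.0
```

Raising `capacity` in this sketch lets the device carry energy across longer low-price periods, which is the kind of effect the abstract's conclusion about large storage tanks refers to.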

Peer Reviewed
Simple connectome inference from partial correlation statistics in calcium imaging
Sutera, Antonio ULiege; Joly, Arnaud ULiege; François-Lavet, Vincent ULiege et al

in Soriano, Jordi; Battaglia, Demian; Guyon, Isabelle (Eds.) et al Neural Connectomics Challenge (2014)

In this work, we propose a simple yet effective solution to the problem of connectome inference in calcium imaging data. The proposed algorithm consists of two steps. First, the raw signals are processed to detect neural peak activities. Second, the degree of association between neurons is inferred from partial correlation statistics. This paper summarises the methodology that led us to win the Connectomics Challenge, proposes a simplified version of our method, and finally compares our results with those of other inference methods.
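
To illustrate why partial correlation is useful for separating direct from indirect associations, here is a minimal three-signal sketch using the standard first-order partial correlation formula. It is not the paper's pipeline: the function names are hypothetical and the signal-processing step is omitted.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(a, b, c):
    """Correlation between a and b after removing the linear effect of c."""
    r_ab = pearson(a, b)
    r_ac = pearson(a, c)
    r_bc = pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))
```

For a chain in which signal x drives y and y drives z, `pearson(x, z)` is high even though x and z are not directly linked, whereas `partial_correlation(x, z, y)` is close to zero.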

Peer Reviewed
Estimating the revenues of a hydrogen-based high-capacity storage device: methodology and results
François-Lavet, Vincent ULiege; Fonteneau, Raphaël ULiege; Ernst, Damien ULiege

in Proceedings des 9èmes Journées Francophones de Planification, Décision et Apprentissage (2014, May)

This paper proposes a methodology to estimate the maximum revenue that can be generated by a company that operates a high-capacity storage device to buy or sell electricity on the day-ahead electricity market. The methodology exploits the Dynamic Programming (DP) principle and is specified for hydrogen-based storage devices that use electrolysis to produce hydrogen and fuel cells to generate electricity from hydrogen. Experimental results are generated using historical data of energy prices on the Belgian market. They show how the storage capacity and other parameters of the storage device influence the optimal revenue. The main conclusion drawn from the experiments is that it may be worthwhile to invest in large storage tanks to exploit the inter-seasonal price fluctuations of electricity.

Peer Reviewed
An Energy-Based Variational Model of Ferromagnetic Hysteresis for Finite Element Computations
François-Lavet, Vincent ULiege; Henrotte, François; Stainier, Laurent ULiege et al

in Journal of Computational & Applied Mathematics (2013), 246

This paper proposes a macroscopic model for ferromagnetic hysteresis that is well-suited for finite element implementation. The model is readily vectorial and relies on a consistent thermodynamic formulation. In particular, the stored magnetic energy and the dissipated energy are known at all times, and not solely after the completion of closed hysteresis loops as is usually the case. The obtained incremental formulation is variationally consistent, i.e., all internal variables follow from the minimization of a thermodynamic potential.

Peer Reviewed
Vectorial Incremental Nonconservative Consistent Hysteresis model
François-Lavet, Vincent ULiege; Henrotte, François; Stainier, Laurent ULiege et al

in Hogge, Michel; Van Keer, Roger; Malengier, Benny (Eds.) et al Proceedings of the 5th International Conference on Advanced COmputational Methods in ENgineering (ACOMEN2011) (2011, November)

This paper proposes a macroscopic model for ferromagnetic hysteresis that is well-suited for finite element implementation. The model is readily vectorial and relies on a consistent thermodynamic formulation. In particular, the stored magnetic energy and the dissipated energy are known at all times, and not solely after the completion of closed hysteresis loops as is usually the case. The obtained incremental formulation is variationally consistent, i.e., all internal variables follow from the minimization of a thermodynamic potential.
