Browse ORBi by ORBi project

A Gaussian mixture approach to model stochastic processes in power systems
Gemine, Quentin; Cornélusse, Bertrand; Glavic, Mevludin et al.
in Proceedings of the 19th Power Systems Computation Conference (PSCC'16) (in press)
Probabilistic methods are emerging for operating electrical networks, driven by the integration of renewable generation. We present an algorithm that models a stochastic process as a Markov process using a multivariate Gaussian Mixture Model, as well as a model selection technique to search for an adequate Markov order and number of components. The main motivation is to sample future trajectories of these processes from their last available observations (i.e. measurements). An accurate model that can generate these synthetic trajectories is critical for applications such as security analysis or decision making based on lookahead models. The proposed approach is evaluated in a lookahead security analysis framework, i.e. by estimating the probability that future system states respect operational constraints. The evaluation is performed using a 33-bus distribution test system, for power consumption and wind speed processes. Empirical results show that the GMM approach slightly outperforms an ARMA approach.

Towards the Minimization of the Levelized Energy Costs of Microgrids using both Long-term and Short-term Storage Devices
François-Lavet, Vincent; Gemine, Quentin; Ernst, Damien et al.
in Smart Grid: Networking, Data Management, and Business Models (2016)
This chapter falls within the context of the optimization of the levelized energy cost (LEC) of microgrids featuring photovoltaic panels (PV) associated with both long-term (hydrogen) and short-term (batteries) storage devices. First, we propose a novel formalization of the problem of building and operating microgrids interacting with their surrounding environment. Then we show how to optimally operate a microgrid using linear programming techniques in the context where the consumption and the production are known. It appears that this optimization technique can also be used to address the problem of optimal sizing of the microgrid, for which we propose a robust approach. These contributions are illustrated in two different settings corresponding to Belgian and Spanish data.

Artificial Intelligence and Energy
Cornélusse, Bertrand; Fonteneau, Raphaël
Conference (2016, February 02)

Decision Making from Confidence Measurement on the Reward Growth using Supervised Learning: A Study Intended for Large-Scale Video Games
Taralla, David; Qiu, Zixiao; Sutera, Antonio et al.
in Proceedings of the 8th International Conference on Agents and Artificial Intelligence (ICAART 2016) - Volume 2 (2016, February)
Video games have become more and more complex over the past decades. Today, players wander in visually rich and option-rich environments, and each choice they make, at any given time, can have a combinatorial number of consequences. However, modern artificial intelligence is still usually hard-coded, and as the game environments become increasingly complex, this hard-coding becomes exponentially difficult.
Recent research has started to let autonomous video game agents learn instead of being taught, which makes them more intelligent. This contribution falls under this very perspective, as it aims to develop a framework for the generic design of autonomous agents for large-scale video games. We consider a class of games for which expert knowledge is available to define a state quality function that indicates how close an agent is to its objective. The decision making policy is based on a confidence measurement on the growth of the state quality function, computed by a supervised learning classification model. Additionally, no stratagems aiming to reduce the action space are used. As a proof of concept, we tested this simple approach on the collectible card game Hearthstone and obtained encouraging results.

Imitative Learning for Online Planning in Microgrids
Aittahar, Samy; François-Lavet, Vincent; et al.
in Woon, Wei Lee; Zeyar, Aung; Stuart, Madnick (Eds.) Data Analytics for Renewable Energy Integration (2015, December 15)
This paper aims to design an algorithm dedicated to operational planning for microgrids in the challenging case where the scenarios of production and consumption are not known in advance. Using expert knowledge obtained from solving a family of linear programs, we build a learning set for training a decision-making agent. The empirical performance, in terms of Levelized Energy Cost (LEC), of the obtained agent is compared to the expert performance obtained in the case where the scenarios are known in advance. Preliminary results are promising.

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies
François-Lavet, Vincent; Fonteneau, Raphaël; Ernst, Damien
in NIPS 2015 Workshop on Deep Reinforcement Learning (2015, December)
Using deep neural nets as function approximators for reinforcement learning tasks has recently been shown to be very powerful for solving problems approaching real-world complexity. Using these results as a benchmark, we discuss the role that the discount factor may play in the quality of the learning process of a deep Q-network (DQN). When the discount factor progressively increases up to its final value, we empirically show that it is possible to significantly reduce the number of learning steps. When used in conjunction with a varying learning rate, we empirically show that it outperforms original DQN on several experiments. We relate this phenomenon to the instabilities of neural networks when they are used in an approximate Dynamic Programming setting. We also describe the possibility of falling into a local optimum during the learning process, thus connecting our discussion with the exploration/exploitation dilemma.

On the Dynamics of the Deployment of Renewable Energy Production Capacities
Fonteneau, Raphaël
Speech/Talk (2015)

Benchmarking for Bayesian Reinforcement Learning
Castronovo, Michaël; Ernst, Damien; Couëtoux, Adrien et al.
E-print/Working paper (2015)
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment and using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but even though a few toy examples exist in the literature, there are still no extensive or rigorous benchmarks to compare them. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.

Une histoire d'énergie: équations et transition
Fonteneau, Raphaël
Speech/Talk (2015)

Artificial Intelligence in Video Games: Towards a Unified Framework
Safadi, Firas; Fonteneau, Raphaël; Ernst, Damien
in International Journal of Computer Games Technology (2015), 2015
With modern video games frequently featuring sophisticated and realistic environments, the need for smart and comprehensive agents that understand the various aspects of complex environments is pressing.
Since video game AI is often specifically designed for each game, video game AI tools currently focus on allowing video game developers to quickly and efficiently create specific AI. One issue with this approach is that it does not efficiently exploit the numerous similarities that exist between video games not only of the same genre, but of different genres too, which makes it difficult to handle the many aspects of a complex environment independently for each video game. Inspired by the human ability to detect analogies between games and apply similar behavior on a conceptual level, this paper suggests an approach based on the use of a unified conceptual framework to enable the development of conceptual AI which relies on conceptual views and actions to define basic yet reasonable and robust behavior. The approach is illustrated using two video games, Raven and StarCraft: Brood War.

From Bad Models to Good Policies: an Intertwined Story about Energy and Reinforcement Learning
Fonteneau, Raphaël
Speech/Talk (2014)
Batch mode reinforcement learning is a subclass of reinforcement learning for which the decision making problem has to be addressed without a model, using trajectories only (no model, no simulator, and no additional interactions with the actual system). In this setting, we propose a discussion about a minmax approach to generalization for deterministic problems with continuous state space. This approach aims at computing robust policies considering the fact that the sample of trajectories may be arbitrarily bad.
This discussion will be intertwined with the description of a fascinating batch mode reinforcement learning-type problem with trajectories of societies as input, and for which crucial good decisions have to be taken: the energy transition.

Using approximate dynamic programming for estimating the revenues of a hydrogen-based high-capacity storage device
François-Lavet, Vincent; Fonteneau, Raphaël; Ernst, Damien
in IEEE Symposium Series on Computational Intelligence (2014)
This paper proposes a methodology to estimate the maximum revenue that can be generated by a company that operates a high-capacity storage device to buy or sell electricity on the day-ahead electricity market. The methodology exploits the Dynamic Programming (DP) principle and is specified for hydrogen-based storage devices that use electrolysis to produce hydrogen and fuel cells to generate electricity from hydrogen. Experimental results are generated using historical data of energy prices on the Belgian market. They show how the storage capacity and other parameters of the storage device influence the optimal revenue. The main conclusion drawn from the experiments is that it may be advisable to invest in large storage tanks to exploit the inter-seasonal price fluctuations of electricity.

Une Histoire d'Energie : Equations et Transition - Energy Stories, Equations and Transition
Fonteneau, Raphaël
Speech/Talk (2014)
Having access to abundant energy is a key component of our societies' lifestyle. The energy transition amounts to abolishing the dependency of our societies on finite and non-renewable energy resources. Is this manageable? Applied mathematics may help us imagine the future...

Mathematical Modeling of HIV Dynamics After Antiretroviral Therapy Initiation: A Review
et al.
in BioResearch Open Access (2014), 3(5), 233-241
This review shows the potential ground-breaking impact that mathematical tools may have in the analysis and the understanding of HIV dynamics. In the first part, early diagnosis of immunological failure is inferred from the estimation of certain parameters of a mathematical model of the HIV infection dynamics. This method is supported by clinical research results from an original clinical trial: data from just 1 month after therapy initiation are used to carry out the model identification. The diagnosis is shown to be consistent with results from monitoring of the patients after 6 months. In the second part of this review, prospective research results are given for the design of individual anti-HIV treatments optimizing the recovery of the immune system and minimizing side effects. In this respect, two methods are discussed. The first one combines HIV population dynamics with pharmacokinetics and pharmacodynamics models to generate drug treatments using impulsive control systems. The second one is based on optimal control theory and uses a recently published differential equation to model the side effects produced by highly active antiretroviral therapies.
The main advantage of these revisited methods is that the drug treatment is computed directly in amounts of drugs, which is easier for physicians and patients to interpret.

Mathematical modeling of HIV dynamics after antiretroviral therapy initiation: A clinical research study
et al.
in AIDS Research and Human Retroviruses (2014), 30(9), 831-834
Immunological failure is identified from the estimation of certain parameters of a mathematical model of the HIV infection dynamics. This identification is supported by clinical research results from an original clinical trial. Standard clinical data were collected from infected patients starting Highly Active Anti-Retroviral Therapy (HAART), just one month after therapy initiation, and were used to carry out the model identification. The early diagnosis is shown to be consistent with the monitoring of the patients after six months.

Energy Transition: How Can We Succeed?
Fonteneau, Raphaël
Speech/Talk (2014)
How can we optimize our chance to succeed in the energy transition? 70% to 80% of our energy consumption still comes from nonrenewable sources. Switching to a model that would not depend on nonrenewable energy itself requires, at least for the moment, the use of nonrenewable energy. This talk formalizes the energy transition as a decision making problem under the constraint that a finite budget of nonrenewable energy is given.
The goal is to efficiently allocate such a budget to get rid of our dependence on nonrenewable energy before we run out of it.

Bayes Adaptive Reinforcement Learning versus Off-line Prior-based Policy Search: an Empirical Comparison
Castronovo, Michaël; Ernst, Damien; Fonteneau, Raphaël
in Proceedings of the 23rd annual machine learning conference of Belgium and the Netherlands (BENELEARN 2014) (2014, June)
This paper addresses the problem of decision making in unknown finite Markov decision processes (MDPs). The uncertainty about the MDPs is modeled using a prior distribution over a set of candidate MDPs. The performance criterion is the expected sum of discounted rewards collected over an infinite length trajectory. Time constraints are defined as follows: (i) an off-line phase with a given time budget can be used to exploit the prior distribution and (ii) at every time step of the on-line phase, decisions have to be computed within a given time budget. In this setting, we compare two decision-making strategies: OPPS, a recently proposed meta-learning scheme which mainly exploits the off-line phase to perform policy search, and BAMCP, a state-of-the-art model-based Bayesian reinforcement learning algorithm, which mainly exploits the on-line time budget. We empirically compare these approaches in a real Bayesian setting by computing their performances over a large set of problems. To the best of our knowledge, it is the first time that this is done in the reinforcement learning literature. Several settings are considered by varying the prior distribution and the distribution from which test problems are drawn.
The main finding of these experiments is that there may be a significant benefit of having an off-line prior-based optimization phase in the case of informative and accurate priors, especially when on-line time constraints are tight.

Apprentissage par renforcement bayésien versus recherche directe de politique hors-ligne en utilisant une distribution a priori: comparaison empirique
Castronovo, Michaël; Ernst, Damien; Fonteneau, Raphaël
in Proceedings des 9èmes Journée Francophones de Planification, Décision et Apprentissage (2014, May)
This paper addresses the problem of sequential decision making in finite, unknown Markov decision processes (MDPs). The lack of knowledge about the MDP is modeled as a probability distribution, known a priori, over a set of candidate MDPs. The performance criterion is the expected sum of discounted rewards over an infinite trajectory. Alongside the optimality criterion, the constraints on computation time are rigorously formalized. First, an "off-line" phase preceding the interaction with the unknown MDP gives the agent the opportunity to exploit the prior distribution for a limited time. Then, during the interaction phase with the MDP, the agent must make a decision at each time step within a fixed, constrained time span. In this setting, we compare two decision-making strategies: OPPS, a recent approach that mainly exploits the off-line phase to select a policy from a set of candidate policies, and BAMCP, a recent Bayesian on-line planning approach.
We empirically compare these approaches in a Bayesian setting, in the sense that we evaluate their performance on a large set of problems drawn from a test distribution. To the best of our knowledge, these are the first experimental tests of this kind in reinforcement learning. We study several cases by considering various distributions that can be used both as prior distributions and as test distributions. The results suggest that exploiting a prior distribution during an off-line optimization phase is a significant advantage for accurate prior distributions and/or under small on-line time budgets.

Estimating the revenues of a hydrogen-based high-capacity storage device: methodology and results
François-Lavet, Vincent; Fonteneau, Raphaël; Ernst, Damien
in Proceedings des 9èmes Journée Francophones de Planification, Décision et Apprentissage (2014, May)
This paper proposes a methodology to estimate the maximum revenue that can be generated by a company that operates a high-capacity storage device to buy or sell electricity on the day-ahead electricity market. The methodology exploits the Dynamic Programming (DP) principle and is specified for hydrogen-based storage devices that use electrolysis to produce hydrogen and fuel cells to generate electricity from hydrogen. Experimental results are generated using historical data of energy prices on the Belgian market. They show how the storage capacity and other parameters of the storage device influence the optimal revenue.
The main conclusion drawn from the experiments is that it may be worthwhile to invest in large storage tanks to exploit the inter-seasonal price fluctuations of electricity.

Transition énergétique : maximiser le retour énergétique à long terme
Fonteneau, Raphaël
Speech/Talk (2014)