References of "Ernst, Damien"
Peer Reviewed
Apprentissage par renforcement bayésien versus recherche directe de politique hors-ligne en utilisant une distribution a priori: comparaison empirique
Castronovo, Michaël ULg; Ernst, Damien ULg; Fonteneau, Raphaël ULg

in Proceedings des 9èmes Journées Francophones de Planification, Décision et Apprentissage (2014, May)

This paper addresses the problem of sequential decision making in finite, unknown Markov decision processes (MDPs). The absence of knowledge about the MDP is modelled as a probability distribution over a set of candidate MDPs, known a priori. The performance criterion used is the expected sum of discounted rewards over an infinite trajectory. In addition to this optimality criterion, computation-time constraints are rigorously formalised. First, an "offline" phase preceding the interaction with the unknown MDP gives the agent the opportunity to exploit the prior distribution for a limited time. Then, during the interaction with the MDP, the agent must take a decision within a bounded amount of time at each time step. In this setting, we compare two decision-making strategies: OPPS, a recent approach that essentially exploits the offline phase to select a policy from a set of candidate policies, and BAMCP, a recent Bayesian online planning approach. We compare these approaches empirically in a Bayesian sense, that is, we evaluate their performance on a large set of problems drawn from a test distribution. To our knowledge, these are the first experimental tests of this kind in reinforcement learning. We study several scenarios, considering various distributions that can serve both as the prior distribution and as the test distribution. The results suggest that exploiting a prior distribution during an offline optimisation phase is a substantial advantage when the prior is accurate and/or the online time budget is small.
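
To make the offline phase concrete, the following is a minimal sketch of prior-based offline policy selection in the spirit of OPPS, not its actual implementation: `sample_mdp` (draws an MDP from the prior) and `rollout_return` (estimates a policy's discounted return on a drawn MDP) are hypothetical placeholders.

```python
def select_policy_offline(sample_mdp, rollout_return, candidate_policies,
                          n_draws=500):
    # Score each candidate by its average discounted return over
    # MDPs drawn from the prior, then keep the best scorer.
    best_policy, best_score = None, float("-inf")
    for policy in candidate_policies:
        score = sum(rollout_return(sample_mdp(), policy)
                    for _ in range(n_draws)) / n_draws
        if score > best_score:
            best_policy, best_score = policy, score
    return best_policy
```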

Microgrids and their destructuring effects on the electrical industry
Ernst, Damien ULg

Speech/Talk (2014)

Peer Reviewed
Gestion active d’un réseau de distribution d’électricité : formulation du problème et benchmark
Gemine, Quentin ULg; Ernst, Damien ULg; Cornélusse, Bertrand ULg

in Proceedings des 9èmes Journées Francophones de Planification, Décision et Apprentissage (2014, May)

To operate an electricity distribution network reliably and efficiently, that is, to respect the physical constraints while avoiding prohibitive reinforcement costs, it is becoming necessary to resort to active network management strategies. These strategies, made necessary in particular by the rise of distributed generation, rely on short-term control policies for the power levels of the devices that produce or consume electricity. While a simple solution would consist in curtailing the production of the generators, it appears more interesting to shift consumption to the appropriate moments in order to make the best use of the renewable energy sources on which these generators generally rely. Such a control mechanism nevertheless introduces a temporal coupling into the problem, leading to a nonlinear, sequential optimisation problem under uncertainty with mixed variables. To foster research in this very complex domain, we propose a generic formalisation of the active network management problem for a medium-voltage (MV) distribution network. More specifically, this formalisation takes the form of a Markov decision process. In this paper, we also present a specification of this decision model for a 75-node network and for a given set of modulation services. The resulting test instance is available at http://www.montefiore.ulg.ac.be/~anm/ and is intended to measure and compare the performance of the solution techniques that will be developed.

Peer Reviewed
Toggling a genetic switch using reinforcement learning
Sootla, Aivar; Strelkowa, Natalja; Ernst, Damien ULg et al

in Proceedings of the 9th French Meeting on Planning, Decision Making and Learning (2014, May)

In this paper, we consider the problem of optimal exogenous control of gene regulatory networks. Our approach consists in adapting an established reinforcement learning algorithm called the fitted Q iteration. This algorithm infers the control law directly from the measurements of the system’s response to external control inputs without the use of a mathematical model of the system. The measurement data set can either be collected from wet-lab experiments or artificially created by computer simulations of dynamical models of the system. The algorithm is applicable to a wide range of biological systems due to its ability to deal with nonlinear and stochastic system dynamics. To illustrate the application of the algorithm to a gene regulatory network, the regulation of the toggle switch system is considered. The control objective of this problem is to drive the concentrations of two specific proteins to a target region in the state space.
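
For readers unfamiliar with the method, here is a generic sketch of fitted Q iteration with an ensemble-of-trees regressor (the approximator used in the original fitted Q iteration work of Ernst, Geurts and Wehenkel), assuming a finite action set and a batch of one-step transitions (x, u, r, x_next); it is not the toggle-switch-specific implementation from the paper.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def fitted_q_iteration(transitions, gamma=0.95, n_iterations=50):
    # transitions: list of (x, u, r, x_next) with x, x_next 1-D arrays
    # and u drawn from a finite action set.
    actions = sorted({u for _, u, _, _ in transitions})
    X = np.array([np.append(x, u) for x, u, _, _ in transitions])
    rewards = np.array([r for _, _, r, _ in transitions])
    next_states = [x_next for _, _, _, x_next in transitions]
    model = None
    for _ in range(n_iterations):
        if model is None:
            targets = rewards                     # Q_1 = immediate reward
        else:
            # Bellman backup: r + gamma * max over u' of Q_k(x', u')
            q_next = np.column_stack([
                model.predict(np.array([np.append(s, u) for s in next_states]))
                for u in actions])
            targets = rewards + gamma * q_next.max(axis=1)
        model = ExtraTreesRegressor(n_estimators=50).fit(X, targets)
    return model, actions
```

A greedy controller then picks, in state x, the action maximizing model.predict on the concatenated (x, u) over the finite action set.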

Peer Reviewed
Estimating the revenues of a hydrogen-based high-capacity storage device: methodology and results
François-Lavet, Vincent ULg; Fonteneau, Raphaël ULg; Ernst, Damien ULg

in Proceedings des 9èmes Journées Francophones de Planification, Décision et Apprentissage (2014, May)

This paper proposes a methodology to estimate the maximum revenue that can be generated by a company that operates a high-capacity storage device to buy or sell electricity on the day-ahead electricity market. The methodology exploits the Dynamic Programming (DP) principle and is specified for hydrogen-based storage devices that use electrolysis to produce hydrogen and fuel cells to generate electricity from hydrogen. Experimental results are generated using historical data of energy prices on the Belgian market. They show how the storage capacity and other parameters of the storage device influence the optimal revenue. The main conclusion drawn from the experiments is that it may be interesting to invest in large storage tanks to exploit the inter-seasonal price fluctuations of electricity.
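
The dynamic-programming principle the methodology exploits can be sketched as a backward induction over discretized storage levels; the charge/discharge efficiencies, the rate limit, and the discretization below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def max_storage_revenue(prices, capacity, power, eta_in=0.7, eta_out=0.5,
                        n_levels=101):
    levels = np.linspace(0.0, capacity, n_levels)
    value = np.zeros(n_levels)            # value-to-go after the last hour
    for price in reversed(prices):
        new_value = np.full(n_levels, -np.inf)
        for i, s in enumerate(levels):
            for j, s_next in enumerate(levels):
                delta = s_next - s        # change in stored energy [MWh]
                if abs(delta) > power:
                    continue              # electrolyser / fuel-cell rate limit
                if delta >= 0:            # buy power, electrolyse to hydrogen
                    cash = -price * delta / eta_in
                else:                     # burn hydrogen, sell power
                    cash = -price * delta * eta_out
                new_value[i] = max(new_value[i], cash + value[j])
        value = new_value
    return value[0]                       # start from an empty tank
```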

Peer Reviewed
Optimized look-ahead tree policies: a bridge between look-ahead tree policies and direct policy search
Jung, Tobias ULg; Wehenkel, Louis ULg; Ernst, Damien ULg et al

in International Journal of Adaptive Control and Signal Processing (2014), 28(3-5), 255-289

Direct policy search (DPS) and look-ahead tree (LT) policies are two popular techniques for solving difficult sequential decision-making problems. They both are simple to implement, widely applicable without making strong assumptions on the structure of the problem, and capable of producing high performance control policies. However, computationally both of them are, each in their own way, very expensive. DPS can require huge offline resources (effort required to obtain the policy) to first select an appropriate space of parameterized policies that works well for the targeted problem, and then to determine the best values of the parameters via global optimization. LT policies do not require any offline resources; however, they typically require huge online resources (effort required to calculate the best decision at each step) in order to grow trees of sufficient depth. In this paper, we propose optimized look-ahead trees (OLT), a model-based policy learning scheme that lies at the intersection of DPS and LT. In OLT, the control policy is represented indirectly through an algorithm that at each decision step develops, as in LT using a model of the dynamics, a small look-ahead tree until a prespecified online budget is exhausted. Unlike LT, the development of the tree is not driven by a generic heuristic; rather, the heuristic is optimized for the target problem and implemented as a parameterized node scoring function learned offline via DPS. We experimentally compare OLT with pure DPS and pure LT variants on optimal control benchmark domains. The results show that the LT-based representation is a versatile way of compactly representing policies in a DPS scheme (which results in OLT being easier to tune and having lower offline complexity than pure DPS); while at the same time, DPS helps to significantly reduce the size of the look-ahead trees that are required to take high-quality decisions (which results in OLT having lower online complexity than pure LT). Moreover, OLT produces overall better performing policies than pure DPS and pure LT and also results in policies that are robust with respect to perturbations of the initial conditions.
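
As a rough illustration of the OLT decision step, the sketch below grows a best-first look-ahead tree under a learned node-scoring heuristic; `model` (a deterministic transition model returning (next_state, reward)) and `score` (the parameterized heuristic that DPS would tune offline) are hypothetical interfaces, not the paper's code.

```python
import heapq

def olt_decision(state, model, score, actions, budget=200, gamma=0.95):
    # Expand the highest-scoring frontier node until the online budget
    # is spent, then return the root action of the best-return node.
    frontier, tie = [], 0
    best_return, best_action = float("-inf"), actions[0]
    for a in actions:
        s, r = model(state, a)
        heapq.heappush(frontier, (-score(s, r, 1), tie, (s, r, 1, a)))
        tie += 1
    for _ in range(budget):
        if not frontier:
            break
        _, _, (s, ret, depth, root_a) = heapq.heappop(frontier)
        if ret > best_return:
            best_return, best_action = ret, root_a
        for a in actions:
            s2, r = model(s, a)
            heapq.heappush(frontier, (-score(s2, r, depth + 1), tie,
                                      (s2, ret + gamma**depth * r,
                                       depth + 1, root_a)))
            tie += 1
    return best_action
```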

Peer Reviewed
A learning procedure for sampling semantically different valid expressions
St-Pierre, David Lupien; Maes, Francis; Ernst, Damien ULg et al

in International Journal of Artificial Intelligence (2014), 12(1), 18-35

A large number of problems can be formalized as finding the best symbolic expression to maximize a given numerical objective. Most approaches to approximately solve such problems rely on random exploration of the search space. This paper focuses on how this random exploration should be performed to take into account expressions redundancy and invalid expressions. We propose a learning algorithm that, given the set of available constants, variables and operators and given the target finite number of trials, computes a probability distribution to maximize the expected number of semantically different, valid, generated expressions. We illustrate the use of our approach on both medium-scale and large-scale expression spaces, and empirically show that such optimized distributions significantly outperform the uniform distribution in terms of the diversity of generated expressions. We further test the method in combination with the recently proposed nested Monte-Carlo algorithm on a set of benchmark symbolic regression problems and demonstrate its interest in terms of reduction of the number of required calls to the objective function.
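
The objective being maximized can be made concrete with a small sketch: expressions are fingerprinted by their outputs on a few probe inputs, so that syntactically different but semantically equivalent expressions (e.g. x+x and 2*x) count once; `sampler`, a generator of candidate expressions as callables, is a hypothetical interface.

```python
def semantic_key(expr_fn, probes):
    # Fingerprint an expression by its (rounded) outputs on probe
    # inputs; None flags an invalid expression.
    out = []
    for x in probes:
        try:
            out.append(round(expr_fn(x), 9))
        except (ValueError, ZeroDivisionError, OverflowError):
            return None
    return tuple(out)

def diversity(sampler, n_trials=1000, probes=(0.3, 1.7, -2.2)):
    # Count semantically different, valid expressions obtained in a
    # fixed number of trials -- the quantity the learned distribution
    # is trained to maximize.
    seen = set()
    for _ in range(n_trials):
        key = semantic_key(sampler(), probes)
        if key is not None:
            seen.add(key)
    return len(seen)
```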

L'invité - Damien Ernst - "Nous allons vers une globalisation du marché de l'électricité"
Ernst, Damien ULg

Article for general public (2014)

In December 2013, Damien Ernst, Professor at the ULg, gave a talk at the CESW entitled «Vers une globalisation du marché de l’électricité. Quel rôle pour les acteurs du secteur belge de l’électricité?». Damien Ernst is a privileged observer of the Belgian energy sector, and more particularly of everything concerning the electricity sector. The author of numerous publications and studies, he has notably examined the prospects of renewable energy in Belgium. Damien Ernst is the guest of issue 120 of the magazine Wallonie. In his interview, he explains why the globalisation of the electricity market is inevitable and what its consequences will be, for the companies of the sector and for Wallonia.

Active network management for electrical distribution systems: problem formulation and benchmark
Gemine, Quentin ULg; Ernst, Damien ULg; Cornélusse, Bertrand ULg

E-print/Working paper (2014)

In order to operate an electrical distribution network in a secure and cost-efficient way, it is necessary, due to the rise of renewable energy-based distributed generation, to develop Active Network Management (ANM) strategies. These strategies rely on short-term policies that control the power injected by generators and/or taken off by loads in order to avoid congestion or voltage problems. While simple ANM strategies would curtail the production of generators, more advanced ones would move the consumption of loads to relevant time periods to maximize the potential of renewable energy sources. However, such advanced strategies imply solving large-scale optimal sequential decision-making problems under uncertainty, something that is understandably complicated. In order to promote the development of computational techniques for active network management, we detail a generic procedure for formulating ANM decision problems as Markov decision processes. We also instantiate it for a 75-bus distribution network. The resulting test instance is available at http://www.montefiore.ulg.ac.be/~anm/ . It can be used as a test bed for comparing existing computational techniques, as well as for developing new ones. A solution technique that consists in an approximate multistage program is also illustrated on the test instance.
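
Purely to illustrate the shape of such an MDP formulation, here is a skeleton in Python; the state and action fields and the curtailment-cost reward are assumptions for illustration, not the formulation used on the benchmark.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class ANMState:
    injections: Sequence[float]   # net active power injection per bus [MW]
    wind_speed: float             # exogenous drivers of distributed generation
    irradiance: float
    period: int                   # index of the current market period

@dataclass
class ANMAction:
    curtailed: Sequence[float]    # energy curtailed per generator [MWh]
    activations: Sequence[int]    # load-modulation services activated now

def reward(state: ANMState, action: ANMAction,
           curtailment_price: float = 50.0) -> float:
    # Negative operating cost: the operator pays producers for curtailed
    # energy; voltage/congestion violations would enter as large penalties.
    return -curtailment_price * sum(action.curtailed)
```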

Peer Reviewed
Lipschitz robust control from off-policy trajectories
Fonteneau, Raphaël ULg; Ernst, Damien ULg; Boigelot, Bernard ULg et al

in Proceedings of the 53rd IEEE Conference on Decision and Control (IEEE CDC 2014) (2014)

We study the minmax optimization problem introduced in [Fonteneau et al. (2011), "Towards min max reinforcement learning", Springer CCIS, vol. 129, pp. 61-77] for computing control policies for batch mode reinforcement learning in a deterministic setting with a fixed, finite optimization horizon. First, we state that the min part of this problem is NP-hard. We then provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, can also be solved in polynomial time. We theoretically show that both relaxation schemes provide better results than those given in [Fonteneau et al. (2011)].
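
For orientation, the problem in question can be stated with illustrative notation reconstructed from the abstract (the precise formulation is in the cited paper): given a batch of one-step transitions from a deterministic Lipschitz system, one seeks

```latex
\max_{u_0,\dots,u_{T-1}} \;
\min_{(f,\rho)\,\in\,\mathcal{F}_L} \;
\sum_{t=0}^{T-1} \rho(x_t, u_t)
\qquad \text{s.t.} \quad x_{t+1} = f(x_t, u_t),
```

where \mathcal{F}_L denotes the set of Lipschitz-continuous dynamics f and reward functions \rho consistent with the observed transitions. The first relaxation scheme drops some of these consistency constraints; the second dualizes all of them, and both yield bounds computable in polynomial time.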

Power system transient stability preventive and emergency control
Ruiz-Vega, Daniel; Wehenkel, Louis ULg; Ernst, Damien ULg et al

in Savulescu, Savu (Ed.) Real-Time Stability in Power Systems 2nd Edition (2014)

A general approach to real-time transient stability control is described, yielding various complementary techniques: pure preventive, open-loop emergency, and closed-loop emergency controls. Recent progress on a global transient-stability-constrained optimal power flow is presented, yielding a scalable nonlinear programming formulation that allows near-optimal preventive control decisions to be taken with a computing budget corresponding to only a few runs of standard optimal power flow and time-domain simulations. These complementary techniques meet the stringent conditions imposed by real-life applications.

Peer Reviewed
Apprentissage par renforcement batch fondé sur la reconstruction de trajectoires artificielles
Fonteneau, Raphaël ULg; Murphy, Susan A.; Wehenkel, Louis ULg et al

in Proceedings of the 9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA 2014) (2014)

This paper is set in the framework of batch-mode reinforcement learning, whose central problem is to learn, from a set of trajectories, a decision policy optimising a given criterion. We more specifically consider problems with continuous state spaces, for which classical solution schemes rely on function approximators. This paper proposes an alternative based on the reconstruction of "artificial trajectories", which offers a new angle on the classical problems of batch-mode reinforcement learning.
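
A minimal sketch of the trajectory-reconstruction idea, assuming transitions are given as (x, u, r, x_next) tuples and `dist` is a metric on the state space; the estimators the paper builds on top of such trajectories are not reproduced here.

```python
def artificial_trajectory(transitions, x0, horizon, dist):
    # Chain observed one-step transitions: at each step, reuse the
    # sample whose start state is closest to the current (possibly
    # never-visited) state, then jump to its recorded successor.
    x, pieces = x0, []
    for _ in range(horizon):
        x_s, u, r, x_next = min(transitions, key=lambda s: dist(s[0], x))
        pieces.append((x_s, u, r, x_next))
        x = x_next
    return pieces
```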

The Global Grid
Ernst, Damien ULg

Speech/Talk (2013)

Peer Reviewed
On periodic reference tracking using batch-mode reinforcement learning with application to gene regulatory network control
Sootla, Aivar; Strelkowa, Natalja; Ernst, Damien ULg et al

in Proceedings of the 52nd Annual Conference on Decision and Control (CDC 2013) (2013, December)

In this paper, we consider the periodic reference tracking problem in the framework of batch-mode reinforcement learning, which studies methods for solving optimal control problems from the sole knowledge of a set of trajectories. In particular, we extend an existing batch-mode reinforcement learning algorithm, known as Fitted Q Iteration, to the periodic reference tracking problem. The presented periodic reference tracking algorithm explicitly exploits a priori knowledge of the future values of the reference trajectory and its periodicity. We discuss the properties of our approach and illustrate it on the problem of reference tracking for a synthetic biology gene regulatory network known as the generalised repressilator. This system can produce decaying but long-lived oscillations, which makes it an interesting application for the tracking problem.
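
One simple way to exploit a known periodic reference inside a batch RL scheme, sketched below under assumed interfaces, is to augment the state with the phase of the reference so that a standard Fitted Q Iteration routine (such as the one sketched earlier) implicitly knows the future reference values; this illustrates the idea, not the paper's algorithm.

```python
import numpy as np

def augment_with_phase(trajectory, reference, period):
    # trajectory: time-ordered (x, u, x_next) samples from one run;
    # reference: array of length `period` holding the target to track.
    augmented = []
    for t, (x, u, x_next) in enumerate(trajectory):
        phase, nxt = t % period, (t + 1) % period
        r = -np.linalg.norm(np.asarray(x_next) - reference[nxt])
        augmented.append((np.append(x, phase), u, r,
                          np.append(x_next, nxt)))
    return augmented
```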

Peer Reviewed
An efficient algorithm for the provision of a day-ahead modulation service by a load aggregator
Mathieu, Sébastien ULg; Ernst, Damien ULg; Louveaux, Quentin ULg

in Proceedings of the 4th European Innovative Smart Grid Technologies (ISGT) (2013, October)

This article studies a decision-making problem faced by an aggregator willing to offer a load modulation service to a Transmission System Operator. This service is contracted one day ahead and consists in a load modulation option, which can be called once per day. The option specifies the range of a potential modification on the demand of the loads within a certain time interval. The specific case where the loads can be modeled by a generic tank model is considered. Under this assumption, the problem of maximizing the range of the load modulation service can be formulated as a mixed integer linear programming problem. A novel heuristic method is proposed to solve this problem in a computationally efficient manner. This method is tested on a set of problems. The results show that this approach can be orders of magnitude faster than CPLEX without significantly degrading the solution accuracy.
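
To fix ideas, here is a toy single-load linear program built on a generic tank model, using the PuLP library; it maximizes the downward modulation that remains feasible if the option is called, and is a simplification of the paper's mixed integer formulation (which handles many loads and both modulation directions).

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum

def downward_modulation_range(T, e0, e_max, p_max, demand, t_mod):
    # Baseline consumption p[t] fills an energy "tank" of capacity e_max
    # that serves an exogenous demand; the level must stay in [0, e_max]
    # both in the baseline case and if the option is called at t_mod.
    prob = LpProblem("modulation_range", LpMaximize)
    p = [LpVariable(f"p_{t}", 0, p_max) for t in range(T)]
    m = LpVariable("modulation", 0, p_max)   # offered downward range
    prob += m                                # objective: widest range
    prob += p[t_mod] - m >= 0                # modulated power stays >= 0
    for t in range(T):
        level = e0 + lpSum(p[: t + 1]) - sum(demand[: t + 1])
        prob += level >= 0                   # tank bounds, baseline case
        prob += level <= e_max
        if t >= t_mod:
            prob += level - m >= 0           # tank bounds if option called
    prob.solve()
    return m.value()
```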

GREDOR
Ernst, Damien ULg

Speech/Talk (2013)
