References of "Ernst, Damien"      in Complete repository Arts & humanities   Archaeology   Art & art history   Classical & oriental studies   History   Languages & linguistics   Literature   Performing arts   Philosophy & ethics   Religion & theology   Multidisciplinary, general & others Business & economic sciences   Accounting & auditing   Production, distribution & supply chain management   Finance   General management & organizational theory   Human resources management   Management information systems   Marketing   Strategy & innovation   Quantitative methods in economics & management   General economics & history of economic thought   International economics   Macroeconomics & monetary economics   Microeconomics   Economic systems & public economics   Social economics   Special economic topics (health, labor, transportation…)   Multidisciplinary, general & others Engineering, computing & technology   Aerospace & aeronautics engineering   Architecture   Chemical engineering   Civil engineering   Computer science   Electrical & electronics engineering   Energy   Geological, petroleum & mining engineering   Materials science & engineering   Mechanical engineering   Multidisciplinary, general & others Human health sciences   Alternative medicine   Anesthesia & intensive care   Cardiovascular & respiratory systems   Dentistry & oral medicine   Dermatology   Endocrinology, metabolism & nutrition   Forensic medicine   Gastroenterology & hepatology   General & internal medicine   Geriatrics   Hematology   Immunology & infectious disease   Laboratory medicine & medical technology   Neurology   Oncology   Ophthalmology   Orthopedics, rehabilitation & sports medicine   Otolaryngology   Pediatrics   Pharmacy, pharmacology & toxicology   Psychiatry   Public health, health care sciences & services   Radiology, nuclear medicine & imaging   Reproductive medicine (gynecology, andrology, obstetrics)   Rheumatology   Surgery   Urology & nephrology   Multidisciplinary, general & others Law, criminology & political science   Civil law   Criminal law & procedure   Criminology   Economic & commercial law   European & international law   Judicial law   Metalaw, Roman law, history of law & comparative law   Political science, public administration & international relations   Public law   Social law   Tax law   Multidisciplinary, general & others Life sciences   Agriculture & agronomy   Anatomy (cytology, histology, embryology...) & physiology   Animal production & animal husbandry   Aquatic sciences & oceanology   Biochemistry, biophysics & molecular biology   Biotechnology   Entomology & pest control   Environmental sciences & ecology   Food science   Genetics & genetic processes   Microbiology   Phytobiology (plant sciences, forestry, mycology...)   Veterinary medicine & animal health   Zoology   Multidisciplinary, general & others Physical, chemical, mathematical & earth Sciences   Chemistry   Earth sciences & physical geography   Mathematics   Physics   Space science, astronomy & astrophysics   Multidisciplinary, general & others Social & behavioral sciences, psychology   Animal psychology, ethology & psychobiology   Anthropology   Communication & mass media   Education & instruction   Human geography & demography   Library & information sciences   Neurosciences & behavior   Regional & inter-regional studies   Social work & social policy   Sociology & social sciences   Social, industrial & organizational psychology   Theoretical & cognitive psychology   Treatment & clinical psychology   Multidisciplinary, general & others     Showing results 101 to 120 of 260     1 2 3 4 5 6 7 8 9 10 11     Outbound SPIT Filter with Optimal Performance GuaranteesJung, Tobias ; Martin, Sylvain ; Nassar, Mohamed et alin Computer Networks (2013), 57(7), 16301643This paper presents a formal framework for identifying and filtering SPIT calls (SPam in Internet Telephony) in an outbound scenario with provable optimal performance. In so doing, our work is largely ... [more ▼]This paper presents a formal framework for identifying and filtering SPIT calls (SPam in Internet Telephony) in an outbound scenario with provable optimal performance. In so doing, our work is largely different from related previous work: our goal is to rigorously formalize the problem in terms of mathematical decision theory, find the optimal solution to the problem, and derive concrete bounds for its expected loss (number of mistakes the SPIT filter will make in the worst case). This goal is achieved by considering an abstracted scenario amenable to theoretical analysis, namely SPIT detection in an outbound scenario with pure sources. Our methodology is to first define the cost of making an error (false positive and false negative), apply Wald’s sequential probability ratio test to the individual sources, and then determine analytically error probabilities such that the resulting expected loss is minimized. The benefits of our approach are: (1) the method is optimal (in a sense defined in the paper); (2) the method does not rely on manual tuning and tweaking of parameters but is completely self-contained and mathematically justified; (3) the method is computationally simple and scalable. These are desirable features that would make our method a component of choice in larger, autonomic frameworks. [less ▲]Detailed reference viewed: 140 (38 ULg) Optimal discovery with probabilistic expert advice: finite time analysis and macroscopic optimalityBubeck, Sébastien; Ernst, Damien ; Garivier, Aurélienin Journal of Machine Learning Research (2013), 14We consider an original problem that arises from the issue of security analysis of a power system and that we name optimal discovery with probabilistic expert advice. We address it with an algorithm based ... [more ▼]We consider an original problem that arises from the issue of security analysis of a power system and that we name optimal discovery with probabilistic expert advice. We address it with an algorithm based on the optimistic paradigm and on the Good-Turing missing mass estimator. We prove two different regret bounds on the performance of this algorithm under weak assumptions on the probabilistic experts. Under more restrictive hypotheses, we also prove a macroscopic optimality result, comparing the algorithm both with an oracle strategy and with uniform sampling. Finally, we provide numerical experiments illustrating these theoretical ﬁndings. [less ▲]Detailed reference viewed: 12 (7 ULg) Scenario Trees and Policy Selection for Multistage Stochastic Programming Using Machine LearningDefourny, Boris; Ernst, Damien ; Wehenkel, Louis in INFORMS Journal on Computing (2013), 25(3), 488-501In the context of multistage stochastic optimization problems, we propose a hybrid strategy for generalizing to nonlinear decision rules, using machine learning, a finite data set of constrained vector ... [more ▼]In the context of multistage stochastic optimization problems, we propose a hybrid strategy for generalizing to nonlinear decision rules, using machine learning, a finite data set of constrained vector-valued recourse decisions optimized using scenario-tree techniques from multistage stochastic programming. The decision rules are based on a statistical model inferred from a given scenario-tree solution and are selected by out-of-sample simulation given the true problem. Because the learned rules depend on the given scenario tree, we repeat the procedure for a large number of randomly generated scenario trees and then select the best solution (policy) found for the true problem. The scheme leads to an ex post selection of the scenario tree itself. Numerical tests evaluate the dependence of the approach on the machine learning aspects and show cases where one can obtain near-optimal solutions, starting with a “weak” scenario-tree generator that randomizes the branching structure of the trees. [less ▲]Detailed reference viewed: 187 (25 ULg) Généralisation Min Max pour l'Apprentissage par Renforcement Batch et Déterministe : Relaxations pour le Cas Général T EtapesFonteneau, Raphaël ; Ernst, Damien ; Boigelot, Bernard et alin 8èmes Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes (JFPDA'13) (2013)Cet article aborde le problème de généralisation minmax dans le cadre de l'apprentissage par renforcement batch et déterministe. Le problème a été originellement introduit par [Fonteneau, 2011], et il a ... [more ▼]Cet article aborde le problème de généralisation minmax dans le cadre de l'apprentissage par renforcement batch et déterministe. Le problème a été originellement introduit par [Fonteneau, 2011], et il a déjà été montré qu'il est NP-dur. Deux schémas de relaxation pour le cas deux étapes ont été présentés aux JFPDA'12, et ce papier présente une généralisation de ces schémas au cas T étapes. Le premier schéma fonctionne en éliminant des contraintes afin d'obtenir un problème soluble en temps polynomial. Le deuxième schéma est une relaxation lagrangienne conduisant également à un problème soluble en temps polynomial. On montre théoriquement que ces deux schémas permettent d'obtenir de meilleurs résultats que ceux proposés par [Fonteneau, 2011]. [less ▲]Detailed reference viewed: 50 (7 ULg) Min max generalization for deterministic batch mode reinforcement learning: relaxation schemesFonteneau, Raphaël ; Ernst, Damien ; Boigelot, Bernard et alin SIAM Journal on Control & Optimization (2013), 51(5), 33553385We study the min max optimization problem introduced in Fonteneau et al. [Towards min max reinforcement learning, ICAART 2010, Springer, Heidelberg, 2011, pp. 61–77] for computing policies for batch mode ... [more ▼]We study the min max optimization problem introduced in Fonteneau et al. [Towards min max reinforcement learning, ICAART 2010, Springer, Heidelberg, 2011, pp. 61–77] for computing policies for batch mode reinforcement learning in a deterministic setting with ﬁxed, ﬁnite time horizon. First, we show that the min part of this problem is NP-hard. We then provide two relaxation schemes. The ﬁrst relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, can also be solved in polynomial time. We also theoretically prove and empirically illustrate that both relaxation schemes provide better results than those given in [Fonteneau et al., 2011, as cited above]. [less ▲]Detailed reference viewed: 62 (16 ULg) Active network management: planning under uncertainty for exploiting load modulationGemine, Quentin ; Karangelos, Efthymios ; Ernst, Damien et alin Proceedings of the 2013 IREP Symposium - Bulk Power Systems Dynamics and Control - IX (2013)This paper addresses the problem faced by a distribution system operator (DSO) when planning the operation of a network in the short-term. The problem is formulated in the context of high penetration of ... [more ▼]This paper addresses the problem faced by a distribution system operator (DSO) when planning the operation of a network in the short-term. The problem is formulated in the context of high penetration of renewable energy sources (RES) and distributed generation (DG), and when flexible demand is available. The problem is expressed as a sequential decision-making problem under uncertainty, where, in the first stage, the DSO has to decide whether or not to reserve the availability of flexible demand, and, in the subsequent stages, can curtail the generation and modulate the available flexible loads. We analyze the relevance of this formulation on a small test system, discuss the assumptions made, compare our approach to related work, and indicate further research directions. [less ▲]Detailed reference viewed: 203 (58 ULg) Stratégies d'échantillonnage pour l'apprentissage par renforcement batchFonteneau, Raphaël ; Murphy, Susan A.; Wehenkel, Louis et alin Revue d'Intelligence Artificielle [=RIA] (2013), 27(2), 171-194We propose two strategies for experiment selection in the context of batch mode reinforcement learning. The ﬁrst strategy is based on the idea that the most interesting experiments to carry out at some ... [more ▼]We propose two strategies for experiment selection in the context of batch mode reinforcement learning. The ﬁrst strategy is based on the idea that the most interesting experiments to carry out at some stage are those that are the most liable to falsify the current hypothesis about the optimal control policy. We cast this idea in a context where a policy learning algorithm and a model identiﬁcation method are given a priori. The second strategy exploits recently published methods for computing bounds on the return of control policies from a set of trajectories in order to sample the state-action space so as to be able to discriminate between optimal and non-optimal policies. Both strategies are experimentally validated, showing promising results. [less ▲]Detailed reference viewed: 58 (7 ULg) Meta-learning of Exploration/Exploitation Strategies: The Multi-Armed Bandit CaseMaes, Francis; Wehenkel, Louis ; Ernst, Damien in Filipe, Joaquim; Fred, Ana (Eds.) Agents and Artificial Intelligence: 4th International Conference, ICAART 2012, Vilamoura, Portugal, February 6-8, 2012. Revised Selected Papers (2013)The exploration/exploitation (E/E) dilemma arises naturally in many subﬁelds of Science. Multi-armed bandit problems formalize this dilemma in its canonical form. Most current research in this ﬁeld ... [more ▼]The exploration/exploitation (E/E) dilemma arises naturally in many subﬁelds of Science. Multi-armed bandit problems formalize this dilemma in its canonical form. Most current research in this ﬁeld focuses on generic solutions that can be applied to a wide range of problems. However, in practice, it is often the case that a form of prior information is available about the speciﬁc class of target problems. Prior knowledge is rarely used in current solutions due to the lack of a systematic approach to incorporate it into the E/E strategy. To address a speciﬁc class of E/E problems, we propose to proceed in three steps: (i) model prior knowledge in the form of a probability distribution over the target class of E/E problems; (ii) choose a large hypothesis space of candidate E/E strategies; and (iii), solve an optimization problem to ﬁnd a candidate E/E strategy of maximal average performance over a sample of problems drawn from the prior distribution. We illustrate this meta-learning approach with two different hypothesis spaces: one where E/E strategies are numerically parameterized and another where E/E strategies are represented as small symbolic formulas. We propose appropriate optimization algorithms for both cases. Our experiments, with two-armed “Bernoulli” bandit problems and various playing budgets, show that the metalearnt E/E strategies outperform generic strategies of the literature (UCB1, UCB1-T UNED, UCB-V, KL-UCB and epsilon-GREEDY); they also evaluate the robustness of the learnt E/E strategies, by tests carried out on arms whose rewards follow a truncated Gaussian distribution. [less ▲]Detailed reference viewed: 36 (8 ULg) Optimized Look-Ahead Trees: Extensions to Large and Continuous Action SpacesJung, Tobias ; Ernst, Damien ; Maes, Francisin Proc. of IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL'13) (2013)This paper studies look-ahead tree based control policies from the viewpoint of online decision making with constraints on the computational budget allowed per decision (expressed as number of calls to ... [more ▼]This paper studies look-ahead tree based control policies from the viewpoint of online decision making with constraints on the computational budget allowed per decision (expressed as number of calls to the generative model). We consider optimized look-ahead tree (OLT) policies, a recently introduced family of hybrid techniques, which combine the advantages of look-ahead trees (high precision) with the advantages of direct policy search (low online cost) and which are specifically designed for limited online budgets. We present two extensions of the basic OLT algorithm that on the one side allow tackling deterministic optimal control problems with large and continuous action spaces and that on the other side can also help to further reduce the online complexity. [less ▲]Detailed reference viewed: 21 (9 ULg) Biorthogonalization Techniques for Least Squares Temporal Difference LearningJung, Tobias ; Ernst, Damien Poster (2012, December 07)We consider Markov reward processes and study OLS-LSTD, a framework for selecting basis functions from a set of candidates to obtain a sparse representation of the value function in the context of least ... [more ▼]We consider Markov reward processes and study OLS-LSTD, a framework for selecting basis functions from a set of candidates to obtain a sparse representation of the value function in the context of least squares temporal difference learning. To support efficient both updating and downdating operations, OLS-LSTD uses a biorthogonal representation for the selected basis vectors. Empirical comparisons with the recently proposed MP and LARS frameworks for LSTD are made. [less ▲]Detailed reference viewed: 62 (17 ULg) Batch mode reinforcement learning based on the synthesis of artificial trajectoriesErnst, Damien Speech/Talk (2012)Detailed reference viewed: 18 (1 ULg) Optimal discovery with probabilistic expert adviceBubeck, Sébastien; Ernst, Damien ; Garivier, Aurélienin Proceedings of the 51st IEEE Conference on Decision and Control (CDC 2012) (2012, December)Motivated by issues of security analysis for power systems, we analyze a new problem, called optimal discovery with probabilistic expert advice. We address it with an algorithm based on the optimistic ... [more ▼]Motivated by issues of security analysis for power systems, we analyze a new problem, called optimal discovery with probabilistic expert advice. We address it with an algorithm based on the optimistic paradigm and the Good-Turingmissing mass estimator. We show that this strategy attains the optimal discovery rate in a macroscopic limit sense, under some assumptions on the probabilistic experts. We also provide numerical experiments suggesting that this optimal behavior may still hold under weaker assumptions. [less ▲]Detailed reference viewed: 17 (2 ULg) Cooperative frequency control with a multi-terminal high-voltage DC networkSarlette, Alain; Dai, Jing; Phulpin, Yannick et alin Automatica (2012), 48(12), 31283134We consider frequency control in power systems made of several non-synchronous AC areas connected by a multi-terminal high-voltage direct current (HVDC) grid. We propose two HVDC control schemes to make ... [more ▼]We consider frequency control in power systems made of several non-synchronous AC areas connected by a multi-terminal high-voltage direct current (HVDC) grid. We propose two HVDC control schemes to make the areas collectively react to power imbalances, so that individual areas can schedule smaller power reserves. The first scheme modifies the power injected by each area into the DC grid as a function of frequency deviations of neighboring AC areas. The second scheme changes the DC voltage of each converter as a function of its own area's frequency only, relying on the physical network to obtain a collective reaction. For both schemes, we prove convergence of the closed-loop system with heterogeneous AC areas. [less ▲]Detailed reference viewed: 225 (10 ULg) A computationally efficient algorithm for the provision of a day-ahead modulation service by a load aggregatorMathieu, Sébastien ; Karangelos, Efthymios ; Louveaux, Quentin et alPoster (2012, October 08)We study a decision making problem faced by an aggregator willing to offer a load modulation service to a Transmission System Operator (TSO). In particular, we concentrate on a day-ahead service ... [more ▼]We study a decision making problem faced by an aggregator willing to offer a load modulation service to a Transmission System Operator (TSO). In particular, we concentrate on a day-ahead service consisting of a load modulation option, which can be called by the TSO once per day. The option specifies the maximum amplitude of a potential modification on the demand of the loads within a certain time interval. We consider the specific case where the loads can be modeled by a generic tank model whose inflow depends on the power consumed by the load and outflow is assumed to be known the day before for every market period. The level of the reservoir at the beginning of the market day is also assumed to be known. We show that, under these assumptions, the problem of maximizing the amplitude of the load modulation service can be formulated as a mixed integer linear programming problem (MILP). In order to solve this problem in a computationally efficient manner we introduce a novel heuristic-method. We test this method on a set of problems and demonstrate that our approach is orders of magnitude faster than CPLEX - a state-of-the-art software for solving MILP problems - without considerably compromising the solution accuracy. [less ▲]Detailed reference viewed: 84 (19 ULg) Policy search in a space of simple closed-form formulas: towards interpretability of reinforcement learningMaes, Francis ; Fonteneau, Raphaël ; Wehenkel, Louis et alin Discovery Science 15th International Conference, DS 2012, Lyon, France, October 29-31, 2012. Proceedings (2012, October)In this paper, we address the problem of computing interpretable solutions to reinforcement learning (RL) problems. To this end, we propose a search algorithm over a space of simple losed-form formulas ... [more ▼]In this paper, we address the problem of computing interpretable solutions to reinforcement learning (RL) problems. To this end, we propose a search algorithm over a space of simple losed-form formulas that are used to rank actions. We formalize the search for a high-performance policy as a multi-armed bandit problem where each arm corresponds to a candidate policy canonically represented by its shortest formula-based representation. Experiments, conducted on standard benchmarks, show that this approach manages to determine both efﬁcient and interpretable solutions. [less ▲]Detailed reference viewed: 38 (13 ULg) Contextual Multi-armed Bandits for the Prevention of Spam in VoIP NetworksJung, Tobias ; Martin, Sylvain ; Ernst, Damien et alE-print/Working paper (2012)In this paper we argue that contextual multi-armed bandit algorithms could open avenues for designing self-learning security modules for computer networks and related tasks. The paper has two ... [more ▼]In this paper we argue that contextual multi-armed bandit algorithms could open avenues for designing self-learning security modules for computer networks and related tasks. The paper has two contributions: a conceptual one and an algorithmical one. The conceptual contribution is to formulate -- as an example -- the real-world problem of preventing SPIT (Spam in VoIP networks), which is currently not satisfyingly addressed by standard techniques, as a sequential learning problem, namely as a contextual multi-armed bandit. Our second contribution is to present CMABFAS, a new algorithm for general contextual multi-armed bandit learning that specifically targets domains with finite actions. We illustrate how CMABFAS could be used to design a fully self-learning SPIT filter that does not rely on feedback from the end-user (i.e., does not require labeled data) and report first simulation results. [less ▲]Detailed reference viewed: 120 (30 ULg) Généralisation min max pour l'apprentissage par renforcement batch et déterministe : schémas de relaxationFonteneau, Raphaël ; Ernst, Damien ; Boigelot, Bernard et alin Septièmes Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes (JFPDA 2012) (2012, May)On s’intéresse au problème de généralisation min max dans le cadre de l’apprentissage par renforcement batch et déterministe. Le problème a été originellement introduit par Fonteneau et al. (2011). Dans ... [more ▼]On s’intéresse au problème de généralisation min max dans le cadre de l’apprentissage par renforcement batch et déterministe. Le problème a été originellement introduit par Fonteneau et al. (2011). Dans un premier temps, on montre que le problème est NP-dur. Dans le cas où l’horizon d’optimisation vaut 2, on développe deux schémas de relaxation. Le premier schéma fonctionne en éliminant des contraintes de telle sorte qu’on obtienne un problème soluble en temps polynomial. Le deuxième schéma est une relaxation Lagrangienne conduisant à un problème conique-quadratique. On montre théoriquement et empiriquement que ces deux schémas permettent d’obtenir de meilleurs résultats que ceux proposés par Fonteneau et al. (2011). [less ▲]Detailed reference viewed: 100 (11 ULg) Coordinated primary frequency control among non-synchronous systems connected by a multi-terminal high-voltage direct current gridDai, Jing; Phulpin, Yannick; Sarlette, Alain et alin IET Generation, Transmission & Distribution (2012), 6(2), 99-108The authors consider a power system composed of several non-synchronous AC areas connected by a multiterminal high-voltage direct current (HVDC) grid. In this context, the authors propose a distributed ... [more ▼]The authors consider a power system composed of several non-synchronous AC areas connected by a multiterminal high-voltage direct current (HVDC) grid. In this context, the authors propose a distributed control scheme that modiﬁes the power injections from the different AC areas into the DC grid so as to make the system collectively react to load imbalances. This collective reaction allows each individual AC area to downscale its primary reserves. The scheme is inspired by algorithms for the consensus problem extensively studied by the control theory community. It modiﬁes the power injections based on frequency deviations of the AC areas so as to make them stay close to each other. A stability analysis of the closed-loop system is reported as well as simulation results on a benchmark power system with ﬁve AC areas. These results show that with proper tuning, the control scheme makes the frequency deviations converge rapidly to a common value following a load imbalance in an area. [less ▲]Detailed reference viewed: 77 (4 ULg) Learning to play K-armed bandit problemsMaes, Francis ; Wehenkel, Louis ; Ernst, Damien in Proceedings of the 4th International Conference on Agents and Artificial Intelligence (ICAART 2012) (2012, February)We propose a learning approach to pre-compute K-armed bandit playing policies by exploiting prior information describing the class of problems targeted by the player. Our algorithm ﬁrst samples a set of K ... [more ▼]We propose a learning approach to pre-compute K-armed bandit playing policies by exploiting prior information describing the class of problems targeted by the player. Our algorithm ﬁrst samples a set of K-armed bandit problems from the given prior, and then chooses in a space of candidate policies one that gives the best average performances over these problems. The candidate policies use an index for ranking the arms and pick at each play the arm with the highest index; the index for each arm is computed in the form of a linear combination of features describing the history of plays (e.g., number of draws, average reward, variance of rewards and higher order moments), and an estimation of distribution algorithm is used to determine its optimal parameters in the form of feature weights. We carry out simulations in the case where the prior assumes a ﬁxed number of Bernoulli arms, a ﬁxed horizon, and uniformly distributed parameters of the Bernoulli arms. These simulations show that learned strategies perform very well with respect to several other strategies previously proposed in the literature (UCB1, UCB2, UCB-V, KL-UCB and $\epsilon_n$-GREEDY); they also highlight the robustness of these strategies with respect to wrong prior information. [less ▲]Detailed reference viewed: 175 (19 ULg) Multi-terminal HVDC systems and ancillary servicesErnst, Damien Speech/Talk (2012)Detailed reference viewed: 19 (3 ULg)