Browse ORBi by ORBi project

- Background
- Content
- Benefits and challenges
- Legal aspects
- Functions and services
- Team
- Help and tutorials

Monte Carlo search algorithm discovery for single-player games ; Lupien St-Pierre, David ; Ernst, Damien in IEEE Transactions on Computational Intelligence and AI in Games (2013), 5(3), 201-213 Much current research in AI and games is being devoted to Monte Carlo search (MCS) algorithms. While the quest for a single uniﬁed MCS algorithm that would perform well on all problems is of major ... [more ▼] Much current research in AI and games is being devoted to Monte Carlo search (MCS) algorithms. While the quest for a single uniﬁed MCS algorithm that would perform well on all problems is of major interest for AI, practitioners often know in advance the problem they want to solve, and spend plenty of time exploiting this knowledge to customize their MCS algorithm in a problem-driven way. We propose an MCS algorithm discovery scheme to perform this in an automatic and reproducible way. We ﬁrst introduce a grammar over MCS algorithms that enables inducing a rich space of candidate algorithms. Afterwards, we search in this space for the algorithm that performs best on average for a given distribution of training problems. We rely on multi-armed bandits to approximately solve this optimization problem. The experiments, generated on three different domains, show that our approach enables discovering algorithms that outperform several well-known MCS algorithms such as Upper Conﬁdence bounds applied to Trees and Nested Monte Carlo search. We also show that the discovered algorithms are generally quite robust with respect to changes in the distribution over the training problems. [less ▲] Detailed reference viewed: 313 (29 ULg)Contributions to Monte Carlo Search Lupien St-Pierre, David Doctoral thesis (2013) This research is motivated by improving decision making under uncertainty and in particular for games and symbolic regression. The present dissertation gathers research contributions in the field of Monte ... [more ▼] This research is motivated by improving decision making under uncertainty and in particular for games and symbolic regression. The present dissertation gathers research contributions in the field of Monte Carlo Search. These contributions are focused around the selection, the simulation and the recommendation policies. Moreover, we develop a methodology to automatically generate an MCS algorithm for a given problem. For the selection policy, in most of the bandit literature, it is assumed that there is no structure or similarities between arms. Thus each arm is independent from one another. In several instances however, arms can be closely related. We show both theoretically and empirically, that a significant improvement over the state-of-the-art selection policies is possible. For the contribution on simulation policy, we focus on the symbolic regression problem and ponder on how to consistently generate different expressions by changing the probability to draw each symbol. We formalize the situation into an optimization problem and try different approaches. We show a clear improvement in the sampling process for any length. We further test the best approach by embedding it into a MCS algorithm and it still shows an improvement. For the contribution on recommendation policy, we study the most common in combination with selection policies. A good recommendation policy is a policy that works well with a given selection policy. We show that there is a trend that seems to favor a robust recommendation policy over a riskier one. We also present a contribution where we automatically generate several MCS algorithms from a list of core components upon which most MCS algorithms are built upon and compare them to generic algorithms. The results show that it often enables discovering new variants of MCS that significantly outperform generic MCS algorithms. [less ▲] Detailed reference viewed: 60 (11 ULg)Comparison of Different Selection Strategies in Monte-Carlo Tree Search for the Game of Tron ; Lupien St-Pierre, David ; Maes, Francis et al in IEEE Conference on Computational and Intelligence in Games 2012 (2012) Monte-Carlo Tree Search (MCTS) techniques are essentially known for their performance on turn-based games, such as Go, for which players have considerable time for choosing their moves. In this paper, we ... [more ▼] Monte-Carlo Tree Search (MCTS) techniques are essentially known for their performance on turn-based games, such as Go, for which players have considerable time for choosing their moves. In this paper, we apply MCTS to the game of Tron, a simultaneous real-time two-player game. The fact that players have to react fast and that moves occur simultaneously creates an unusual setting for MCTS, in which classical selection policies such as UCB1 may be suboptimal. In this paper, we perform an empirical comparison of a wide range of selection policies for MCTS applied to Tron, with both deterministic policies (UCB1, UCB1-Tuned, UCB-V, UCBMinimal, OMC-Deterministic, MOSS) and stochastic policies (Epsilon-greedy, EXP3, Thompson Sampling, OMC-Stochastic, PBBM). From the experiments, we observe that UCB1-Tuned has the best behavior shortly followed by UCB1. Even if UCB-Minimal is ranked fourth, this is a remarkable result for this recently introduced selection policy found through automatic discovery of good policies on generic multi-armed bandit problems. We also show that deterministic policies perform better than stochastic ones for this problem. [less ▲] Detailed reference viewed: 43 (5 ULg)Online Sparse Bandit for Card Games Lupien St-Pierre, David ; Louveaux, Quentin ; in Advance in Computer Games (2011) Finding an approximation of a Nash equilibria in matrix games is an important topic that reaches beyond the strict application to matrix games. A bandit algorithm commonly used to approximate a Nash ... [more ▼] Finding an approximation of a Nash equilibria in matrix games is an important topic that reaches beyond the strict application to matrix games. A bandit algorithm commonly used to approximate a Nash equilibrium is EXP3. However, the solution to many problems is often sparse, yet EXP3 inherently fails to exploit this property. To the knowledge of the authors, there exist only an offline truncation to tackle such issue. In this paper, we propose a variation of EXP3 to exploit the fact that solution is sparse by dynamically removing arms; the resulting algorithm empirically performs better than previous versions. We apply the resulting algorithm to a MCTS program for the Urban Rivals card game. [less ▲] Detailed reference viewed: 44 (13 ULg)A Selective Move Generator for the Game Axis and Allies Lupien St-Pierre, David ; ; in Computational Intelligence and Games (2010) We consider the move generation in a modern board game where the set of all the possible moves is too large to be generated. The idea is to provide a set of simple abstract tactics that would allow enough ... [more ▼] We consider the move generation in a modern board game where the set of all the possible moves is too large to be generated. The idea is to provide a set of simple abstract tactics that would allow enough combinations to provide strong opposition. The reduced search space is then traversed using the alpha beta search. We also propose a technique that allows us to remove the stochasticity from the search space. The model was tested in a game called Axis and Allies: a modern, turn-based, perfect information, non-deterministic, strategy board game. We first show that a tree search technique based on a restrained set of moves can beat the actual scripted AI engine --- E.Z. Fodder. We can conclude from the experiments that searching deeper generates complex maneuvers which in turn significantly increase the likelihood of victory. [less ▲] Detailed reference viewed: 28 (1 ULg) |
||