Statistical mechanics approach to a reinforcement learning model with memory

We introduce a two-player model of reinforcement learning with memory. Past actions of an iterated game are stored in a memory and used to determine a player's next action. To examine the behaviour of the model, some approximate methods are used and compared with numerical simulations and the exact master equation. When the length of the players' memory increases to infinity, the model undergoes an absorbing-state phase transition. The performance of the examined strategies is checked in the prisoner's dilemma game. It turns out that it is advantageous to have a large memory in symmetric games, but it is better to have a short memory in asymmetric ones. (C) 2009 Elsevier B.V. All rights reserved.

Disciplines :

Physics
Computer science

Author, co-author :

Lipowski, A.

Gontarek, K.

Ausloos, Marcel ; Université de Liège - ULiège > Département de physique > Département de physique

Language :

English

Title :

Statistical mechanics approach to a reinforcement learning model with memory

Publication date :

2009

Journal title :

Physica A. Statistical Mechanics and its Applications

ISSN :

0378-4371

eISSN :

1873-2119

Publisher :

Elsevier Science, Amsterdam, Netherlands

Volume :

388

Issue :

Pages :

1849-1856

Peer reviewed :

Peer Reviewed verified by ORBi

Available on ORBi :

since 25 September 2012

