Mustansar Fiaz, Mubashir Noman, et al.
IGARSS 2025
TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results. Starting from random initial play, TD-Gammon's self-teaching methodology results in a surprisingly strong program: without lookahead, its positional judgement rivals that of human experts, and when combined with shallow lookahead, it reaches a level of play that surpasses even the best human players. The success of TD-Gammon has also been replicated by several other programmers; at least two other neural net programs also appear to be capable of superhuman play. Previous papers on TD-Gammon have focused on developing a scientific understanding of its reinforcement learning methodology. This paper views machine learning as a tool in a programmer's toolkit, and considers how it can be combined with other programming techniques to achieve and surpass world-class backgammon play. Particular emphasis is placed on programming shallow-depth search algorithms, and on TD-Gammon's doubling algorithm, which is described in print here for the first time. © 2002 Elsevier Science B.V. All rights reserved.
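The self-teaching methodology described above can be illustrated with a minimal sketch of a TD(λ) value update over one self-play episode. This is a hypothetical, simplified illustration: it uses a linear value function where TD-Gammon used a multilayer neural network, and the function names, feature encoding, and parameter values (`alpha`, `lam`) are assumptions, not details from the paper.

```python
def value(weights, features):
    """Linear value estimate V(s) = w . x(s).

    TD-Gammon used a neural network here; a linear model keeps the
    sketch self-contained while showing the same update rule.
    """
    return sum(w * x for w, x in zip(weights, features))

def td_lambda_episode(weights, states, outcome, alpha=0.1, lam=0.7):
    """Apply TD(lambda) updates over one self-play episode.

    states:  list of feature vectors for the positions visited, in order.
    outcome: final reward z (e.g. 1.0 for a win, 0.0 for a loss).

    Eligibility traces propagate each temporal-difference error back
    to earlier positions, which is what lets the program learn from
    the final result of a game it played against itself.
    """
    trace = [0.0] * len(weights)
    for t in range(len(states)):
        x = states[t]
        v_t = value(weights, x)
        # Target is the next position's value, or the game outcome at the end.
        v_next = outcome if t == len(states) - 1 else value(weights, states[t + 1])
        delta = v_next - v_t  # TD error for this step
        # Decay old traces and add the current gradient (x itself for a linear V).
        trace = [lam * e + xi for e, xi in zip(trace, x)]
        # Move each weight along its eligibility trace, scaled by the TD error.
        weights = [w + alpha * delta * e for w, e in zip(weights, trace)]
    return weights
```

Starting from random weights (random initial play), repeatedly generating episodes by self-play and calling `td_lambda_episode` on each is the learning loop the abstract refers to; the shallow-lookahead play it describes would then choose moves by evaluating `value` on the positions reachable after each legal move.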