Annina Riedhauser, Viacheslav Snigirev, et al.
CLEO 2023
The results obtained by Pollack and Blair substantially underperform my 1992 TD Learning results. This is shown by directly benchmarking the 1992 TD nets against Pubeval. A plausible hypothesis for this underperformance is that, unlike TD learning, the hillclimbing algorithm fails to capture nonlinear structure inherent in the problem, and despite the presence of hidden units, only obtains a linear approximation to the optimal policy for backgammon. Two lines of evidence supporting this hypothesis are discussed, the first coming from the structure of the Pubeval benchmark program, and the second coming from experiments replicating the Pollack and Blair results. © 1998 Kluwer Academic Publishers.
Annina Riedhauser, Viacheslav Snigirev, et al.
CLEO 2023
David W. Jacobs, Daphna Weinshall, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Arnon Amir, Michael Lindenbaum
IEEE Transactions on Pattern Analysis and Machine Intelligence
Mustansar Fiaz, Mubashir Noman, et al.
IGARSS 2025