Combinatorial Game Theory Meets Deep Learning: Efficient Endgame Analysis
Stanisław Frejlak / December 18, 2024
How to cite this article:
Frejlak, S. (2024). Combinatorial Game Theory Meets Deep Learning: Efficient Endgame Analysis. Journal of Go Studies, 18(2), 17-50. doi: 10.62578/504852
Abstract
The endgame stage of Go presents a unique challenge for scientific research. Unlike the earlier stages of the game, in the endgame the key to successful analysis is decomposing the board into smaller, independent local positions. Go players typically analyze these positions separately and prioritize moves based on their value. In this paper, I introduce a novel program that automates this decomposition-based analysis for the endgame stage of Go.
AlphaZero revolutionized Computer Go by applying a generic move-selection mechanism based on neural network judgments and the MCTS search algorithm. However, it does not specifically address the complexity of the endgame in the manner described above. By leveraging decomposition-based analysis, my program instead reaches endgame decisions with relatively little computation. Additionally, it offers insights for Go practitioners by providing accurate move value evaluations.
Notable prior work on automated endgame analysis was done by Martin Müller (1995). His program Explorer checked all possible variations in every undecided position and aggregated the results using an algorithm inspired by Combinatorial Game Theory (CGT). However, due to the exponential growth of the number of variations, Explorer's application was limited to small, tightly bounded local positions.
In contrast, my program leverages a neural network to predict optimal local moves, dramatically reducing the number of variations that need to be explored. Provided that the neural network’s predictions are correct, the program can accurately evaluate move values by considering relatively few variations, just like human Go experts do. Thanks to this approach, it is the first program capable of analyzing large, unbounded local positions, which are commonly encountered in real games.
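For intuition, the CGT notions of count and temperature that such an analysis estimates can be sketched for the simplest case: a gote position whose two follow-ups are settled scores. The tree encoding and function name below are illustrative only, not the paper's implementation, and the sketch deliberately ignores ko, sente, and proper thermography, which the full CGT treatment requires:

```python
def count_and_temperature(position):
    """Naive count (expected local score) and temperature (move value)
    for a toy local-position tree.

    A position is either a number (settled local score, in points for
    Black) or a pair (result after Black's best move, result after
    White's best move).
    """
    if isinstance(position, (int, float)):
        return float(position), 0.0  # settled: no local move has value
    after_black, after_white = position
    black_score, _ = count_and_temperature(after_black)
    white_score, _ = count_and_temperature(after_white)
    count = (black_score + white_score) / 2        # average of the two outcomes
    temperature = (black_score - white_score) / 2  # gain from moving first
    return count, temperature

# A simple gote: Black's move secures 5 points, White's move leaves 1.
print(count_and_temperature((5, 1)))  # -> (3.0, 2.0)
```

In this toy model, a player should play in whichever local position currently has the highest temperature, which is the move prioritization described above.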
The neural network was fine-tuned from a pre-trained AlphaZero reimplementation on the task of optimal local move prediction. Training data was gathered from KataGo self-play games, utilizing KataGo’s network to perform board decomposition.
Keywords: AlphaZero, Fine-Tuning, Combinatorial Game Theory, Temperature, Move Values
References
Berlekamp, E. (1996). The economist's view of combinatorial games. Games of No Chance: 365-405.
Berlekamp, E. and Wolfe, D. (1994). Mathematical Go: Chilling Gets the Last Point. doi: 10.1201/9781439863558
Conway, J. (1976). On Numbers and Games.
Frejlak, S. (2020). KataGo's yose: Go endgame theory and deep residual networks. https://github.com/siasio/EndgameBot/blob/main/1000-LIC-MAT-297313.pdf
Frejlak, S. (2022). Video course "Endgame for Nerds". Go Magic. https://gomagic.org/courses/endgame-for-nerds/
Frejlak, S. (2024). Deep learning and combinatorial game theory: evaluating temperatures in the game of Go. https://github.com/siasio/EndgameBot/blob/main/Frejlak-masters_thesis-final.pdf
Müller, M. (1995). Computer Go as a sum of local games: an application of combinatorial game theory.
Nguyen, T. (2022). AlphaZero in JAX using the DeepMind mctx library. https://github.com/NTT123/a0-jax
Lew, Ł. and Frejlak, S. (2024). Finding temperatures through iterative pseudo-stop search.
Siegel, A. (2013). Combinatorial Game Theory. doi: 10.1090/gsm/146
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., Bolton, A. et al. (2017). Mastering the game of Go without human knowledge. Nature 550(7676): 354-359. doi: 10.1038/nature24270
Törmänen, A. (2019). Rational Endgame.
Wolfe, D. (2002). Go endgames are PSPACE-hard.
Wu, D. (2020). Accelerating self-play learning in Go.