8 - Monte Carlo Tree Search

ucla | CS 161 | 2024-02-12 14:25

Algo

select node with highest UCB $UCB=v_i+c\sqrt{\frac{\ln N}{n_i}}\quad\text{s.t.}\quad v_i=\frac{w_i}{n_i}$
Generalization
taking to the limit of max children nodes tells us MCTS will approach the full minimax tree
Probabilistic Nodes
introduction of chance nodes between max and min players
each nodes $n_i$ is a probability between its sister nodes
total probability law applies between a subtree such that probs of all sister nodes under the same parent add up to 1
the value we assign is the expectation (weighted sum)
probs can be assigned using some model, independence assumption is not required, probs can be conditional