8 - Monte Carlo Tree Search

ucla | CS 161 | 2024-02-12 14:25


Table of Contents

Algo

  • random simulation
  • start with root node of value $N$
  • each child node has two characteristics $n_i$ and a value $v_i$

    Steps

Upper Confidence Bound (UCB)

  • select node with highest UCB \(UCB=v_i+c\sqrt{\frac{\ln N}{n_i}}\quad\text{s.t.}\quad v_i=\frac{w_i}{n_i}\)

    Generalization

  • taking to the limit of max children nodes tells us MCTS will approach the full minimax tree

    Probabilistic Nodes

  • introduction of chance nodes between max and min players
  • each nodes $n_i$ is a probability between its sister nodes
  • total probability law applies between a subtree such that probs of all sister nodes under the same parent add up to 1
  • the value we assign is the expectation (weighted sum)
  • probs can be assigned using some model, independence assumption is not required, probs can be conditional