8 - Monte Carlo Tree Search

ucla | CS 161 | 2024-02-12 14:25


Table of Contents

Algo

  • random simulation
  • start with root node of value N
  • each child node has two characteristics ni and a value vi

    Steps

Upper Confidence Bound (UCB)

  • select node with highest UCB UCB=vi+clnNnis.t.vi=wini

    Generalization

  • taking to the limit of max children nodes tells us MCTS will approach the full minimax tree

    Probabilistic Nodes

  • introduction of chance nodes between max and min players
  • each nodes ni is a probability between its sister nodes
  • total probability law applies between a subtree such that probs of all sister nodes under the same parent add up to 1
  • the value we assign is the expectation (weighted sum)
  • probs can be assigned using some model, independence assumption is not required, probs can be conditional