HearthSim — Modeling Unpredictable Outcomes

By | July 23, 2014

We have so far employed an AI strategy similar to how modern day chess or go AIs work: working out all possible outcomes of all possible actions that the player can perform and picking out the best series of actions. In games such as chess, the strategy works out well (witness recent computer vs humans chess games) as chess is inherently a deterministic game. However, Hearthstone is not deterministic. There are various cards with random effects that prevents us from simply following the footsteps of successful chess AIs.

There are primarily three ways where randomness comes into play in Hearthstone:

  1. Random effect cardsMad Bomber, Arcane Missiles, etc
  2. Card draw effectsNovice Engineer, Northshire Cleric, etc
  3. SecretsCounterspell, Explosive Trap, etc

The first item is pretty obvious. The second and third effects are problematic since the AI does not (supposed to not) know which cards are drawn or which secrets were played. So, it is not right to just draw a card immediately during the BoardState search tree generation.

So, we will introduce the concept of a StopNode. A StopNode is a BoardState search tree node that triggers a “stop” in the search tree generation. All nodes subsequent to a StopNode must be discarded, and the score of the StopNode must be computed solely from the information available at the StopNode without actually playing out the outcome of the random effect.

Each type of random effect node will have different strategies when it comes to deciding the score for its StopNode.

  1. Random effect cards — For random effects node, the strategy should be to work out the outcome of all possible random outcome and pick the expected value of the random outcomes’ scores. In practice, this is almost always impossible because the number of subsequent nodes explodes exponentially. For example, given a full opponent board, there are potentially up to 16.8 million ways in which Avenging Wrath can hit, with potentially millions of subsequent nodes for each Avenging Wrath outcome. In these situations, we will resort to a Monte Carlo simulation of the outcomes.
  2. Card draw effects — For card draw effects, the AI StopNode is necessary because the card to be drawn is unknown to the AI, and after the card is drawn, the AI must not go back and pick a sequence of moves that does not involve drawing a card (that would be cheating). The strategy to use here is, at a card draw StopNode, to pretend that a card draw does not happen and continue with the simulation. The score for the card draw node becomes the score of the best possible outcome without the card draw, plus the expected increase in the score due to the cards to be drawn. The expected increase can be computed because the AI knows exactly which cards remain in the deck, even though it doesn’t know which order the cards are in.
  3. Secret — The strategy for the AI will be to simulate all possible secrets that the opponents might have and work out the best subsequent moves for each one. Once the moves are computed, the AI will assume the best moves from the most penalizing secret and will start playing that sequence, until the actual secret is triggered. If the AI’s guess turns out to be correct, fine, at least it will be able to make the best out of it. If the AI’s guess turns out to be wrong, great, it can now go and play an even more optimal sequence of moves.

That’s the summary. I will write more details for each type once the implementations are done and tested.

2 thoughts on “HearthSim — Modeling Unpredictable Outcomes

  1. Markus Persson

    I guess you could handle card draw effects as secrets, no? Simulate all possible card that can be drawn, assume the best moves from the most “penalizing” card and start playing that sequence. If the AI’s guess turns out to be correct, fine. If wrong, great.

    This will of course open up a lot more outcomes…

    1. oyachai Post author

      In theory, I think that would be great. Like you said though, it increases the possible number of outcomes greatly (if there are 20 cards left in the deck, then the number of outcomes is increased by a factor of 20), so I don’t know it that is practical. For secrets, it’s (hopefully) manageable as there are only a few (5-ish) secrets that can be played.

      Also, the case where no card is drawn is basically the most “penalizing” outcome anyways. So, starting to play that sequence is not too different from playing assuming the “worst” card is drawn.


Leave a Reply

Your email address will not be published. Required fields are marked *