# Monthly Archives: July 2014

## HearthSim — Modeling Unpredictable Outcomes

We have so far employed an AI strategy similar to how modern day chess or go AIs work: working out all possible outcomes of all possible actions that the player can perform and picking out the best series of actions. In games such as chess, the strategy works out well (witness recent computer vs humans chess games) as chess is inherently a deterministic game. However, Hearthstone is not deterministic. There are various cards with random effects that prevents us from simply following the footsteps of successful chess AIs.

There are primarily three ways where randomness comes into play in Hearthstone:

1. Random effect cardsMad Bomber, Arcane Missiles, etc
2. Card draw effectsNovice Engineer, Northshire Cleric, etc
3. SecretsCounterspell, Explosive Trap, etc

The first item is pretty obvious. The second and third effects are problematic since the AI does not (supposed to not) know which cards are drawn or which secrets were played. So, it is not right to just draw a card immediately during the BoardState search tree generation.

So, we will introduce the concept of a StopNode. A StopNode is a BoardState search tree node that triggers a “stop” in the search tree generation. All nodes subsequent to a StopNode must be discarded, and the score of the StopNode must be computed solely from the information available at the StopNode without actually playing out the outcome of the random effect.

Each type of random effect node will have different strategies when it comes to deciding the score for its StopNode.

1. Random effect cards — For random effects node, the strategy should be to work out the outcome of all possible random outcome and pick the expected value of the random outcomes’ scores. In practice, this is almost always impossible because the number of subsequent nodes explodes exponentially. For example, given a full opponent board, there are potentially up to 16.8 million ways in which Avenging Wrath can hit, with potentially millions of subsequent nodes for each Avenging Wrath outcome. In these situations, we will resort to a Monte Carlo simulation of the outcomes.
2. Card draw effects — For card draw effects, the AI StopNode is necessary because the card to be drawn is unknown to the AI, and after the card is drawn, the AI must not go back and pick a sequence of moves that does not involve drawing a card (that would be cheating). The strategy to use here is, at a card draw StopNode, to pretend that a card draw does not happen and continue with the simulation. The score for the card draw node becomes the score of the best possible outcome without the card draw, plus the expected increase in the score due to the cards to be drawn. The expected increase can be computed because the AI knows exactly which cards remain in the deck, even though it doesn’t know which order the cards are in.
3. Secret — The strategy for the AI will be to simulate all possible secrets that the opponents might have and work out the best subsequent moves for each one. Once the moves are computed, the AI will assume the best moves from the most penalizing secret and will start playing that sequence, until the actual secret is triggered. If the AI’s guess turns out to be correct, fine, at least it will be able to make the best out of it. If the AI’s guess turns out to be wrong, great, it can now go and play an even more optimal sequence of moves.

That’s the summary. I will write more details for each type once the implementations are done and tested.

## HearthSim: Divine Shield Modeling — Part 2

This is part 2 of our Divine Shield modeling.

### Setup

Let’s take our Super Basic Deck, with a generic no-good Hero (with no hero abilities), and pit it against an opponent with the same deck with Scarlet Crusader replaced by Magma Rager. We will call the player with Scarlet Crusader Player0, and the opponent Player1.

We take $$w_{\rm ds} = 0$$ as a base case and will compare the performance of our AI as we tweak $$w_{\rm ds}$$.

As a side note… the Scarlet Crusader is a much better card than the Magma Rager under this circumstance (and probably under any other circumstances). Player0’s base win rate is something like 64.8%, compared to 57.1% if Player0 used the same deck as Player0 (i.e., Magma Rager instead of Scarlet Crusader).

### Results

The result of running the simulation with different divine shield weighting looks like this:

Raw Data

Recall from part 1 that our initial guess was that the optimal weighting will be somewhere between 0 and 1. Well, we were close. At least in this situation, the optimal weighting seems to be 1. Let’s try to understand the results in more detail.

##### Case: $$w_{\rm ds} < 0$$

When the weight is negative, the AI thinks that having a divine shield on a minion is a disadvantage, and it will prioritize removing the DS. Because attacking the enemy hero doesn’t remove the DS, the AI will pretty much always attack another minion with the Scarlet Crusader. This strategy turns out to be ok, and it maintains a >60% win rate, but it’s obvious that it’s not optimal.

As a side note, with the weight less than -1, the AI will never put the Scarlet Crusader onto the board because doing so will result in decreasing the score. So, there is a lower limit on $$w_{\rm st}$$.

##### Case: $$w_{\rm ds} = 1$$

With $$w_{\rm ds}$$ set to 1, the AI seems to perform optimally. This is because it ends up making decent trade decisions. At this weight, the divine shield on a Scarlet Crusader is worth 4, while if it is used to attack an enemy minion but fails to kill it, it does 3 damage. So, in the attack, the AI loses 4 score from DS and gains 3 from the enemy minion health going down — a losing proposition. The AI typically won’t make that attack and will go after something else with better value, such as hitting the hero. The simulation suggests that that is indeed the correct thing to do.

##### Case: $$w_{\rm ds} > 1$$

When $$w_{\rm ds} > 1$$, the AI starts to consider the DS more and more valuable. At $$w_{\rm ds} = 1.25$$, the DS is worth 5; at $$w_{\rm ds} = 1.5$$, it’s 6; and so on. This increase makes the AI less eager to attack the another minion with the Scarlet Crusader, and it ends up pretty much always going for the face. Needless to say, always going for the face is a losing strategy (see this post for a demonstration of that), so the win rate plummets.

##### Miscellaneous Results

As suggested in a previous comment, I thought it would be interesting to look at the distribution of the duration of the game; i.e, how many turns does a typical game last given different strategies employed by the AI. In this setup, let’s compare the cases where $$w_{\rm ds} = 0$$, $$w_{\rm ds} = 1$$, and $$w_{\rm ds} = 2$$. Below is the plot of the fraction of games that end at a given turn.

Games played with $$w_{\rm ds} = 2$$ clearly tend to end earlier. This trend makes sense from the strategy point: the AI is being quite aggressive, and aggro games tend to be shorter. It’s also a hint that the AI at such weight isn’t losing close games; rather, it is usually on the receiving end of a beat down.

### Summary

In conclusion… I think this is a fair divine shield model for the AI. DS certainly seems to help make the deck better, and as long as you use it to smack and kill other minions, it provides good value for its cost. You’d certainly want to pick Scarlet Crusader over Magma Rager, and most likely Argent Squire over Murloc Raider and so on.

As always, leave any comments or suggestions on our on our discussion board!

## HearthSim: Divine Shield Modeling — Part 1

Let’s take a look at how we might go about modeling Divine Shield.

Recall that the AI scoring function is given by
$$S = S_{\rm b} + \tilde{S}_{\rm b} + S_{\rm c} + S_{\rm h} + \tilde{S}_{\rm h}$$
(see the original post). In particular, for each minion that one has on the board, the score goes up proportionally to the attack and health value of the minion:
$$S_{\rm b} = \sum_{i} (w_{\rm a} a_{i} + w_{\rm h} h_{i})$$
When a minion has divine shield, we expect it to be valued higher than a regular minion. But, by how much?

The essence of divine shield is that it allows the minion to be used (attack) once for “free.” That free attack does not damage or kill the minion, removing the divine shield instead and leaving a regular minion on the board. This observation suggests that we can think of a divine shield as effectively doubling the score of the minion. So, let’s propose to model each minion’s score as follows: for each minion $$i$$ that has divine shield,
$$S_{\rm b} = (w_{\rm a} a_{i} + w_{\rm h} h_{i} + (a_{i} + h_{i}) * w_{\rm ds})$$
where $$w_{\rm ds}$$ is the divine shield weight. When the divine shield is removed, it goes back to the regular minion score. Note that setting $$w_{\rm ds} = 0$$ means that the AI pretty much ignores divine shield, while setting it to a high number (say, 2) means the AI highly values the divine shield and will try to keep it as much as possible.

In real game situations, we don’t expect divine shield to provide us with exactly twice the value that a minion would have without DS. In fact, there are quite a lot of ways in which the opponent can efficiently remove the DS: silence, any battle cry damage, hit it with a weak or almost dead minion, etc. Thus, we should expect the optimal weighting to be somewhat lower than 1, though by how much is a question we can only answer by running some simulations. Part2 will go into some simulation results.

Continues on part 2.

## HearthSim: Super Basic Deck for Testing

Here’s the first real deck that I’ll be using for testing HearthSim in the next couple of posts.

Goldshire Footman × 2
Murloc Raider × 2
Bloodfen Raptor × 2
Frostwolf Grunt × 2
River Crocolisk × 2
Ironfur Grizzly × 2
Silverback Patriarch × 2
Chillwind Yeti × 2
Oasis Snapjaw × 2
[Sen’jin Shieldmasta] × 2
Booty Bay Bodyguard × 2
Fen Creeper × 2
Boulderfist Ogre × 2
War Golem × 2

Hearthpwn deck list here.

It’s a super basic deck with no special abilities / battle cries, so the AI should be able to handle the deck well. The only untested part is the AI’s divine shield modeling, so that’s the first thing I’ll test and tune using this deck.

You can find the deck as part of “example2” in the examples directory of HearthSim. As usual, head over to HearthSim-dev board for any questions or discussions.

## HearthSim: To Taunt or not to Taunt

Here’s a quick set of results highlighting the Taunt mechanics.

Taunt is a game mechanic where the opponent minions can direct where your attack goes. When there are Taunt minions on the opponents board, your hero and minions can only physically attack Taunt’ed minions. Spell cards not affected by this restriction. But… I’m sure most readers are quite familiar with the mechanics… so on with the simulation and the results.

We take the standard minion only random deck (described in more details here), but with max minion attack of 5 and max minion health of 4. I think this set makes the game more even in terms of the Coin mechanic, but the overall conclusion here seems to be insensitive to this choice. We start with a 0 taunt minion setup, and give Player0 increasing number of Taunt cards to see how that changes the performance of Player0’s deck.

# of Taunts P0 Win P1 Win P0 Win % CP 95% Int.
0 19200 20800 48.0% 47.51% — 48.49%
2 20614 19386 51.54% 51.04% — 52.03%
4 21952 18048 54.88% 54.39% — 55.37%
8 23702 16298 59.26% 58.77% — 59.74%
12 24812 15188 62.03% 61.55% — 62.51%
16 25403 14597 63.51% 63.03% — 63.98%
20 25108 14892 62.77% 62.29% — 63.24%
24 23718 16282 59.30% 58.81% — 59.78%
28 21510 18490 53.78% 53.28% — 54.26%
30 20018 19982 50.05% 49.55% — 50.54%

Plotted, it looks like this:

The crux of Taunt’s value is that it forces your opponent into making unfavorable (to your opponent) trades while you get to roam free and make the optimal plays. So, at first, I expected that the P0 win rate will go up as it has more and more Taunts in his deck. To a certain extent, it does. Up at 16 Taunts, P0 win rate increases by about 15% over the base case, which is a significant increase considering the strong randomness of this game.

Beyond 16 Taunts though, the win rate starts to decline. This at first puzzled me, but my current guess as to why is that, by having too many Taunts, you force the opponents into clearing your board constantly. The opponent is trying to play the game as best it can, so of course it is going to try to clear the board as efficiently as it can. Thus, P0, by the virtue of its Taunts, forces its opponents to play more optimally. Not a good strategy when you are trying to win.

Taunt is an interesting mechanic though. The above result suggests that one can actually go tune the AI to beat Taunt decks. I should definitely look into that.

## HearthSim on github

HearthSim is now on github at github:HearthSim!

The current version has all the code necessary for the previous simulation results. Next up is writing more test cases and implementing more game mechanics (i.e., more cards).