Author Archives: oyachai

HearthSim — Modeling Unpredictable Outcomes

We have so far employed an AI strategy similar to how modern chess or Go AIs work: working out all possible outcomes of all possible actions that the player can perform and picking the best series of actions. In games such as chess, the strategy works well (witness recent computer vs. human chess games) because chess is inherently a deterministic game. However, Hearthstone is not deterministic. There are various cards with random effects that prevent us from simply following in the footsteps of successful chess AIs.

There are primarily three ways in which randomness comes into play in Hearthstone:

  1. Random effect cards: Mad Bomber, Arcane Missiles, etc.
  2. Card draw effects: Novice Engineer, Northshire Cleric, etc.
  3. Secrets: Counterspell, Explosive Trap, etc.

The first item is pretty obvious. The second and third are problematic because the AI does not (or at least is not supposed to) know which cards will be drawn or which secrets were played. So, it is not right to just draw a card immediately during the BoardState search tree generation.

So, we will introduce the concept of a StopNode. A StopNode is a BoardState search tree node that triggers a “stop” in the search tree generation. All nodes subsequent to a StopNode must be discarded, and the score of the StopNode must be computed solely from the information available at the StopNode without actually playing out the outcome of the random effect.

Each type of random effect node will have different strategies when it comes to deciding the score for its StopNode.

  1. Random effect cards: For random effect nodes, the strategy should be to work out every possible random outcome and take the expected value of the outcomes’ scores. In practice, this is almost always impossible because the number of subsequent nodes explodes exponentially. For example, given a full opponent board, there are potentially up to 16.8 million ways (8 missiles, each with up to 8 possible targets, so 8^8) in which Avenging Wrath can hit, with potentially millions of subsequent nodes for each Avenging Wrath outcome. In these situations, we will resort to a Monte Carlo simulation of the outcomes.
  2. Card draw effects: For card draw effects, a StopNode is necessary because the card to be drawn is unknown to the AI, and after the card is drawn, the AI must not go back and pick a sequence of moves that does not involve drawing a card (that would be cheating). The strategy to use here is, at a card draw StopNode, to pretend that the card draw does not happen and continue with the simulation. The score for the card draw node becomes the score of the best possible outcome without the card draw, plus the expected increase in the score due to the cards to be drawn. The expected increase can be computed because the AI knows exactly which cards remain in the deck, even though it doesn’t know which order the cards are in.
  3. Secrets: The strategy for the AI will be to simulate all possible secrets that the opponent might have and work out the best subsequent moves for each one. Once the moves are computed, the AI will assume the best moves against the most penalizing secret and will start playing that sequence, until the actual secret is triggered. If the AI’s guess turns out to be correct, fine, at least it will be able to make the best of it. If the AI’s guess turns out to be wrong, great, it can now go and play an even better sequence of moves.
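To make the Monte Carlo idea for random effect cards concrete, here is a minimal Python sketch. It is not HearthSim code: the function names, the toy enemy board score (attack plus health, all weights 1), and the rule that each missile picks uniformly among living enemy minions are assumptions made for illustration only.

```python
import random


def count_hit_sequences(missiles, targets):
    """Upper bound on ordered hit sequences when no target dies early."""
    return targets ** missiles


def fire_missiles(minions, missiles, rng):
    """Deal `missiles` one-damage hits to random living minions.

    `minions` is a list of (attack, health) tuples. Returns the survivors.
    """
    healths = [health for _, health in minions]
    for _ in range(missiles):
        alive = [i for i, health in enumerate(healths) if health > 0]
        if not alive:
            break
        healths[rng.choice(alive)] -= 1
    return [(attack, health)
            for (attack, _), health in zip(minions, healths) if health > 0]


def enemy_board_score(minions):
    """Toy score: enemy stats count against us (all weights assumed to be 1)."""
    return -sum(attack + health for attack, health in minions)


def monte_carlo_score(minions, missiles, samples=2000, seed=0):
    """Estimate the expected score of the random effect by sampling outcomes."""
    rng = random.Random(seed)
    total = sum(enemy_board_score(fire_missiles(minions, missiles, rng))
                for _ in range(samples))
    return total / samples
```

For a small board such as a 3/1 and a 1/3 minion hit by three missiles, the outcomes can be enumerated by hand (expected score -2.25 under the assumptions above), which gives a convenient sanity check for the sampled estimate before trusting it on boards that are too large to enumerate.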

That’s the summary. I will write more details for each type once the implementations are done and tested.

HearthSim: Divine Shield Modeling — Part 2

This is part 2 of our Divine Shield modeling.

Setup

Let’s take our Super Basic Deck, with a generic no-good Hero (with no hero abilities), and pit it against an opponent with the same deck with Scarlet Crusader replaced by Magma Rager. We will call the player with Scarlet Crusader Player0, and the opponent Player1.

We take \(w_{\rm ds} = 0\) as a base case and will compare the performance of our AI as we tweak \(w_{\rm ds}\).

As a side note… the Scarlet Crusader is a much better card than the Magma Rager under this circumstance (and probably under any other circumstances). Player0’s base win rate is something like 64.8%, compared to 57.1% if Player0 used the same deck as Player1 (i.e., Magma Rager instead of Scarlet Crusader).

Results

The result of running the simulation with different divine shield weighting looks like this:

Raw Data

Recall from part 1 that our initial guess was that the optimal weighting will be somewhere between 0 and 1. Well, we were close. At least in this situation, the optimal weighting seems to be 1. Let’s try to understand the results in more detail.

Case: \(w_{\rm ds} < 0\)

When the weight is negative, the AI thinks that having a divine shield on a minion is a disadvantage, and it will prioritize removing the DS. Because attacking the enemy hero doesn’t remove the DS, the AI will pretty much always attack another minion with the Scarlet Crusader. This strategy turns out to be ok, and it maintains a >60% win rate, but it’s obvious that it’s not optimal.

As a side note, with the weight less than -1, the AI will never put the Scarlet Crusader onto the board because doing so would decrease the score. So, there is a lower limit on \(w_{\rm ds}\).

Case: \(w_{\rm ds} = 1\)

With \(w_{\rm ds}\) set to 1, the AI seems to perform optimally. This is because it ends up making decent trade decisions. At this weight, the divine shield on a Scarlet Crusader is worth 4, while if it is used to attack an enemy minion but fails to kill it, it does 3 damage. So, in the attack, the AI loses 4 score from DS and gains 3 from the enemy minion health going down — a losing proposition. The AI typically won’t make that attack and will go after something else with better value, such as hitting the hero. The simulation suggests that that is indeed the correct thing to do.

Case: \(w_{\rm ds} > 1\)

When \(w_{\rm ds} > 1\), the AI starts to consider the DS more and more valuable. At \(w_{\rm ds} = 1.25\), the DS is worth 5; at \(w_{\rm ds} = 1.5\), it’s 6; and so on. This increase makes the AI less eager to attack another minion with the Scarlet Crusader, and it ends up pretty much always going for the face. Needless to say, always going for the face is a losing strategy (see this post for a demonstration of that), so the win rate plummets.
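The trade arithmetic in these cases can be written out as a small Python sketch. The function names are hypothetical, and the sketch assumes the Part 1 divine shield model with an enemy health weight of 1 and an attack that does not kill the enemy minion.

```python
def divine_shield_value(attack, health, w_ds):
    """Extra score a minion gets for having divine shield: (a + h) * w_ds."""
    return (attack + health) * w_ds


def shielded_attack_delta(attack, health, w_ds, enemy_w_h=1.0):
    """Score change when a shielded minion hits an enemy minion without killing it.

    The AI gains the enemy health it removes and loses the shield bonus.
    """
    gained = enemy_w_h * attack
    lost = divine_shield_value(attack, health, w_ds)
    return gained - lost


# Scarlet Crusader is a 3/1 minion with divine shield.
for w_ds in (0.0, 0.5, 1.0, 1.25, 1.5):
    print(w_ds, shielded_attack_delta(3, 1, w_ds))
```

Under these assumptions the non-lethal attack stops looking attractive once the shield bonus exceeds the 3 damage dealt, which for a Scarlet Crusader happens above \(w_{\rm ds} = 0.75\). Attacks that kill the target gain the enemy minion's full value and are judged separately.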

Miscellaneous Results

As suggested in a previous comment, I thought it would be interesting to look at the distribution of the duration of the game; i.e., how many turns a typical game lasts given different strategies employed by the AI. In this setup, let’s compare the cases where \(w_{\rm ds} = 0\), \(w_{\rm ds} = 1\), and \(w_{\rm ds} = 2\). Below is the plot of the fraction of games that end at a given turn.

Games played with \(w_{\rm ds} = 2\) clearly tend to end earlier. This trend makes sense from the strategy point: the AI is being quite aggressive, and aggro games tend to be shorter. It’s also a hint that the AI at such weight isn’t losing close games; rather, it is usually on the receiving end of a beat down.

Summary

In conclusion… I think this is a fair divine shield model for the AI. DS certainly seems to help make the deck better, and as long as you use it to smack and kill other minions, it provides good value for its cost. You’d certainly want to pick Scarlet Crusader over Magma Rager, and most likely Argent Squire over Murloc Raider and so on.

As always, leave any comments or suggestions on our discussion board!

HearthSim: Divine Shield Modeling — Part 1

Let’s take a look at how we might go about modeling Divine Shield.

Recall that the AI scoring function is given by
$$S = S_{\rm b} + \tilde{S}_{\rm b} + S_{\rm c} + S_{\rm h} + \tilde{S}_{\rm h} $$
(see the original post). In particular, for each minion that one has on the board, the score goes up proportionally to the attack and health value of the minion:
$$S_{\rm b} = \sum_{i} (w_{\rm a} a_{i} + w_{\rm h} h_{i})$$
When a minion has divine shield, we expect it to be valued higher than a regular minion. But, by how much?

The essence of divine shield is that it allows the minion to be used (attack) once for “free.” That free attack does not damage or kill the minion, removing the divine shield instead and leaving a regular minion on the board. This observation suggests that we can think of a divine shield as effectively doubling the score of the minion. So, let’s propose to model each minion’s score as follows: for each minion \(i\) that has divine shield,
$$S_{\rm b} = (w_{\rm a} a_{i} + w_{\rm h} h_{i} + (a_{i} + h_{i}) * w_{\rm ds})$$
where \(w_{\rm ds}\) is the divine shield weight. When the divine shield is removed, it goes back to the regular minion score. Note that setting \(w_{\rm ds} = 0\) means that the AI pretty much ignores divine shield, while setting it to a high number (say, 2) means the AI highly values the divine shield and will try to keep it as much as possible.
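As a minimal sketch of the proposed model (hypothetical names, not the actual HearthSim classes), the per-minion score with the divine shield term could look like this in Python:

```python
from dataclasses import dataclass


@dataclass
class Minion:
    attack: int
    health: int
    divine_shield: bool = False


def minion_score(minion, w_a=1.0, w_h=1.0, w_ds=1.0):
    """Regular minion score plus the (a + h) * w_ds bonus while shielded."""
    base = w_a * minion.attack + w_h * minion.health
    bonus = (minion.attack + minion.health) * w_ds if minion.divine_shield else 0.0
    return base + bonus


def board_score(minions, w_a=1.0, w_h=1.0, w_ds=1.0):
    """Friendly board score: sum of the per-minion scores."""
    return sum(minion_score(m, w_a, w_h, w_ds) for m in minions)
```

With all weights set to 1, a shielded 3/1 minion scores 8, exactly twice its unshielded score of 4, which is the "doubling" intuition described above. Setting `w_ds=0` recovers the regular board score.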

In real game situations, we don’t expect divine shield to provide us with exactly twice the value that a minion would have without DS. In fact, there are quite a lot of ways in which the opponent can efficiently remove the DS: silence, any battle cry damage, hitting it with a weak or almost dead minion, etc. Thus, we should expect the optimal weighting to be somewhat lower than 1, though by how much is a question we can only answer by running some simulations. Part 2 will go into some simulation results.

Continues on part 2.

HearthSim: Super Basic Deck for Testing

Here’s the first real deck that I’ll be using for testing HearthSim in the next couple of posts.

Goldshire Footman × 2
Murloc Raider × 2
Bloodfen Raptor × 2
Frostwolf Grunt × 2
River Crocolisk × 2
Ironfur Grizzly × 2
Scarlet Crusader × 2
Silverback Patriarch × 2
Chillwind Yeti × 2
Oasis Snapjaw × 2
Sen’jin Shieldmasta × 2
Booty Bay Bodyguard × 2
Fen Creeper × 2
Boulderfist Ogre × 2
War Golem × 2

Hearthpwn deck list here.

It’s a super basic deck with no special abilities / battle cries, so the AI should be able to handle the deck well. The only untested part is the AI’s divine shield modeling, so that’s the first thing I’ll test and tune using this deck.

You can find the deck as part of “example2” in the examples directory of HearthSim. As usual, head over to HearthSim-dev board for any questions or discussions.

HearthSim: To Taunt or not to Taunt

Here’s a quick set of results highlighting the Taunt mechanics.

Taunt is a game mechanic where the opponent’s minions can direct where your attack goes. When there are Taunt minions on the opponent’s board, your hero and minions can only physically attack the Taunt minions. Spell cards are not affected by this restriction. But… I’m sure most readers are quite familiar with the mechanics… so on with the simulation and the results.

We take the standard minion only random deck (described in more detail here), but with max minion attack of 5 and max minion health of 4. I think this set makes the game more even in terms of the Coin mechanic, but the overall conclusion here seems to be insensitive to this choice. We start with a 0 taunt minion setup, and give Player0 an increasing number of Taunt cards to see how that changes the performance of Player0’s deck.

| # of Taunts | P0 Win | P1 Win | P0 Win % | CP 95% Int. |
|---|---|---|---|---|
| 0 | 19200 | 20800 | 48.0% | 47.51% – 48.49% |
| 2 | 20614 | 19386 | 51.54% | 51.04% – 52.03% |
| 4 | 21952 | 18048 | 54.88% | 54.39% – 55.37% |
| 8 | 23702 | 16298 | 59.26% | 58.77% – 59.74% |
| 12 | 24812 | 15188 | 62.03% | 61.55% – 62.51% |
| 16 | 25403 | 14597 | 63.51% | 63.03% – 63.98% |
| 20 | 25108 | 14892 | 62.77% | 62.29% – 63.24% |
| 24 | 23718 | 16282 | 59.30% | 58.81% – 59.78% |
| 28 | 21510 | 18490 | 53.78% | 53.28% – 54.26% |
| 30 | 20018 | 19982 | 50.05% | 49.55% – 50.54% |
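The "CP 95% Int." column is presumably the Clopper-Pearson interval that a later table names explicitly. For readers who want to reproduce such intervals from the win counts, here is a minimal standard-library Python sketch. The function names are my own, and the bisection on the binomial tail is just one straightforward way to compute the interval, not necessarily how the numbers in the table were produced.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms evaluated in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def _bisect_increasing(f, lo=0.0, hi=1.0, iterations=60):
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(wins, games, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a win probability."""
    if wins == 0:
        lower = 0.0
    else:
        # Smallest p for which seeing >= wins is not too unlikely.
        lower = _bisect_increasing(
            lambda p: (1.0 - binom_cdf(wins - 1, games, p)) - alpha / 2.0)
    if wins == games:
        upper = 1.0
    else:
        # Largest p for which seeing <= wins is not too unlikely.
        upper = _bisect_increasing(
            lambda p: alpha / 2.0 - binom_cdf(wins, games, p))
    return lower, upper
```

Calling `clopper_pearson(19200, 40000)` takes a few seconds in pure Python, and its result can be compared against the first row of the table.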

Plotted, it looks like this:

The crux of Taunt’s value is that it forces your opponent into making unfavorable (to your opponent) trades while you get to roam free and make the optimal plays. So, at first, I expected that the P0 win rate would go up as it has more and more Taunts in its deck. To a certain extent, it does. At 16 Taunts, the P0 win rate is about 15 percentage points higher than in the base case, which is a significant increase considering the strong randomness of this game.

Beyond 16 Taunts though, the win rate starts to decline. This at first puzzled me, but my current guess as to why is that, by having too many Taunts, you force the opponents into clearing your board constantly. The opponent is trying to play the game as best it can, so of course it is going to try to clear the board as efficiently as it can. Thus, P0, by the virtue of its Taunts, forces its opponents to play more optimally. Not a good strategy when you are trying to win.

Taunt is an interesting mechanic though. The above result suggests that one can actually go tune the AI to beat Taunt decks. I should definitely look into that.

HearthSim on github

HearthSim is now on github at github:HearthSim!

The current version has all the code necessary for the previous simulation results. Next up is writing more test cases and implementing more game mechanics (i.e., more cards).

HearthSim — Direct damage spells

Let’s take a look at the effect of another Hearthstone card mechanic: direct damage spells.

We are talking about cards like Holy Smite and Fireball, the spells that deal a set amount of damage to a minion or a hero. These cards are quite popular in many decks as they provide great utility and flexibility, though some players shy away from them because they think that the spells are too much of tempo disruption.

To model spells like this, we will have to modify the “Hand Score” term of the AI score function. Recall that the hand score is given by
$$S_{\rm c} = w_m \sum_{k} m_{k}$$
where \(m_k\) is the mana cost of card \(k\). In the case of a direct damage spell though, it is not clear if the player should use it at the first opportunity or if the player should hold it and use it to get maximum value out of it. It is a matter of balancing the opportunity cost of not using the spell right away versus holding it for value. So, we modify the hand score to
$$S_{\rm c} = w_m \sum_{\rm minions} m_{k} + \sum_{\rm dds}( w_{\rm dds} a_{l} + c_{\rm dds}) $$
where the subscript “dds” stands for “direct damage spell,” \(a_{l}\) is the attack value (damage amount) of the spell, and where we introduce two new parameters, \(w_{\rm dds}\) and \(c_{\rm dds}\).

The new parameters \(w_{\rm dds}\) and \(c_{\rm dds}\), like most other model parameters, control how aggressive or conservative the AI’s spell usage is. In this case, the higher the weights are, the less aggressive the AI is when it comes to using the spell. This is because the hand score reflects how valuable the spell card is to have in your hand.

Naturally, there is an optimal balance between aggressiveness and conservativeness. To see this, let’s do an experiment using our standard setup:

  • The simulator model described here
  • Both players have no hero ability
  • Player 0 and player 1 decks consisting of random sets of minions with attack value between 1 and 4, health value between 1 and 3, and mana cost equal to the sum of attack and health, divided by 2.
  • Player 0 goes first, gets 3 cards.
  • Player 1 goes second, gets 4 cards.
  • Both players use the optimized AI, with \(w_{\rm a} = w_{\rm h} =0.9\) and \(\tilde{w}_{\rm a} = \tilde{w}_{\rm h} = 1\)

This time, we will see what happens to Player 0 when we give it different numbers of Holy Smite spell cards and tune the AI’s \(w_{\rm dds}\) parameter. We set the value of \(c_{\rm dds} = 0.9\), though it turns out that the results are not too sensitive to this number.

And… here are the results. All simulations are based on 40,000 games played.

| # of HS | \(w_{\rm dds} = 0.5\): P0 Win % | CP 95% Int. | \(w_{\rm dds} = 1.0\): P0 Win % | CP 95% Int. | \(w_{\rm dds} = 2.0\): P0 Win % | CP 95% Int. |
|---|---|---|---|---|---|---|
| 0 | 42.5% | 42.0%–43.0% | 42.5% | 42.0%–43.0% | 42.5% | 42.0%–43.0% |
| 1 | 41.1% | 40.6%–41.6% | 44.6% | 44.1%–45.1% | 42.0% | 41.5%–42.5% |
| 2 | 39.9% | 39.4%–40.4% | 46.5% | 46.1%–47.0% | 40.6% | 40.1%–41.0% |
| 4 | 36.8% | 36.3%–37.2% | 48.7% | 48.2%–49.2% | 36.6% | 36.1%–37.0% |
| 6 | 32.0% | 31.5%–32.4% | 49.0% | 48.5%–49.5% | 28.5% | 28.0%–28.9% |
| 8 | 27.1% | 26.6%–27.5% | 47.2% | 46.7%–47.7% | 19.9% | 19.5%–20.2% |
| 12 | 16.7% | 16.3%–17.0% | 40.2% | 39.7%–40.6% | 6.1% | 5.9%–6.4% |
| 16 | 7.8% | 7.6%–8.1% | 28.2% | 27.8%–28.6% | 1.3% | 1.2%–1.4% |
| 20 | 2.5% | 2.4%–2.7% | 13.8% | 13.4%–14.1% | 0.19% | 0.15%–0.24% |
| 24 | 0.4% | 0.3%–0.5% | 2.9% | 2.8%–3.1% | 0.0% | 0.0%–0.0% |
| 28 | 0.0% | 0.0%–0.0% | 0.1% | 0.1%–0.1% | 0.0% | 0.0%–0.0% |

It’s probably much easier understanding the results visually. Below is a plot of Player 0’s win rate versus the number of Holy Smites it has in its deck:

The effect of varying \(w_{\rm dds}\) is striking. At \(w_{\rm dds} = 0.5\) and \(w_{\rm dds} = 2.0\), the simulations suggest that using any Holy Smite in your deck is actually detrimental to your performance. Meanwhile, using a balanced weight of 1.0 greatly increases the value of Holy Smite, and it becomes preferable to have about 6 of them in your deck. Let’s analyze this in more detail.

In the case of \(w_{\rm dds} = 0.5\), each Holy Smite is valued at 1.9 (remember, the constant is set to 0.9). Thus, the AI is happy to trade (kill) almost any enemy minion it can kill, including a 1-1 minion valued at 2 with the standard enemy board score weights (the AI loses 1.9 because the Holy Smite spell disappears after use, but gains 2 because it killed the enemy minion, a net gain of 0.1). The AI is very trigger happy with Holy Smite and ends up making a lot of unfavorable trades.

In the case of \(w_{\rm dds} = 2.0\), the AI swings to the other extreme. This time, the AI scores a Holy Smite in its hand at 4.9, and therefore will only trade with minions that are 3-2 or stronger. The AI also becomes hesitant to use Holy Smite in combination with a friendly minion to take out an enemy minion. This conservativeness seems to degrade the AI’s performance to the point where the AI is better off not having any Holy Smite in its deck. Often, the AI finds itself being beaten by weak minions while it hesitates to use Holy Smite because it doesn’t see “value” in it.
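The numbers in the two cases above can be checked with a short Python sketch. The function names are hypothetical. The sketch assumes Holy Smite deals 2 damage, that a spell in hand is valued at \(w_{\rm dds} \cdot a + c_{\rm dds}\) (which is what the quoted values 1.9 and 4.9 correspond to), and that enemy minions are scored with the standard weights of 1.

```python
HOLY_SMITE_DAMAGE = 2  # damage dealt by Holy Smite


def spell_hand_value(damage, w_dds, c_dds=0.9):
    """Score of holding a direct damage spell in hand."""
    return w_dds * damage + c_dds


def smite_kill_delta(enemy_attack, enemy_health, w_dds, c_dds=0.9,
                     enemy_w_a=1.0, enemy_w_h=1.0):
    """Score change from spending a Holy Smite that kills an enemy minion.

    The AI gains the enemy minion's board value and loses the spell's hand value.
    """
    gained = enemy_w_a * enemy_attack + enemy_w_h * enemy_health
    lost = spell_hand_value(HOLY_SMITE_DAMAGE, w_dds, c_dds)
    return gained - lost


for w_dds in (0.5, 1.0, 2.0):
    print(w_dds,
          round(spell_hand_value(HOLY_SMITE_DAMAGE, w_dds), 2),
          round(smite_kill_delta(1, 1, w_dds), 2),   # killing a 1-1
          round(smite_kill_delta(3, 2, w_dds), 2))   # killing a 3-2
```

A positive delta means the AI sees the kill as a score gain. With these assumptions, killing a 1-1 is a gain only at the lowest weight, while killing a 3-2 is a gain at all three weights.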

So, the best use of Holy Smite seems to be a flexible middle ground strategy. Note though that having too many in the deck turns ugly fast. With this setup, the optimal number seems to be around 6, but this number is likely dependent on the setup, aka the meta.

Look for value, but don’t go hunting for it.

As usual, I look forward to any discussions on All Things Hearthstone.

HearthSim — Tuning the AI

The HearthSim AI is controlled by various model parameters, and it is difficult if not impossible to find the optimal set of parameters that will perform well under all circumstances. So, we need to be able to break down the parameters and understand them in more manageable chunks.

For those of you unfamiliar with the AI model, see this post.

Today, let’s look at the board score weighting. In the model notation, they are \(w_{\rm a}\), \(w_{\rm h}\), \(\tilde{w}_{\rm a}\), and \(\tilde{w}_{\rm h}\).

Roughly speaking, these weights represent how much you value your minions and your opponent’s minions. If your own board score weights (the \(w\)’s without the tilde) are high compared to your opponent’s board score weights (the \(\tilde{w}\)’s), then you (the AI) are not likely to make trades with your minions; the value (the score) that you lose by losing your minions would be much higher than the score you gain by killing your opponent’s minions. On the other hand, if your board score weights are low, you (again, the AI) are likely to make aggressive trades, as such moves will help to maximize your score. In other words, the difference between the opponent board score weight and the self-board score weight determines how aggressive / control-oriented the AI is.

Here is a numerical experiment. We take two players playing standard minion only decks. Player 0 is going to be controlled by an AI with different weights \(w_{\rm a}\) and \(w_{\rm h}\), while player 1 is going to be controlled by the standard AI with \( w_{\rm a} = w_{\rm h} = \tilde{w}_{\rm a} = \tilde{w}_{\rm h} = 1\). Under this circumstance we record the win rate of player 0. For the sake of brevity, we will take \(w_{\rm a} = w_{\rm h}\) and call them “aggressiveness.” Note that the higher the aggressiveness, the less willing the AI is to trade minions with the opponent instead of going for the face.

The results are summarized below:

| Aggressiveness | P0 Win | P0 Loss | P0 Win % | Clopper-Pearson 95% Interval |
|---|---|---|---|---|
| 0.0 | 1501 | 8499 | 15.01% | 14.32% – 15.73% |
| 0.1 | 2447 | 7553 | 24.47% | 23.63% – 25.32% |
| 0.2 | 3114 | 6886 | 31.14% | 30.23% – 32.06% |
| 0.3 | 3528 | 6472 | 35.28% | 34.34% – 36.23% |
| 0.4 | 8032 | 11968 | 40.16% | 39.48% – 40.84% |
| 0.6 | 18697 | 21303 | 46.74% | 46.25% – 47.23% |
| 0.8 | 19288 | 20712 | 48.22% | 47.73% – 48.71% |
| 0.9 | 19870 | 20130 | 49.68% | 49.18% – 50.17% |
| 1.0 | 8214 | 11786 | 41.07% | 40.39% – 41.76% |
| 1.2 | 14639 | 25361 | 36.60% | 36.13% – 37.07% |
| 1.5 | 5868 | 14134 | 29.34% | 28.71% – 29.97% |
| 2.0 | 4707 | 25296 | 15.68% | 15.27% – 16.10% |

More visually, below is a plot of P0 win rate:

The results are quite interesting. A super non-aggressive AI (low aggressiveness) tends to fare badly against a standard (aggressiveness = 1.0) opponent. This result makes intuitive sense; the AI is trying too hard to clear the opponent’s board and ends up making unfavorable trades. At the opposite end of the spectrum, a super aggressive AI also doesn’t fare too well. This AI’s problem is that it tries too hard to go for the face, ignoring the opponent’s board, and ends up giving the opponent many opportunities to make very favorable trades. So, logically, it makes sense that the best AI should be somewhere between super-aggressive and super-controlling.

The optimal aggressiveness in this setup seems to be around 0.9. To explain this, let’s think about the change in score when a play is made by the AI. When the AI decides to make a minion trade, what happens is that the AI gains score by killing an enemy minion, but also loses score by losing one of its own minions. If the two minions involved in the trade are of the same value (sum of health and attack weighted by their respective weights), the net change in score, assuming aggressiveness of 1, is 0. Thus, in this situation, the AI will think that the trade is not advantageous and will instead go for the face, which is at least score increasing. In summary, an aggressiveness = 1 AI will not make “equal value” trades. On the other hand, an AI with aggressiveness = 0.9 will usually make the equal value trade, because it scores its own minions slightly less than the enemy minions of equal value.
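The equal value trade argument can be sketched in a few lines of Python. The function names are hypothetical. The sketch assumes both minions die in the trade, enemy board weights of 1, and an enemy hero health weight of 0.1 as described in the intro post.

```python
def minion_value(attack, health, weight):
    """Board value of a minion when attack and health share one weight."""
    return weight * attack + weight * health


def even_trade_delta(attack, health, aggressiveness, enemy_weight=1.0):
    """Score change when a friendly minion trades with an identical enemy minion."""
    gained = minion_value(attack, health, enemy_weight)
    lost = minion_value(attack, health, aggressiveness)
    return gained - lost


def face_delta(attack, enemy_hero_weight=0.1):
    """Score change from hitting the enemy hero instead."""
    return enemy_hero_weight * attack


def prefers_trade(attack, health, aggressiveness):
    """True if the even trade scores higher than going for the face."""
    return even_trade_delta(attack, health, aggressiveness) > face_delta(attack)


print(prefers_trade(3, 2, 1.0))  # aggressiveness 1.0
print(prefers_trade(3, 2, 0.9))  # aggressiveness 0.9
```

For a 3/2 minion, the trade changes the score by 0 at aggressiveness 1.0 and by about 0.5 at aggressiveness 0.9, against about 0.3 for hitting the hero, so only the 0.9 AI takes the trade.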

The fact that an aggressiveness = 0.9 AI fares better than an aggressiveness = 1.0 AI tells us that, when there is an equal value trade, it is usually better to make the trade than to go for the face. In fact, the results suggest that it is usually correct to go for a 2 for 1 trade as well, since an aggressiveness = 0.5 AI still performs better than an aggressiveness = 1.0 AI. Somewhat counterintuitive, but not wholly unbelievable.

HearthSim — The Power of “The Coin”

Let’s take a look at the most basic card of all, The Coin.


The Coin is a 0 mana spell that gives you one mana crystal upon use. The player who goes second gets The Coin in his starting hand.

How valuable is The Coin? To get a hint, we run HearthSim with the following setup:

  • The simulator model described here
  • Both players have no hero ability
  • Player 0 and player 1 decks consisting of random sets of minions with attack value between 1 and 4, health value between 1 and 3, and mana cost equal to the sum of attack and health, divided by 2.
  • Player 0 goes first, gets 3 cards.
  • Player 1 goes second, gets 4 cards.
  • Both players use the standard AI, with \(w_{\rm a} = w_{\rm h} = \tilde{w}_{\rm a} = \tilde{w}_{\rm h} = 1\)

We compare two cases: Case 1 where Player 1 gets The Coin, and Case 2 where Player 1 does not.

The results are summarized below:

| | With The Coin | Without The Coin |
|---|---|---|
| P0 wins | 8214 | 24569 |
| P0 losses | 11786 | 15431 |
| P0 win % | 41.07% | 61.42% |
| Clopper-Pearson 95% Interval | 40.39% – 41.75% | 60.94% – 61.90% |

The numbers are striking. The Coin basically makes what used to be a first player advantage into a first player disadvantage. Not bad for a single card!

As usual, I welcome any discussions on All Things Hearthstone!

HearthSim — Intro

HearthSim is a general Hearthstone simulator that I am developing. It is designed to allow one to perform simplified theoretical analysis on Hearthstone decks.

The Setup

The simulator pits two AI controlled players against each other, playing a large number of simulated games. The game setup is similar to that of Hearthstone, but mulliganing is not supported at this time due to the difficulty in modeling it.

The AI Model

Currently, the AI is a simple, rudimentary score maximizing model. On each turn, the AI simulates all the possible moves that can be played and assigns scores to them based on how good the outcome is. Thus, the AI’s performance is going to be determined mostly by the function used to assign the scores.

Scoring Function

Note: Henceforth, variables with tilde denote values associated with the opponent (enemy).

The scoring function can be written as
$$S = S_{\rm b} + \tilde{S}_{\rm b} + S_{\rm c} + S_{\rm h} + \tilde{S}_{\rm h} $$
where \(S_{\rm b}\) is the friendly board score, \(\tilde{S}_{\rm b}\) is the enemy board score, \(S_{\rm c}\) is the hand’s score, \(\tilde{S}_{\rm h}\) is the enemy health score, and \(S_{\rm h}\) is your own hero’s health score. Let’s go through each one in a little more detail.

\(S_{\rm b}\) — Friendly board score

The friendly board score is the weighted sum of the attack and health values of the friendly minions that you have out on the battlefield. In other words,
$$S_{\rm b} = \sum_{i} (w_{\rm a} a_{i} + w_{\rm h} h_{i})$$
where \(a_{i}\) is the attack value and \(h_{i}\) is the health value of friendly minion \(i\). The attack and health values are weighted by \(w_{\rm a}\) and \(w_{\rm h}\) respectively. The rationale for this score function is straightforward. The more attack value and health value your minions have, the better it is for you. The weightings are tunable parameters so that the AI can potentially have different focus: one AI might value health values more and play a very control oriented game, while another AI can value attack values more and play an aggressive game. The weightings are currently set to 1.

\(\tilde{S}_{\rm b}\) — Enemy board score

The enemy board score is pretty much the same as the friendly board score, except it is the negative of it. That is,
$$\tilde{S}_{\rm b} = -\sum_{j} (\tilde{w}_{\rm a} \tilde{a}_{j} + \tilde{w}_{\rm h} \tilde{h}_{j})$$
for all enemy minion \(j\). In other words, the higher the health and the attack values of your opponent’s minions, the worse it is for you.

\(S_{\rm c}\) — Hand (card) score

In general, the more cards you have in your hand, the better it is for you. So, the hand score is pretty straightforward:
$$S_{\rm c} = w_m \sum_{k} m_{k}$$
where \(m_k\) is the mana cost of card \(k\). It’s debatable whether higher mana cost cards are actually more “valuable” or not, but for now this seems to work pretty well.

\(\tilde{S}_{\rm h}\) — Enemy health score

Most often, the less health the enemy hero has, the better it is for you. The enemy health score captures this observation:
$$\tilde{S}_{\rm h} = \begin{cases} -\tilde{w}_{\rm hh} \tilde{h} & \text{if } \tilde{h} > 0 \\ \infty & \text{if } \tilde{h} \le 0 \end{cases} $$
where \(\tilde{h}\) is the enemy hero’s health value and the weight \(\tilde{w}_{\rm hh}\) is taken to be \(0.1\). The infinity is there to make sure that if there is a lethal, then it is always picked. In practice, it’s not infinity but rather a large positive number like 1e12.

\(S_{\rm h}\) — Your hero health score

Most often, the more health your hero has, the better. So,
$$S_{\rm h} = \begin{cases} w_{\rm hh} h_{\rm s} & \text{if } h_{\rm s} > 0 \\ -\infty & \text{if } h_{\rm s} \le 0 \end{cases} $$
where \(h_{\rm s}\) is your own hero’s health value and the weight \(w_{\rm hh}\) is taken to be \(0.1\). The negative infinity is there to make sure that you never do anything that would kill yourself.
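Putting the five terms together, the scoring function can be sketched in Python as follows. This is an illustration, not the HearthSim source: the `Side` class and all function names are made up, the hand weight \(w_m\) is assumed to be 1 because its value is not stated here, and a large constant stands in for infinity as described above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

LARGE = 1e12  # stand-in for infinity


@dataclass
class Side:
    hero_health: int
    minions: List[Tuple[int, int]] = field(default_factory=list)  # (attack, health)
    hand_costs: List[int] = field(default_factory=list)           # mana costs in hand


def board_value(minions, w_a, w_h):
    """Weighted sum of attack and health over all minions on one side."""
    return sum(w_a * attack + w_h * health for attack, health in minions)


def score(me, enemy, w_a=1.0, w_h=1.0, enemy_w_a=1.0, enemy_w_h=1.0,
          w_m=1.0, w_hh=0.1, enemy_w_hh=0.1):
    """Total score: friendly board, enemy board, hand, and both hero health terms."""
    friendly_board = board_value(me.minions, w_a, w_h)
    enemy_board = -board_value(enemy.minions, enemy_w_a, enemy_w_h)
    hand = w_m * sum(me.hand_costs)
    # Lethal on the enemy hero always wins out over any other consideration.
    enemy_health = LARGE if enemy.hero_health <= 0 else -enemy_w_hh * enemy.hero_health
    # Never pick a line of play that kills our own hero.
    own_health = -LARGE if me.hero_health <= 0 else w_hh * me.hero_health
    return friendly_board + enemy_board + hand + own_health + enemy_health
```

For example, with a 3/2 minion on board, one 4-mana card in hand, and both heroes at 30 health, facing a single 1/1, the terms are 5, -2, 4, 3, and -3, for a total of 7.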

Move Generation

The simulation AI engine tries to brute force all possible moves that one can make given the current hand and board in the current turn. The number of possible moves scales exponentially with the number of minions on board and cards in hand, so the AI is limited to 30 seconds of computation, after which it gives up and plays the best sequence of moves that it has found thus far. On my MacBook Pro, this limits the total number of candidate moves to about 1.5 million moves per turn. It’s fair enough for now… after all, there is a 30 second limit per turn in the real game too.
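A time-limited brute force search of this kind can be sketched generically in Python. The interface below (`legal_moves`, `apply_move`, `score` passed in as functions) and the toy "play cards with the available mana" game are assumptions made for illustration. The actual HearthSim engine is organized differently.

```python
import time


def best_sequence(state, legal_moves, apply_move, score, time_limit_s=30.0):
    """Depth-first search over move sequences within one turn.

    Returns (best score, list of moves) found before the deadline.
    """
    deadline = time.monotonic() + time_limit_s
    best = (score(state), [])
    stack = [(state, [])]
    while stack and time.monotonic() < deadline:
        current, path = stack.pop()
        for move in legal_moves(current):
            nxt = apply_move(current, move)
            nxt_path = path + [move]
            nxt_score = score(nxt)
            if nxt_score > best[0]:
                best = (nxt_score, nxt_path)
            stack.append((nxt, nxt_path))
    return best


# Toy game: state = (mana left, card costs in hand, total cost played so far).
def toy_moves(state):
    mana, hand, _ = state
    return sorted({cost for cost in hand if cost <= mana})


def toy_apply(state, cost):
    mana, hand, played = state
    i = hand.index(cost)
    return (mana - cost, hand[:i] + hand[i + 1:], played + cost)


def toy_score(state):
    return state[2]


print(best_sequence((5, (2, 3, 4), 0), toy_moves, toy_apply, toy_score))
```

In the toy game with 5 mana and cards costing 2, 3, and 4, the search finds that playing the 3 and the 2 uses all the mana. If the deadline passes before the search finishes, the function simply returns the best sequence seen so far, which mirrors the give-up behavior described above.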

One thing that the AI does not consider at this time is the possible moves beyond the current turn. I’m working on modeling this, but the sheer number of possibilities prohibits a brute-force approach. For now, it’s ok because I’m only looking at simple, idealized cases, but it’s definitely something I’m working on.

The Code

I plan to open up the code base in the near future.

Discussion

Head over to All Things Hearthstone on Versify to discuss or ask me any questions. I’ll be happy to answer anything.