In this post I explore *mixed strategy Nash equilibria* for three
strategies – all-in, standard, and defensive. Games often feature these
archetypal strategies with a rock-paper-scissors relationship between
them. To make things more interesting, I also introduce other variables
like player skill and added randomness.

Disclaimer: This is a fun project. I'm no expert on game theory. I was interested in what this would look like, how I would implement and visualize it, and what I would learn along the way.

Rock-paper-scissors

Rock-paper-scissors is a simple game that serves as a good example. There are three strategies, each countering exactly one other with a 100% winrate: rock < paper < scissors < rock.

We can represent that with a payoff matrix. We don't have to use winrates as payoffs, but they will come in handy later. On the diagonal (rock-rock, paper-paper, scissors-scissors), I have put 50% for a draw. Also, since this is a zero-sum game, I don't have to write payoffs for the second player – they are simply 100% minus the first player's payoffs (winrates adding up to 100%).

| [P1↓ P2→] | Rock | Paper | Scissors |
| --- | --- | --- | --- |
| Rock | 50% | 0% | 100% |
| Paper | 100% | 50% | 0% |
| Scissors | 0% | 100% | 50% |

Going for one *pure strategy*, for example always playing rock,
isn't a good idea, as it can be easily exploited by the opponent. A
game-theoretic calculation shows that if both players play smart, they
should choose each sign with equal probability.

That's a *mixed strategy Nash equilibrium*. It's "mixed" because
it's mixing different strategies – like going sometimes rock and sometimes
paper. It's a *Nash equilibrium* because there is no incentive for
one player to deviate from it given the opponent's strategy.
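To make that concrete, a few lines of Python can verify the equilibrium claim: against the uniform mix, every pure strategy earns the same 50% expected winrate, so neither player gains anything by deviating.

```python
# Rock-paper-scissors payoff matrix for player 1, as winrates (%).
# Rows: P1's choice; columns: P2's choice.
RPS = [
    [50, 0, 100],   # rock
    [100, 50, 0],   # paper
    [0, 100, 50],   # scissors
]

uniform = [1 / 3, 1 / 3, 1 / 3]

# Expected winrate of each pure strategy against the uniform mix.
expected = [sum(p * payoff for p, payoff in zip(uniform, row)) for row in RPS]
print(expected)  # every pure strategy lands on (roughly, up to float error) 50%
```

Since every pure strategy does equally well against the uniform mix, so does every mix of them – there is no profitable deviation.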

Model

This model is similar to rock-paper-scissors: we still have two players and three strategies. However, to make it more interesting, (1) no strategy will have a 100% winrate against another, (2) we will take player skill into account, and (3) one strategy will be more affected by randomness.

How does that work?

- Three strategies (all-in, defensive, and standard)
- Player skill is represented by ELO rating
- Strategy advantage over another is represented as a bonus to ELO rating
- The all-in strategy represents a strategy where randomness plays a significant role. This is done by reducing the ELO rating difference by 50% when there is one all-in played, and by 75% with two all-ins.
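The mechanics above can be sketched in a few lines of Python: the standard ELO expected-score formula, plus the skill-difference reduction for all-ins (function names are mine, not necessarily those in the post's repository).

```python
def elo_winrate(diff: float) -> float:
    """Standard ELO expected score for a player `diff` rating points ahead."""
    return 1 / (1 + 10 ** (-diff / 400))

def effective_diff(diff: float, num_allins: int) -> float:
    """Reduce the skill difference by 50% for one all-in, 75% for two."""
    factor = {0: 1.0, 1: 0.5, 2: 0.25}[num_allins]
    return diff * factor

# A 200-point favorite in a standard-vs-standard game:
print(round(elo_winrate(effective_diff(200, 0)), 2))  # 0.76
# The same favorite when both players all-in (difference cut to 50 points):
print(round(elo_winrate(effective_diff(200, 2)), 2))  # 0.57
```

Note how both all-ining pulls the favorite's winrate from 76% down toward the coin-flip line.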

I have chosen ELO because of its simplicity. It's an easy way to
represent player skill, translate it into winrates, and represent strategy
matchups as giving a certain ELO rating bonus. Alternatively, I could have
used TrueSkill with *mu* as skill, and all-ins increasing *sigma*
or *beta* for given matchups.

The randomness behind the reduced skill difference for all-ins can come from, for example, coin-flip builds, failing to scout a hidden building, or committing to a single early rush with minimal player interaction – the better player might not get the chance to outplay the opponent, and the randomness of those few interactions won't get averaged out.

To represent the relationship between strategies: all-in < defensive < standard < all-in, I have chosen these payoffs (winrates) for equally skilled opponents:

| [P1↓ P2→] | All-in | Standard | Defensive |
| --- | --- | --- | --- |
| All-in | 50% | 55% | 35% |
| Standard | 45% | 50% | 60% |
| Defensive | 65% | 40% | 50% |

And this is how it looks when there is a 200 ELO difference between players:

| [P1↓ P2→] | All-in | Standard | Defensive |
| --- | --- | --- | --- |
| All-in | 57% | 68% | 49% |
| Standard | 59% | 76% | 83% |
| Defensive | 77% | 68% | 76% |

As expected, the matrix now favors player one, and the values on the diagonal aren't the same anymore. Since the all-in strategy is more affected by randomness, two players all-ining each other produces a more random game, and hence a winrate closer to 50% than a standard-standard game.
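The two tables above are consistent with a simple construction: turn each base winrate into an ELO bonus via the inverse of the ELO formula, reduce the raw skill difference for all-ins, add the bonus, and convert back to a winrate. Here is a sketch of that reading – my reconstruction, not necessarily the exact code from the repository:

```python
import math

# Base winrates for equally skilled players (from the first table).
BASE = {
    ("all-in", "all-in"): 0.50, ("all-in", "standard"): 0.55, ("all-in", "defensive"): 0.35,
    ("standard", "all-in"): 0.45, ("standard", "standard"): 0.50, ("standard", "defensive"): 0.60,
    ("defensive", "all-in"): 0.65, ("defensive", "standard"): 0.40, ("defensive", "defensive"): 0.50,
}

def to_bonus(winrate):
    """Inverse of the ELO formula: turn a winrate into a rating difference."""
    return 400 * math.log10(winrate / (1 - winrate))

def matchup_winrate(elo_diff, s1, s2):
    """P1's winrate given a raw ELO lead and both players' strategies."""
    allins = (s1 == "all-in") + (s2 == "all-in")
    reduced = elo_diff * {0: 1.0, 1: 0.5, 2: 0.25}[allins]  # all-in randomness
    d = reduced + to_bonus(BASE[(s1, s2)])
    return 1 / (1 + 10 ** (-d / 400))

strategies = ("all-in", "standard", "defensive")
for s1 in strategies:
    print([round(100 * matchup_winrate(200, s1, s2)) for s2 in strategies])
# [57, 68, 49]
# [59, 76, 83]
# [77, 68, 76]
```

Rounded to whole percent, this reproduces all nine cells of the 200-ELO table.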

★ ★ ★

Now what if we tried to find mixed strategy Nash equilibria for different ELO differences between players? It would show how players should mix their strategies based on how good or bad their opponent is. This assumes both players are rational and know all of this.
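For a fully mixed equilibrium in a zero-sum game, the row player's mix must make the opponent indifferent: every column of the payoff matrix earns exactly the game value against it, and the probabilities sum to 1. A naive NumPy sketch of that idea (assuming the equilibrium uses all three strategies, which holds for the even-skill matrix above):

```python
import numpy as np

# Base payoff matrix for P1 (winrates in %), equally skilled players.
A = np.array([
    [50, 55, 35],   # all-in
    [45, 50, 60],   # standard
    [65, 40, 50],   # defensive
], dtype=float)

# Unknowns: P1's mix (x1, x2, x3) and the game value v.
# Indifference: each column of A earns exactly v against the mix,
# plus the constraint that the probabilities sum to 1.
M = np.block([
    [A.T, -np.ones((3, 1))],
    [np.ones((1, 3)), np.zeros((1, 1))],
])
x1, x2, x3, v = np.linalg.solve(M, np.array([0.0, 0.0, 0.0, 1.0]))
print(x1, x2, x3, v)  # all-in 1/3, standard 1/2, defensive 1/6, value 50%
```

This shortcut breaks down once a strategy drops out of the support (its probability would go negative), which is exactly what happens at larger ELO differences – there a proper solver has to check smaller supports too.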

The second chart shows the opponent's strategy mix, letting you compare which mixes of strategies face each other.

A significantly worse player (> 355 ELO difference) should always all-in, as that effectively reduces the skill difference between the players. At the other end of the spectrum, the better player is better off always playing defensively: they can outplay the opponent later in the game, so it's all about surviving the all-in.

In this model there are 5 phases where different strategies are viable:

| ELO difference | Viable strategies |
| --- | --- |
| ? – -355 | All-in |
| -355 – -69 | All-in, Standard |
| -69 – 70 | All-in, Standard, Defensive |
| 70 – 356 | Standard, Defensive |
| 356 – ? | Defensive |

This shows the importance of good matchmaking, as the most strategies are viable in even matches. The same goes for close matches in tournaments.

Surprisingly, in phase 2 we see the number of all-ins increase as the player faces worse and worse opponents. It's caused by the rise of the standard strategy for the opposing player. This shows that the dynamics, even with just three strategies, might not be intuitive.

Another counter-intuitive thing is that a strategy being buffed against
another can lead to the buffed strategy being used less. It will cause its
counter-strategy to be used more and subsequently stifle the buffed
strategy. Here is an interesting example where all-in is significantly
improved but only against the standard strategy. This made all-in better
for some ELO differences, but significantly worse at others where the
defensive strategy became more frequent.

For comparison, I will also include how it looks when all strategies are affected by randomness exactly the same way. In this case, all strategies stay viable at any ELO difference. However, I don't think that's a realistic assumption for most RTS games.

The impact of randomness

Let's plot how winrate scales for certain strategies that are affected differently by randomness.

It's not surprising that the more a strategy is affected by randomness, the closer its winrate is to the 50% line. Flipping a coin would be fully on the 50% line regardless of the player's skill (a 100% skill-difference reduction).

This is an interesting result as well – mixed strategies scale pretty much the same as standard vs standard. You would think that by choosing strategies at random, you would introduce some randomness and move closer to the 50% line. But for this effect to be visible I had to significantly increase the winrates of strategies that counter each other (from ~55% to 99% and 99.999999% winrates in the two following charts).

Other pure strategies are stacked under the all-in curve as there is no difference here.

Final words

Thank you for reading. My main goal for this post was to see how I would implement and visualize this, and to learn things along the way. And there were things I didn't expect – the effect of randomness added to a certain strategy, and the sometimes non-obvious behavior of mixed strategies.

Here is the __repository with my code__. This includes naive solutions to 2x2 and 3x3 payoff matrices. If I wanted the project to scale to more strategies, I would be smarter about that or use a library. Overall it was a fun project and the charts look nice. This cannot be directly applied to a game like StarCraft II, where players aren't fully rational agents, builds are more on a continuous scale, and there are other variables like balance, maps, and more.

Links to check out: