Computing bounds for ranks and TI qualification of the DPC 2023 Tour 3

with Integer Programming

Tim
9 min readJun 5, 2023

Introduction

The Elimination Problem in sports asks whether a team can still achieve a particular result (relegation, qualification, winning a league, …) given the current standings of the league and its structure.
In DotA, the major goal of many teams during a season is to qualify for The International, as its prize money and prestige are unmatched. This motivates the question to find individual provable bounds for qualification to TI — in other words: How many points does a certain team need to be qualified for TI right now?

So far, this problem has been solved by hand and/or by exhausting all possible options. This comes at the cost of redoing all steps whenever information is updated as the DPC goes on. It is also computationally expensive, so it is effectively only possible at the very end of a season, when most games have already been played.
My goal was to find a solution that would:

a) be able to run on a home computer in seconds
b) provide a general framework for all types of questions one might ask
c) provide proven optimal bounds and corresponding scenarios
d) be applicable in a more general context (other sports, future seasons, …)

A brief intro to Modelling & Integer Programming

A method suitable for (b-d) is Discrete Optimization (with (a), we can never be so sure). Unfortunately, there is no shortcut to explaining Discrete Optimization / Integer Programming in a 10-min read article like this.
The main takeaway is that provided we can precisely formulate the rules of our problem through a set of (preferably linear) inequalities, equalities, and decision variables, we might be able to find the proven optimal value of our target function. A minimal modelling example:

Suppose our league consists of only two teams, PotM Bottom and Ivy, who play two Bo1s against each other. Each win grants 3 points.
We model each game with a decision variable, that is 0 if PotM wins and 1 if Ivy wins. We express the number of points that both teams end up with as:

Let us think of questions to ask the model, i.e., define our objective function. One might be: What is the maximum amount of points Ivy can get?

We have now modeled our example and defined an objective. It’s time to throw a solver at it. Below is a small code example in Python using Gurobi.

This yields the optimum of 6, meaning the maximum amount of points Ivy can get is 6. Admittedly there was no need to model or even compute this particular example, but it serves to introduce the concepts of decision variables, objective function, and optimality.

Further, we can examine the values of our decision variables in the solution found (they will be equal to 1 each, as Ivy wins both games). Note that in this example, we found one optimal solution (Solution count 1), meaning it is a unique optimal solution. This may not always be the case.
If we had asked: Is it possible for Ivy to score exactly 3 points? We would get two possible solutions, winning game 1 and losing game 2 and vice versa.

We can also ask: Is it possible for Ivy to score 2 points? or Is it possible for Ivy to score more than 6 points? Both of which will yield infeasibility, meaning that this outcome is proven to not be possible.
This will become interesting later on, as it allows us to ask questions such as Is it possible for [some Team] to qualify while certain other conditions hold?
Alternatively, we will be able to compute bounds by asking What is the maximum number of points [some Team] can obtain without qualifying for TI?

Modelling the DPC

Ok, if you made it this far, you are probably either a Liquipedia editor, CS major or a really dedicated nerd. Much appreciated. Before we tackle the mess that is the DPC, here is a brief literature review: There are a few papers on related problems, my first source and inspiration is:
Gotzes, Uwe, and Kai Hoppmann. “Bounding the final rank during a round robin tournament with integer programming.” Operational Research (2022): 1–9
An overview of different sports and league formats is given by:
Raack, Christian, et al. “Standings in sports competitions using integer programming.” Journal of Quantitative Analysis in Sports 10.2 (2014): 131–137
To the best of my knowledge, there is no publication that deals with a format as complex as the DPC. However, modelling the DPC in that way is mostly just diligent work and builds on the same foundation.

For the modelling part, we will start with a single division and then work our way up, adding the major and the overall standings (I thought about how to structure this article and thought it might be best to use illustrative examples and code with increasing complexity). I will not explain the structure of the DPC; if you are not yet familiar with its format, read this.

Modelling a single regional league

The most obvious choice for a regional league might be to model every single game with a binary decision variable, as we did in the first example. This is also the approach that most literature follows.
A simpler solution is to model the final placement of teams in their league as they determine the number of DPC points earned. We introduce binary decision variables for each combination of team and place in the league. It will be 1 if that team achieves said place and 0 otherwise. We can then write the amount of DPC points a team earns as:

Lets take DPC SA 2023 Tour 3 Division I as an example:

We have now modeled standings and DPC points for a single league. Actually, we haven’t — If you have been paying attention, you might have noticed that at no point have we formulated that each place has to be assigned to one and only one team. With the logic described above, it would be entirely possible for everyone to score 1st place at the same time.
Our matrix structure helps us to understand and fix this problem. We demand that each place has to be assigned to one and only one team — the column sum has to be equal to 1 for all columns in our matrix. Further, the row sums have to be less or equal than 1, for all rows so that one team cannot be assigned multiple places:

We add these constraints to our model.
Finally, since we are on Tour 3, we must add the current DPC points to the equation. Later, we will need a binary variable that tells us whether a team is qualified for Bali Major. We can add the first two columns of our matrix to do so since the result will be exactly 1 if a team is 1st or 2nd place (and thus qualified) and 0 otherwise. Summarizing:

Again, here is the code for the example discussed:

We are making all this effort to model a single league compactly (using as few variables as possible and not using fancy constraints) because scaling up to the entire DPC will significantly improve computational time. Initially, I had coded the rules of the format straightforwardly. The resulting model solved some questions in seconds while others did not terminate in 10h+. Through condensing the model, variable bounds, and manual reformulation of IF/OR logic, the worst-case runtime is now in the order of seconds. Note that the runtime greatly varies depending on the question asked.

To model an ongoing league, we may add constraints describing its current state (like EG will come 1st or 2nd). For that, we might consider a sub-model that considers individual games and infer the possible outcomes of the league. This way, we get the best of both worlds.

Mathematical modelling can get messy if you don’t keep track of indices.

Modelling all regional leagues

Analog to the example on South America, we can repeat the same steps for all other regional leagues. Differences occur in the allocation of qualifier slots, and we have to add the current DPC standings manually.

Modelling Bali Major

Once again, we can apply the same structure for the regional leagues. This time, the allocated points will differ. To ensure that only teams who qualified at Bali also play and earn points there, we do the following: We pretend all teams were playing at Bali, but only those who qualified may score points. This we can ensure by demanding the row sums of our Bali matrix (analog to the one described for regional leagues) to be less or equal than q_t (binary that is 1 if team t is qualified):

If team t is not qualified, then q_t = 0, and the constraint will make sure that the team cannot score any points at Bali Major. If a team is qualified, then q_t = 1, and the team may score but does not have to (since not all teams who qualify for Bali will earn points there).

(Note that the placement variables here differ from the ones in the regional leagues, and the teams here include all teams, not just those from one region. This is to avoid overloading the formulation with indices in this example, but formally incorrect.)

We update the equation counting the DPC points for each team as:

Modelling ranks in the DPC

As the top 12 teams directly qualify for TI, we need to develop a way to formulate constraints that address a team’s rank on the DPC ladder. We do this by introducing a new variable, rank, and comparing a team’s DPC points with the DPC points of each other team. rank of a team is the number of comparisons where our team had fewer DPC points than the team we compared to (try this yourself to check that it works). The formulation follows the Big-M method with suitable M:

To compute the maximum number of DPC points a team can get without qualifying for TI, we formulate Maximize the number of DPC points for team t while team t has a rank greater than 12! and run the model for all teams.

This last step is left as an exercise for the reader while I clean my code. It will be uploaded here eventually. The final model cannot only compute bounds for qualification to TI but may answer any question you can think of and formulate addressing rank and DPC points.

As a final note, this is a relatively informal write-up; therefore, some minor details are left out. The model also does not account for teams disbanding or players changing teams.

The list of team-wise thresholds changes daily as we approach the end of tour 3. I do not maintain an updated list (only on request); therefore, Liquipedia is likely more up-to-date than my last computational results. If anyone is interested in automating the process for the next season, i.e., setting up a variant of the program on a server to be automatically updated with match data, I would be happy to collaborate.
[Update, September 2023: The DPC format will be discontinued in 2024. What qualification system will take its place remains to be seen. In any case, the techniques described above might be suitable then and are so for other sports with comparable league & qualification structures.]

If you actually read through the end, here is a small bee.

--

--