OBJECTIVE
In my previous blog post I suggested that you can improve your predictions of the World Cup Group Stage by never predicting ties. In this follow-up post I analyze the bracket phase of the World Cup (which does not allow ties), and propose that if you always select the team with more Group Stage points, you have a higher probability of making the right prediction than if you guess randomly.
ABOUT THE BRACKET
In the Group Stage of the World Cup, each team accumulates 3 points for every win, 1 for every tie and 0 for every loss. Since each team plays three times, they can accumulate 0, 1, 2, 3, 4, 5, 6, 7 and 9 points. Note that there is no outcome that provides 8 points.
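The list of reachable totals can be checked by enumerating every combination of results over three games. This is only a small sketch; the names `POINTS_PER_RESULT` and `possible_group_totals` are mine and are not part of the original analysis.

```python
from itertools import product

# Points awarded per Group Stage result, as described above.
POINTS_PER_RESULT = {"win": 3, "tie": 1, "loss": 0}

def possible_group_totals(games=3):
    """Return every distinct point total reachable over `games` matches."""
    totals = set()
    for results in product(POINTS_PER_RESULT, repeat=games):
        totals.add(sum(POINTS_PER_RESULT[r] for r in results))
    return sorted(totals)

totals = possible_group_totals()
print(totals)       # [0, 1, 2, 3, 4, 5, 6, 7, 9]
print(8 in totals)  # False
```

Eight points would require two wins and a two-point result, and no single game is worth two points.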
After the Group Stage, 16 teams move into the Round of 16, which has eight games; the winners move on to the Quarter-Finals, while the losing teams go home. The Quarter-Finals have four games, and the winners move to the two Semi-Final games. Finally, the two losing teams of the Semi-Finals play for Third Place, while the winning teams play in the Final. There are 16 games in this stage of the tournament, and ties are not an option: there must always be a winning team. If a user is randomly guessing the outcomes, on average they will get only half (8 games, or 50%) of the games right.
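The random-guessing baseline can be written down explicitly. The sketch below assumes each of the 16 picks is an independent coin flip with a 50% chance of being right (my simplifying assumption); the function names are hypothetical.

```python
from math import comb

def expected_correct(games=16, p=0.5):
    """Expected number of correct picks when each pick is right with probability p."""
    return games * p

def prob_at_least(k, games=16, p=0.5):
    """Probability of getting at least k of `games` picks right (binomial tail)."""
    return sum(
        comb(games, i) * p**i * (1 - p) ** (games - i)
        for i in range(k, games + 1)
    )

print(expected_correct())  # 8.0
# Chance that pure guessing beats its own average (9 or more correct out of 16).
print(round(prob_at_least(9), 3))
```

The second number is a useful yardstick: any strategy should be judged against how often plain guessing would do at least as well.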
DATA
For this study I used a data set of the outcome of World Cup games obtained from Kaggle. I only used the data for the years between 1986 and 2014. This data is used to explore patterns about the game outcomes.
MODEL
Each match has two teams playing, and each team has accumulated points from the Group Stage. Let's say team A has 6 Group Stage points and is playing against team B who has 4 Group Stage points. Does team A have a higher probability of winning that match?
I analyzed all the games per stage for the World Cups from 1986 through 2014, and calculated the percentage of times that the team with more Group Stage points won. I ignored the games where both teams had the same Group Stage points, such as team A with 5 points playing against team B, also with 5 points.
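The calculation described above can be sketched as follows. I am assuming each match is stored as a `(stage, winner_points, loser_points)` tuple, which is my own layout and not the format of the Kaggle file. The sample rows are taken from the 2018 tables later in this post, not from the 1986 to 2014 data, so the printed rates are not the ones plotted in Figure 1.

```python
from collections import defaultdict

def higher_points_win_rate(matches):
    """Share of decided games per stage won by the team with more Group Stage points.

    Games where both teams have equal Group Stage points are ignored.
    """
    decided = defaultdict(int)
    won_by_higher = defaultdict(int)
    for stage, winner_points, loser_points in matches:
        if winner_points == loser_points:
            continue  # equal points: excluded from the analysis
        decided[stage] += 1
        if winner_points > loser_points:
            won_by_higher[stage] += 1
    return {stage: won_by_higher[stage] / decided[stage] for stage in decided}

# Illustrative input: (stage, winner's points, loser's points) from the 2018 tables.
sample_matches = [
    ("Round of 16", 9, 5), ("Round of 16", 7, 4), ("Round of 16", 7, 6),
    ("Round of 16", 9, 4), ("Round of 16", 6, 5), ("Round of 16", 9, 5),
    ("Round of 16", 6, 5), ("Round of 16", 6, 6),
    ("Quarter-Finals", 7, 9), ("Quarter-Finals", 9, 7),
    ("Quarter-Finals", 9, 6), ("Quarter-Finals", 6, 6),
]

rates = higher_points_win_rate(sample_matches)
for stage, rate in rates.items():
    print(f"{stage}: {rate:.0%}")
```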
In Figure 1 below, I present the 5 stages of the bracket games (Round of 16, Quarter-Finals, Semi-Finals, Third Place and Final). The red bar indicates the percentage of games won by the team with higher Group Stage points, while the gray bar indicates the percentage of games won by the team with the lower Group Stage points. The white dashed line indicates the 50% threshold.
According to the results above, if you always select the team with the highest Group Stage points, you will have a greater than 50% chance of making the right prediction for all stages, except for the Quarter-Finals, where it would have been a toss-up.
So, how would we have done in the 2018 World Cup predictions?
Round of 16
In this phase we would have correctly predicted 7 of the 8 games. The eighth game was a toss-up between two teams with the same Group Stage points (England-Colombia).
Winning Team | Winning Team Group Stage Points | Losing Team Group Stage Points | Losing Team | Model Outcome |
---|---|---|---|---|
Uruguay | 9 | 5 | Portugal | Predicted |
France | 7 | 4 | Argentina | Predicted |
Brazil | 7 | 6 | Mexico | Predicted |
Belgium | 9 | 4 | Japan | Predicted |
Russia | 6 | 5 | Spain | Predicted |
Croatia | 9 | 5 | Denmark | Predicted |
Sweden | 6 | 5 | Switzerland | Predicted |
England | 6 | 6 | Colombia | Toss-up |
Quarter-Finals
Figure 1 indicates a 50% chance of predicting the Quarter-Finals based on this model. Here the model correctly predicted 2 out of 4 games (in line with the 50%). It missed one game (France-Uruguay), and the last game was a toss-up because both teams had the same number of Group Stage points (England-Sweden).
Winning Team | Winning Team Group Stage Points | Losing Team Group Stage Points | Losing Team | Model Outcome |
---|---|---|---|---|
France | 7 | 9 | Uruguay | Not Predicted |
Belgium | 9 | 7 | Brazil | Predicted |
Croatia | 9 | 6 | Russia | Predicted |
England | 6 | 6 | Sweden | Toss-up |
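As a quick tally of the two tables above, the sketch below counts the "Model Outcome" labels across the Round of 16 and Quarter-Finals. The variable names are mine; the labels are copied from the tables.

```python
from collections import Counter

# "Model Outcome" column of the two 2018 tables above.
round_of_16 = ["Predicted"] * 7 + ["Toss-up"]
quarter_finals = ["Not Predicted", "Predicted", "Predicted", "Toss-up"]

tally = Counter(round_of_16 + quarter_finals)
total_games = sum(tally.values())

# Toss-ups count as correct only if the coin flip happens to land right,
# so the number of correct picks lies between these two bounds.
lower_bound = tally["Predicted"]
upper_bound = tally["Predicted"] + tally["Toss-up"]

print(f"{lower_bound} to {upper_bound} of {total_games} games")  # 9 to 11 of 12 games
print(f"random guessing expects {total_games / 2:.0f}")          # random guessing expects 6
```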
Semi-Finals, Third Place and Final
So far the model would have predicted 9 to 11 of the 12 games in the Round of 16 and Quarter-Finals (9 outright, plus up to 2 more depending on how the two toss-ups are called). Since we are entering the last week of the tournament with four remaining games, I can use the model to make the following predictions (today's date is 7/8/2018):
Winning Team | Winning Team Group Stage Points | Losing Team Group Stage Points | Losing Team | Remaining Stage |
---|---|---|---|---|
Belgium | 9 | 7 | France | Semi-Finals |
Croatia | 9 | 6 | England | Semi-Finals |
France | 7 | 6 | England | Third Place |
Belgium | 9 | 9 | Croatia | Final |
In the Semi-Finals, Belgium and Croatia have more Group Stage points than their rivals, so the model gives them a higher probability of winning against France and England, respectively. If so, Third Place will be taken by France, which has more Group Stage points than England. Finally, the Belgium-Croatia match would be a toss-up in the Final game of the World Cup!
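The selection rule behind these predictions is simple enough to write down. Below is a minimal sketch, assuming a hypothetical `predict_winner` helper that returns `None` for a toss-up; the Group Stage points are the ones listed in the table above.

```python
def predict_winner(team_a, team_b, group_points):
    """Pick the team with more Group Stage points; return None on equal points."""
    if group_points[team_a] > group_points[team_b]:
        return team_a
    if group_points[team_b] > group_points[team_a]:
        return team_b
    return None  # same points: the model treats this as a toss-up

# Group Stage points of the four 2018 semi-finalists.
group_points = {"Belgium": 9, "France": 7, "Croatia": 9, "England": 6}

semi_1 = predict_winner("Belgium", "France", group_points)
semi_2 = predict_winner("Croatia", "England", group_points)

# The predicted Semi-Final losers meet for Third Place, the winners in the Final.
third_place = predict_winner("France", "England", group_points)
final = predict_winner(semi_1, semi_2, group_points)

print(semi_1, semi_2, third_place, final)  # Belgium Croatia France None
```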
CONCLUSION
In this post, I explored the data from the previous eight World Cups, and determined that at every stage of the bracket phase, 50% or more of the winning teams had more Group Stage points than their rivals. A potential limitation of this approach is that the data set is relatively small; for example, I am only analyzing eight Final matches. If this strategy had been used in the Round of 16 and Quarter-Finals of the current World Cup, we would already have done better than randomly guessing!
I'll update the blog in the next few days to reflect the actual outcomes, and update the predictions for the Third Place and Final game based on the outcomes of the Semi-Finals.
UPDATE 07/12/2018
The Semi-Finals are over, and the model correctly predicted Croatia advancing to the Final, but did not predict France's win over Belgium.
Winning Team | Winning Team Group Stage Points | Losing Team Group Stage Points | Losing Team | Model Outcome |
---|---|---|---|---|
France | 7 | 9 | Belgium | Not Predicted |
Croatia | 9 | 6 | England | Predicted |
Below are the updated predictions.
Winning Team | Winning Team Group Stage Points | Losing Team Group Stage Points | Losing Team | Remaining Stage |
---|---|---|---|---|
Belgium | 9 | 6 | England | Third Place |
Croatia | 9 | 7 | France | Final |