Opportunity Cost of Expected Goals

While studying economics, it’s almost impossible to get through a lecture without a mention of opportunity cost. This is unsurprising: the concept describes the next best alternative that is forfeited when a decision is made. With economics being the study of scarcity and decision making, it is clear why opportunity cost sits at the foundation of the subject.

As in economics, in football there is always another option when the ball is at your feet. Some choices are clearly more beneficial than others (Choupo-Moting clearing his own team’s shot off the line against Strasbourg comes to mind as a less favourable option), but the better choices are not always taken, and that hurts the team’s chances of winning. It is commonplace to hear commentators express their disbelief when a player takes a shot instead of playing a final pass, and some of these situations stick in the memories of fans for years afterwards. With statistics such as expected goals (xG) increasingly used to analyse the quality of a chance, it raises the question of whether we should look a step further, at the options the player gave up to take the shot in question.

This forms the basis of this article: would the quality of the chance (measured by its xG value) have been improved if the player had passed the ball? I set out to answer this using Statsbomb’s free data from Euro 2024. The key pieces of information this data provides are a snapshot of player locations at the moment each shot is taken, the expected goals value of each shot, and a good deal of surrounding information about the chance.

I took a five-step approach to finding the next best alternative for each chance. The pictures provided as examples are from what the model suggested was Kylian Mbappe’s worst shot by passes forfeited.

Key assumptions

This model is a simple first dive into the question. In the future I hope it can be improved by removing or limiting some of these assumptions, although additional data may be required to do so. The primary assumption is that teammates are stationary throughout the process of the shot. This allows each scenario to be seen as the player on the ball saw it when about to shoot, while also keeping the options from blossoming into infinitely many combinations. The second key assumption concerns the movement of defenders: they all move at a constant speed, perpendicular to the line the new shot would take towards the center of the goal. This assumption lends itself to future relaxation. Play-by-play data in the NFL, for example, contains information about current movement speed, acceleration and body angle, all of which would improve predictions of player movements. Another improvement would be using predictive modelling to see how players have moved in similar scenarios where passes were made, but here, for simplicity, they move as if the teammates of the ball-holder don’t exist. Finally, all passes are played along the floor. Not only does this theoretically maximise the xG of the resulting shot, it also keeps the model in a 2D plane for the calculations that need to be done.

Which passes are possible

To know which positions are better to shoot from, you first need to know which positions the ball can feasibly reach. With the prior stipulation that teammates have their feet glued to the floor, the question becomes one of simple geometry.

Here is an image of a shot Mbappe took against Portugal. His position is represented by a green dot, with teammates in blue and opponents in red. Straight lines are drawn from Mbappe to each of his teammates, coloured red if the pass is predicted to be intercepted and black if it is not. The additional red lines run from each opponent who was predicted to reach their crossover point with the pass line before the ball did, giving them ample opportunity to intercept. Although very simple, this geometric approach still provides some estimate of whether the ball would reach its intended target if the ball carrier chose to offload it.
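As a rough illustration of this interception check, here is a minimal Python sketch. It is not the project's actual code: the function names, the ball speed and the defender speed are all assumptions of mine, and positions are simply (x, y) pairs in pitch units. Each defender runs straight to the nearest point on the pass line (the crossover point), and the pass counts as intercepted if any defender gets there before the ball.

```python
import math

# Both speeds are illustrative assumptions, in pitch units per second.
BALL_SPEED = 15.0
DEFENDER_SPEED = 5.0


def crossover_point(passer, receiver, defender):
    """Return the point on the pass segment closest to the defender."""
    px, py = passer
    rx, ry = receiver
    dx, dy = rx - px, ry - py
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return passer
    # Projection parameter, clamped so the point stays on the segment.
    t = ((defender[0] - px) * dx + (defender[1] - py) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (px + t * dx, py + t * dy)


def is_intercepted(passer, receiver, defenders):
    """True if any defender reaches their crossover point before the ball."""
    for defender in defenders:
        cx, cy = crossover_point(passer, receiver, defender)
        defender_time = math.hypot(cx - defender[0], cy - defender[1]) / DEFENDER_SPEED
        ball_time = math.hypot(cx - passer[0], cy - passer[1]) / BALL_SPEED
        if defender_time < ball_time:
            return True
    return False
```

With these assumed speeds, a defender standing one unit off the middle of a ten-unit pass cuts it out, while one standing ten units away does not. The result is sensitive to the two speed constants, so they would need tuning against real passes.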

New defender positioning

For each option judged to be a completed pass, the task becomes plotting the movement of the defenders during the time that elapses between Mbappe making the pass and the new player taking the shot.

As previously stated, the defenders are coded to move perpendicularly to the line of the new shot (that being towards the center of the goal). The distance they move is coded in two different ways depending on their distance to the shot line. If they are predicted to be able to reach the line in the time between the ball carrier passing and the new shooter shooting, they stop at the crossover point and that becomes their new position, whereas if they cannot reach it they move as far towards it as possible. Even though this isn’t necessarily the optimal place to be, it seemed counterproductive to me to have defenders carrying on past the line and possibly ending up in an even less fruitful area.
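The movement rule can be sketched in a few lines of Python. Again, this is only an illustration under my own assumptions: the goal center at (120, 40) follows the 120 by 80 pitch coordinates used in Statsbomb data, while the defender speed and the function name are hypothetical. The defender heads for the foot of the perpendicular on the shot line and either stops there or gets as close as the elapsed time allows.

```python
import math

GOAL_CENTRE = (120.0, 40.0)  # assumed 120 x 80 pitch coordinates
DEFENDER_SPEED = 5.0         # assumed constant speed, pitch units per second


def move_defender(defender, shooter, elapsed, goal=GOAL_CENTRE, speed=DEFENDER_SPEED):
    """New defender position after `elapsed` seconds of moving towards the shot line."""
    sx, sy = shooter
    dx, dy = goal[0] - sx, goal[1] - sy
    length = math.hypot(dx, dy)
    if length == 0:
        return defender
    ux, uy = dx / length, dy / length
    # Foot of the perpendicular from the defender onto the shot line.
    t = (defender[0] - sx) * ux + (defender[1] - sy) * uy
    foot = (sx + t * ux, sy + t * uy)
    gap = math.hypot(foot[0] - defender[0], foot[1] - defender[1])
    reach = speed * elapsed
    if gap <= reach:
        # The defender can get to the line in time, so they stop on it.
        return foot
    # Otherwise they cover as much of the gap as the time allows.
    frac = reach / gap
    return (defender[0] + frac * (foot[0] - defender[0]),
            defender[1] + frac * (foot[1] - defender[1]))
```

For a shooter at (100, 40) and one second of elapsed time, a defender at (110, 43) ends up on the line at (110, 40), while a defender at (110, 50) only gets halfway, to (110, 45).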

New Position Expected Goals

At this point we have a data frame containing each possible alternative shot, along with the predicted positions of the defenders when that new shot would take place. The next stage is predicting the xG value for each new shot. I did this by fitting a logistic regression on all the shots in the Euro 2024 data. This is a relatively small sample of 1300 shots (excluding penalties), which allowed me to use the boolean values provided by Statsbomb while also adding the distance and angle of the shot to the goal, as well as the distance and angle to the shot of the five defenders nearest the ball.
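To make the geometric features concrete, here is a hedged Python sketch of how they might be computed before being fed to a logistic regression. The exact definitions used in the project are not stated, so these are assumptions: the goal center sits at (120, 40) with posts eight units apart, the shot angle is the angle the goal mouth subtends from the shot location, and a defender's angle is measured between the shot line and the line from the shooter to that defender. All function names are hypothetical.

```python
import math

GOAL_CENTRE = (120.0, 40.0)  # assumed 120 x 80 pitch coordinates
GOAL_WIDTH = 8.0             # assumed distance between the posts


def distance_to_goal(x, y):
    """Straight-line distance from the shot location to the goal center."""
    return math.hypot(GOAL_CENTRE[0] - x, GOAL_CENTRE[1] - y)


def goal_angle(x, y):
    """Angle in radians subtended by the goal mouth from the shot location."""
    dx = GOAL_CENTRE[0] - x
    dy = y - GOAL_CENTRE[1]
    half = GOAL_WIDTH / 2
    return math.atan2(GOAL_WIDTH * dx, dx * dx + dy * dy - half * half)


def defender_features(shooter, defenders, k=5):
    """Distance and absolute angle to the shot line for the k nearest defenders."""
    sx, sy = shooter
    shot_dir = math.atan2(GOAL_CENTRE[1] - sy, GOAL_CENTRE[0] - sx)
    nearest = sorted(defenders, key=lambda d: math.hypot(d[0] - sx, d[1] - sy))[:k]
    features = []
    for ox, oy in nearest:
        dist = math.hypot(ox - sx, oy - sy)
        angle = math.atan2(oy - sy, ox - sx) - shot_dir
        # Wrap into [-pi, pi) so defenders either side of the line compare equally.
        angle = (angle + math.pi) % (2 * math.pi) - math.pi
        features.extend([dist, abs(angle)])
    # Note: shots with fewer than k defenders return a shorter list and
    # would need padding before being used in a regression.
    return features
```

These values would sit alongside the boolean fields from the data as columns of the regression's design matrix.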

This is the part of the process that could be upgraded with the least hassle. With a model bearing greater similarity to Statsbomb’s, the values generated for the new shots would be much more precise. To relate my values to the xG provided, I took the percentage difference between the predicted xG values for the original shot and the alternative shot, then applied this percentage change to the xG provided by Statsbomb to return a value scaled similarly to an actual xG value.
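The rescaling step is just a ratio, which a short Python sketch makes explicit. The function and argument names are mine, and the numbers in the usage note are made up purely for illustration.

```python
def scale_to_provider_xg(provider_xg, model_xg_original, model_xg_alternative):
    """Apply the model's percentage change in xG to the provider's xG value."""
    pct_change = (model_xg_alternative - model_xg_original) / model_xg_original
    return provider_xg * (1.0 + pct_change)
```

For example, if the regression gives 0.05 for the original shot and 0.10 for the alternative (a 100% increase), a provided xG of 0.08 becomes 0.16. Note that this sketch does not cap the result at 1, so a very large percentage change applied to an already high xG could in principle produce an impossible value.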

Finding out who wasted opportunities

Running the code in a loop over all 1300 non-penalty shots in Euro 2024 gave me an alternative xG value for every shot. If a better shot was deemed to be available, the difference between its value and the original value was counted as wasted xG.
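The bookkeeping for wasted xG can be sketched as follows. This is a simplified stand-in for the real loop, assuming each shot has already been reduced to a player name, the original xG and a list of scaled xG values for the feasible alternatives. The names and the tuple layout are hypothetical.

```python
from collections import defaultdict


def wasted_xg(original_xg, alternative_xgs):
    """xG given up by shooting: best alternative minus original, never below zero."""
    best = max(alternative_xgs, default=original_xg)
    return max(best - original_xg, 0.0)


def summarise(shots, min_shots=5):
    """Map each player to (total wasted xG, wasted xG per shot).

    `shots` is an iterable of (player, original_xg, alternative_xgs) tuples.
    Players with fewer than `min_shots` shots are left out.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for player, original_xg, alternative_xgs in shots:
        totals[player] += wasted_xg(original_xg, alternative_xgs)
        counts[player] += 1
    return {
        player: (totals[player], totals[player] / counts[player])
        for player in totals
        if counts[player] >= min_shots
    }
```

A shot with no feasible pass, or one where every alternative is worse, contributes zero, so the totals only ever count chances where a better option was deemed to exist.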

In this particular shot by Mbappe (his most wasteful of the tournament), it was estimated that a central pass to Ousmane Dembele would have raised the xG value from 0.0661 to 0.2083.

Who were the main culprits of wasting their teammates’ talents then?

Of players with five or more shots, Mbappe and Ronaldo were neck and neck for the most cumulative xG wasted across the tournament, at around 0.49 xG each, but neither was too egregious on a per-shot basis, wasting only 0.02 per shot.

On a per-shot basis, Pedri (Pedro González López) was deemed the most wasteful player. The greater prominence of midfielders in the top ten on a per-shot basis makes me think they take a greater share of longer-range efforts, which xG as a model doesn’t like very much. This alternative xG model therefore sees any simple pass that reduces the distance to goal and instantly regards it as a better option.

Could this actually be a useful tool going forward?

Players don’t see the world in ones and zeros like computers do, and it is very unlikely that you will ever ask one why they made a pass and hear them mention the increased xG that resulted from it, but this does have potential to aid player evaluation and recruitment. Similarly to how xThreat looks at how players’ actions improve the chances of a goal, alternative xG, if or when it is improved, should allow players’ decision making under pressure to be graded. A better model for predicting defender movement also has potential to help coaches and scouting teams prepare tactics more effectively for the common situations they face. Whether some of these models are currently limited by data availability, I don’t know.

The simple next stages are improving the xG model used here and relaxing some assumptions such as static teammates, which would allow possible teammate movements that further maximise xG to be calculated. However, for now I think the simple model provides an interesting look at the sort of insights that could be drawn about how players think and whether players can be coached to improve in this area. All the code from this project will hopefully be uploaded to https://github.com/jnbrooker/oppocosteuros and any questions or discussion points can be emailed to proceduresport@gmail.com
