A Guide To Soccer’s Evolving Data Dictionary

In sports, games are won by the players on the field and the coaches on the sidelines. That was the mentality that pervaded the world of athletics for decades. However, if you talk to people within teams or to members of the sports media, you’ll notice that this idea is changing. While players are still the only ones capable of making a tackle or catching a ball, and coaches still have the responsibility of picking the starting lineup and calling the plays, another factor has reshaped the landscape of contemporary athletics. Data and statistics have become an integral part of the way that teams formulate their playstyle, choose which players to sign, figure out why they’re winning or losing, and so much more.

Certain sports, like baseball and basketball, are inherently more data-driven. Baseball, “America’s Pastime,” was one of the first adopters of statistics to determine the effectiveness of players. It introduced stats like batting average back in the 19th century and then evolved to incorporate figures like WAR and OPS. Similarly, basketball relied on measurements like field goal percentage before eventually valuing other statistics like +/- and true shooting. These sports lend themselves to data much more easily than others because of the fixed nature of the outcomes involved and the sample sizes that can be drawn. In baseball, there are only a couple of things that can happen when a pitch is thrown, and in basketball there are only two things (aside from a block) that can happen when a shot is put up. The problem for many years, however, was finding ways to quantitatively analyze sports that have greater variance and fewer games.

Thankfully, due to advancements in technology and the general acceptance of the role of statistics in sports, data has now solidified itself as a major component of almost every sport. One of the more recent adopters has been the game of soccer. “The Beautiful Game” is uniquely hard to analyze, both in terms of statistics and through the all-important “eye test.” When the English Premier League started in 1992/93, only basic stats were officially recorded. Figures like goals, appearances, and cards were essentially the only numbers of value. Now, the amount of data involved in soccer rivals that of many of the other biggest sports in the world. Despite the prevalence of statistical figures across many of the top leagues, discerning the meanings of unfamiliar acronyms or computer-generated graphs can be difficult. Luckily, PSF is here to help. Below is an explanation of some of the most important figures used in contemporary soccer.

Expected Goals

Expected goals (xG) is by far the most common advanced statistic used in soccer. Calculated by a predictive model that factors in variables like shot distance, angle, type of shot, and match scenario (set piece, open play, etc.), xG estimates the chance that a given shot goes into the net. An xG value ranges from zero to one, with a greater number signaling a higher probability that the shot becomes a goal. Expected goals can help signal to players, coaches, or fans whether a missed shot was really a chance that should’ve been capitalized on or if it just appeared that way. If a player is outperforming their xG (scoring more goals than their shots’ combined xG), it could raise alarms that their goal return isn’t sustainable. On the other end of the spectrum, if a player is underperforming their xG, it might be a sign that they need to work on their finishing.
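
To make the over- and underperformance idea concrete, here is a minimal Python sketch that adds up the xG of a player's shots and compares the total with the goals actually scored. The function name and the shot values are made up for illustration and are not taken from any real model or player.

```python
def xg_summary(shots):
    """Compare goals scored with total xG.

    shots: list of (xg, scored) pairs, where xg is a value between 0 and 1
    and scored is True if the shot went in. All values here are hypothetical.
    """
    total_xg = sum(xg for xg, _ in shots)
    goals = sum(1 for _, scored in shots if scored)
    return {
        "goals": goals,
        "xg": round(total_xg, 2),
        # Positive: outperforming xG. Negative: underperforming xG.
        "difference": round(goals - total_xg, 2),
    }


# Five made-up shots: a near tap-in, three half-chances, and a long-range effort.
example_shots = [(0.76, True), (0.05, False), (0.12, True), (0.31, False), (0.08, False)]
summary = xg_summary(example_shots)
print(summary)
```

A positive difference over a small number of shots says very little on its own; the signal becomes more meaningful across a full season of shots.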

A list of the English Premier League’s best performers in terms of xG in the 2022-23 season. Credit: premierleague.com

Expected Assists

Expected assists (xA) is a close cousin of expected goals. xA is based on a predictive model that estimates, for every completed pass in a match, the probability that the pass becomes an assist. The model incorporates variables like type of pass, location on the pitch, phase of play, and distance covered to produce a numerical value. Each pass is assigned an xA, even if the ensuing touch does not result in a shot. Expected assisted goals is closely related but differs slightly: while xA is calculated for every pass, expected assisted goals is only calculated when a shot is taken after the pass. (FBRef abbreviates this figure as xAG; the abbreviation xGA more commonly denotes expected goals against.) Expected assisted goals is generally the more useful statistic when judging the effectiveness of a player or team.
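
The difference between the per-pass figure and the shot-dependent figure is easiest to see side by side. The Python sketch below is illustrative only: the `Pass` class, its field names, and the numbers are hypothetical, and it assumes the shot-dependent figure is the xG of the shot that follows the pass.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pass:
    xa: float                 # hypothetical model output: chance this pass becomes an assist
    shot_xg: Optional[float]  # xG of the shot that followed, or None if no shot was taken


def expected_assists(passes):
    """Every completed pass contributes, whether or not a shot followed."""
    return sum(p.xa for p in passes)


def expected_assisted_goals(passes):
    """Only passes followed by a shot contribute (assumption: via that shot's xG)."""
    return sum(p.shot_xg for p in passes if p.shot_xg is not None)


# Four made-up passes; only the second and fourth led to a shot.
passes = [Pass(0.02, None), Pass(0.15, 0.30), Pass(0.01, None), Pass(0.08, 0.05)]
print(round(expected_assists(passes), 2))
print(round(expected_assisted_goals(passes), 2))
```

In this toy data the player gets credit under the per-pass figure for all four passes, while the shot-dependent figure ignores the two passes a teammate failed to turn into a shot.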

Shot Creating Actions

Shot Creating Actions (SCA) tracks the two actions that lead directly to a shot. Several different plays can count as an SCA, including live passes (passes made in open play), dead ball passes, dribbles, other shots (which can lead to rebounds), drawn fouls, and even defensive actions (a tackle or clearance). Goal Creating Actions (GCA) works the same way as SCA, except that it only tracks actions that lead directly to a goal.
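
As a rough illustration of the counting rule, the Python sketch below credits the two actions immediately before each shot in an ordered list of events. The event format and player labels are hypothetical, and the sketch ignores details a real data provider would handle, such as changes of possession between events.

```python
from collections import Counter


def shot_creating_actions(events):
    """Count SCAs per player.

    events: ordered list of (player, action) tuples for one team, where the
    action "shot" marks a shot. This format is an assumption for illustration.
    """
    credits = Counter()
    for i, event in enumerate(events):
        if event[1] != "shot":
            continue
        # Credit the (up to) two actions directly before the shot.
        for player, _action in events[max(0, i - 2):i]:
            credits[player] += 1
    return credits


# Made-up sequence: a pass, a dribble, a saved shot, then a rebound shot.
events = [
    ("Player A", "live pass"),
    ("Player B", "dribble"),
    ("Player C", "shot"),
    ("Player D", "shot"),
]
sca = shot_creating_actions(events)
print(sca)
```

Note that Player C's saved shot counts as a shot-creating action for Player D's rebound, which matches the "other shots" category described above.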

A table of the SCA and GCA of each LaLiga team in the 2023-24 season. Credit: fbref.com (FBRef)

Progressive Passes

One of the hardest things to analyze quantitatively in soccer is passing. Because so many passes are played in a match, it can be hard to determine what counts as a “good” or “bad” pass. Progressive passes attempt to add some clarity to this issue. A progressive pass is any ball successfully played from one teammate to another that advances the team’s position on the field. While different models use slightly different calculations, a common rule is that any pass that moves the ball at least 10 yards closer to the opponent’s goal is progressive. Wyscout also factors in which half of the field the pass is played in. Under its model, a pass is considered progressive if the next touch is at least 30 meters closer to the opponent’s goal than the starting point when the starting and finishing points are both within a team’s own half, at least 15 meters closer when they are in different halves, and at least 10 meters closer when they are both in the opponent’s half.
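
The Wyscout thresholds translate naturally into a small function. The Python sketch below is one possible reading of the rule, not Wyscout's own code: it assumes a 105 x 68 meter pitch, a team attacking toward x = 105, and that "closer to the opponent's goal" is measured to the center of the goal. The function and constant names are hypothetical.

```python
import math

PITCH_LENGTH_M = 105.0  # assumed pitch length in meters
PITCH_WIDTH_M = 68.0    # assumed pitch width in meters
GOAL_CENTER = (PITCH_LENGTH_M, PITCH_WIDTH_M / 2)  # opponent's goal


def distance_to_goal(point):
    """Straight-line distance in meters from (x, y) to the opponent's goal."""
    return math.dist(point, GOAL_CENTER)


def is_progressive_pass(start, end):
    """Apply the 30 / 15 / 10 meter thresholds described above."""
    gained = distance_to_goal(start) - distance_to_goal(end)
    halfway = PITCH_LENGTH_M / 2
    start_in_own_half = start[0] < halfway
    end_in_own_half = end[0] < halfway

    if start_in_own_half and end_in_own_half:
        threshold = 30.0  # both points in the team's own half
    elif start_in_own_half != end_in_own_half:
        threshold = 15.0  # the pass crosses the halfway line
    else:
        threshold = 10.0  # both points in the opponent's half
    return gained >= threshold


print(is_progressive_pass((10, 34), (45, 34)))  # 35 m gained inside own half
print(is_progressive_pass((70, 34), (78, 34)))  # 8 m gained in the opponent's half
```

The sliding threshold reflects the idea that ground is cheaper to gain near a team's own goal, so a pass there has to travel further before it counts as progressive.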

Dribbles/Successful Take-on

These two statistics are largely used interchangeably and quantify essentially the same thing. A dribble/successful take-on is recorded when any player moves past a defender while maintaining possession of the ball. 

Progressive Carry

A progressive carry, much like a progressive pass, attempts to determine how “attacking” a specific action is. In this case, a progressive carry is any “carry” (a player moving the ball with their feet) that advances the ball up the field toward the opponent’s goal line. Once again, different models vary slightly on the distance that must be covered for a carry to qualify as “progressive.” FBRef maintains that a carry becomes progressive if it moves the ball at least “10 yards from its furthest point in the last six passes,” or if the carry enters the opponent’s penalty area.
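
The FBRef rule can be sketched in Python along the same lines. This is an interpretation rather than FBRef's actual implementation: it assumes a 115 x 74 yard pitch with the team attacking toward x = 115, treats the "furthest point" as the largest x value reached in the last six passes (or the carry's own starting point), and uses the standard 18 x 44 yard penalty area. All names are hypothetical.

```python
PITCH_LENGTH_YD = 115.0  # assumed pitch length in yards
PITCH_WIDTH_YD = 74.0    # assumed pitch width in yards
BOX_DEPTH_YD = 18.0      # penalty area depth
BOX_WIDTH_YD = 44.0      # penalty area width


def in_opponent_box(point):
    """True if (x, y) lies inside the opponent's penalty area."""
    x, y = point
    y_min = (PITCH_WIDTH_YD - BOX_WIDTH_YD) / 2
    return x >= PITCH_LENGTH_YD - BOX_DEPTH_YD and y_min <= y <= y_min + BOX_WIDTH_YD


def is_progressive_carry(carry_start, carry_end, recent_pass_points):
    """recent_pass_points: (x, y) points the ball reached in the team's recent passes."""
    last_six = recent_pass_points[-6:]
    # Furthest point toward the opponent's goal line before the carry ends.
    furthest_x = max([carry_start[0]] + [p[0] for p in last_six])
    moved_far_enough = carry_end[0] - furthest_x >= 10.0
    entered_box = in_opponent_box(carry_end) and not in_opponent_box(carry_start)
    return moved_far_enough or entered_box


print(is_progressive_carry((50, 37), (62, 37), [(40, 30), (50, 37)]))  # 12 yards gained
print(is_progressive_carry((92, 37), (99, 37), [(92, 37)]))            # short, but enters the box
```

Measuring from the furthest recent point, rather than from where the carry starts, keeps a player from earning credit for simply regaining ground the team had already covered.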
