Residual Win Percentage and the Johnson Effect in Scottish Football
By Seth Dobson (@fitbametrics)
In my previous article, I introduced expected win percentage (xWP) as a way of ranking Scottish Premiership teams using basic information about shots, shots on target, and home/away form as inputs to a Random Forests machine learning algorithm. I showed that xWP is a better predictor of future win percentage (season over season) than other, more traditional metrics, including win percentage itself.
In this article, I will take xWP a step further by introducing the concept of residual win percentage or rWP, which is the difference between actual win percentage and xWP, expressed in percentage points (pp).
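As a minimal sketch of that definition, rWP could be computed as below. The function name and the example figures are illustrative assumptions, not output from my model.

```python
def residual_win_pct(wins: int, matches: int, xwp: float) -> float:
    """Return rWP in percentage points: actual win % minus expected win %.

    `xwp` is the expected win percentage on a 0-100 scale.
    """
    if matches <= 0:
        raise ValueError("matches must be positive")
    actual_wp = 100.0 * wins / matches
    return actual_wp - xwp


# Hypothetical team: 19 wins from 38 matches with an xWP of 42.0%.
rwp = residual_win_pct(wins=19, matches=38, xwp=42.0)
print(f"rWP = {rwp:+.1f} pp")
```

A positive value means the team won more often than the model expected; a negative value means it won less often.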
I will show that rWP is a potentially useful metric for understanding season over season changes in win percentage. I end the article by taking a look at the implications of rWP for the 2018/19 Scottish Premiership season.
RESIDUAL WINS AND THE JOHNSON EFFECT
One of the main applications of residual wins is to estimate the likelihood that a team will regress or bounce back in the subsequent season.
For example, Jim Albert, author of Analyzing Baseball Data with R, writes that 85% of residual win totals fall between -5 and 5 in Major League Baseball. In other words, it is rare for actual wins to deviate from expected wins by more than 5 games (note, the current MLB season is 162 games).
However, in 2016, the Texas Rangers won 13 more games than expected with a win percentage of 58.6%. The following season they won just 78 games, 17 fewer than the previous season, and their win percentage dropped to 48.1%. This is an example of the Johnson Effect (named after baseball journalist Bryan Johnson).
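The arithmetic behind the Rangers example can be checked with a few lines of Python. This is only a sketch using the figures quoted above; the variable names are my own, and converting the 13-game residual to percentage points is an added step to make it comparable with rWP.

```python
MLB_GAMES = 162  # regular-season games per team

wins_2017 = 78
wins_2016 = wins_2017 + 17            # "17 fewer than the previous season"
expected_wins_2016 = wins_2016 - 13   # "13 more games than expected"

wp_2016 = 100 * wins_2016 / MLB_GAMES
wp_2017 = 100 * wins_2017 / MLB_GAMES
residual_pp = 100 * (wins_2016 - expected_wins_2016) / MLB_GAMES

print(f"2016: {wins_2016} wins, win % = {wp_2016:.1f}")
print(f"2017: {wins_2017} wins, win % = {wp_2017:.1f}")
print(f"2016 residual: {residual_pp:.1f} pp")
```

Expressed this way, a 13-game residual over a 162-game season is roughly 8 pp.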
According to sabermetrics pioneer Bill James:
“The Johnson effect states that when a team wins more games than it could be expected to win in view of the number of runs scored and runs allowed, that team will tend to decline in the following season. When a team wins significantly fewer games than could be expected in view of its runs scored and runs allowed, that team will tend to improve in the following season.”
This tendency, which is a special case of regression toward the mean, has also been observed in other sports. For example, Brett Leiblich at Football Outsiders showed that there is a strong correlation between residual wins in the current season and change in win percentage next season in the NFL. Teams that get better results than expected in the current season tend to win fewer games in the subsequent season; the reverse is true for under-performing teams.
To drive this point home, here’s basketball analytics guru Dean Oliver describing the Johnson Effect in the NBA:
“The '85-86 Clippers won 32 games, while their point totals led to an expectation of only 21 wins. The '86-87 Clippers came back to reality, going through a pitiful 12-70 season in a daze. This sort of collapse can be seen throughout the history of basketball…”
In the subsequent section, I will examine whether the Johnson Effect exists in Scottish football as well.
THE JOHNSON EFFECT IN SCOTLAND
First, let’s take a look at how rWP, or the difference between actual and expected win percentage, varies within and between seasons using data from the last five Scottish Premiership seasons (source: football-data.co.uk).
As you can see in Figure 1, most rWP values cluster around 0 in any given season. It is rare to find a team with rWP outside +/- 5 percentage points (pp).
If we pool all 5 seasons together, we get the following summary statistics for rWP (in pp):
Max: 19.7 (Motherwell 2013/14)
3rd Quartile: 3.0
1st Quartile: -3.9
Min: -19.0 (Inverness CT 2016/17)
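For readers who want to produce a summary like this themselves, here is a rough sketch using Python's standard library. The rWP values are made up for illustration, and the quartile convention (`method="inclusive"`) is an assumption; other conventions give slightly different cut points.

```python
import statistics

# Made-up rWP values in percentage points, for illustration only.
rwp_values = [-8.0, -4.0, -2.0, -1.0, 0.0, 1.0, 3.0, 5.0, 10.0]

# "inclusive" treats the sample minimum and maximum as the 0th and 100th
# percentiles and interpolates linearly between data points.
q1, median, q3 = statistics.quantiles(rwp_values, n=4, method="inclusive")

summary = {
    "Max": max(rwp_values),
    "3rd Quartile": q3,
    "Median": median,
    "1st Quartile": q1,
    "Min": min(rwp_values),
}
for label, value in summary.items():
    print(f"{label}: {value:.1f}")
```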
Next let’s look at the teams with the highest rWP over the period 2013/14 to 2016/17, and see how their win percentage changed in the subsequent season (Table 1).
At the top of the list, Stuart McCall’s 2013/14 Motherwell experienced a particularly sharp decline the following season. After a terrible start to the 2014/15 campaign, McCall left his post in November, and the team ended up in a relegation playoff against...Stuart McCall’s Rangers, which Rangers lost (funny old game).
The team with the second highest rWP over this period was Brendan Rodgers’ invincible 2016/17 Celtic team, who won nearly 90% of their matches. Much was made of Celtic’s subsequent decline in 2017/18; they only won 63% of their matches (still excellent, but a marked regression nonetheless). Celtic ended up winning the treble in 2017/18 anyway of course, since they were still the best team in Scotland by far.
To further illustrate the perils of high rWP, consider Dundee United 2014/15. They are 3rd on the list in Table 1 with the third largest drop in subsequent win percentage. Jackie McNamara’s men started the 2014/15 season on fire, scoring goals at an unsustainable rate, but faded toward the end. Next season, McNamara was sacked after a bad start, and United were relegated to the Scottish Championship, where they remain today. McNamara is currently without a club.
Overall, Table 1 shows that 70% of the teams with the highest rWP won fewer games in the subsequent season, with an average change in win percentage of -12.1 pp. This is consistent with the Johnson Effect observed in other sports.
Now let’s look at the teams with lowest rWP over the period 2013/14 to 2016/17, and see how their win percentage changed in the subsequent season (Table 2).
As you can see, the Johnson Effect is less apparent in this group of teams, as only 60% of the teams with the lowest rWP bounced back to win more games in the subsequent season. The average change in win percentage is 4.5 percentage points.
There is undoubtedly a fair amount of selection bias in teams with low rWP. Since teams with low rWP will often get relegated, we do not have the opportunity to examine subsequent changes in win percentage in the Premiership for these teams.
Of note in Table 2 is Heart of Midlothian 2016/17. Ian Cathro’s Hearts finished with a win percentage 13.0 pp lower than expected based on my expected wins model. Interestingly, their win percentage did not bounce back at all under Craig Levein in 2017/18. Make of that what you will.
Lastly, let’s look at the overall relationship between rWP and change in win percentage in the full dataset (N = 43). If the Johnson Effect applies to the Scottish Premiership, we should expect to see a negative relationship between the two variables.
As you can see in Figure 2, there is a statistically significant negative correlation between rWP and change in subsequent win percentage in the Scottish Premiership over the last 5 seasons. The correlation coefficient is a respectable -0.66. The coefficient of determination indicates that 44% of the variance in win percentage change is explained by rWP in the previous season.
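To make the link between the two statistics concrete, here is a small sketch of how a Pearson correlation and its coefficient of determination can be computed. The `pearson_r` helper and the five data points are illustrative assumptions, not the dataset behind Figure 2; only the -0.66 figure comes from the analysis above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up pairs: rWP this season (pp) and change in win % next season (pp).
rwp = [10.0, 5.0, 0.0, -5.0, -10.0]
change = [-12.0, -2.0, -3.0, 6.0, 4.0]

r = pearson_r(rwp, change)
r_squared = r ** 2  # share of variance explained in a simple linear fit
print(f"r = {r:.2f}, r^2 = {r_squared:.2f}")

# The reported coefficient squares to the reported 44%.
print(f"(-0.66)^2 = {(-0.66) ** 2:.2f}")
```

Squaring the correlation coefficient is what turns -0.66 into the 44% of variance explained.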
While many factors can influence changes in win percentage from one season to the next, there appears to be a statistical tendency for high-rWP teams to regress the following season, and low-rWP teams to bounce back (assuming they don’t get relegated). In other words, the Johnson Effect appears to be a thing in Scottish football.
IMPLICATIONS FOR THE 2018/19 SEASON
Finally, let’s look at rWP in the Scottish Premiership last season (2017/18) and talk about what it implies about the upcoming season.
Figure 3 plots rWP vs xWP for each team. This is a nice way of looking at which teams under-performed and which teams over-performed over the course of the season. The horizontal dashed lines indicate rWP of +/- 5 pp. Teams that fall above or below the dashed lines would be the most likely to experience a marked change in win percentage next season.
At the high end, the two most likely candidates for regression in 2018/19 are Hibs and Rangers. Hibs’ rWP of 9.1 pp last season was the 10th highest in the Premiership since 2013/14. Their position in the chart does not bode well for Neil Lennon’s men, as it suggests they might even struggle to remain in the top 6, depending on how much they regress. Indeed, with just 5 points from 4 matches so far this season, Hibs already appear to be conforming to the Johnson Effect. Similarly, Rangers’ rWP of 6.4 pp last season was the 14th highest since 2013/14. They are also showing signs of regression so far this season, having recorded their worst league start in 29 years under new manager Steven Gerrard.
At the other extreme, the two teams with the lowest rWP last season were Celtic and Ross County. As such, both would be expected to win a higher percentage of their matches in 2018/19. Unfortunately, Ross County were relegated, and so will not get the opportunity to benefit from the Johnson Effect in the Premiership this season (it is worth noting, however, that Ross County are currently top of the Championship, having won 3 of their first 4 matches). Celtic, on the other hand, are well positioned to run away with the league title once again. Their rWP of -10.9 pp last season was the 9th lowest since 2013/14. Therefore, despite all the transfer window and Champions League drama at the start of the season, we can expect Celtic to bounce back and win more matches this season than last season on their way to 8 in a row.
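The +/- 5 pp screening rule used in this section can be sketched in a few lines. The function name and labels are my own shorthand, and "Example FC" is a made-up team included only to show the middle band; the other three rWP values are the ones quoted above.

```python
THRESHOLD_PP = 5.0  # the +/- 5 pp band marked by the dashed lines in Figure 3


def johnson_signal(rwp: float, threshold: float = THRESHOLD_PP) -> str:
    """Classify a team by which side of the +/- threshold band its rWP falls on."""
    if rwp > threshold:
        return "candidate to regress"
    if rwp < -threshold:
        return "candidate to bounce back"
    return "no strong signal"


# 2017/18 rWP values quoted in the text; "Example FC" is hypothetical.
rwp_2017_18 = {
    "Hibs": 9.1,
    "Rangers": 6.4,
    "Celtic": -10.9,
    "Example FC": 1.2,
}

for team, rwp in sorted(rwp_2017_18.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team:<12}{rwp:+6.1f} pp  {johnson_signal(rwp)}")
```

The threshold is a rule of thumb rather than a hard cutoff, so teams just inside the band may still see meaningful changes.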
The Johnson Effect is the statistical tendency for teams to regress (up or down) toward their expected win percentage from one season to the next. This general pattern was first observed in MLB, and later in the NFL, NBA, and now SPFL, as demonstrated in this article.
While the Johnson Effect is not perfectly predictive of future changes in win percentage, there is enough of a correlation for teams to take note of rWP. For example, a team could save money on an unwarranted contract extension for a manager if they knew that the team’s rWP was excessively high, reflecting unsustainable aspects of performance. Likewise, it might be prudent to hold off on sacking a manager with an rWP < -5 pp, especially if the team’s xWP is reasonably good.
As always, thanks for reading!