Last week I watched D.C. United’s frustrating match against the Red Bulls. Early on DC had many excellent shots but couldn’t get a goal; then NYRB got two late goals to win. I couldn’t get this week’s game against FC Dallas on TV, but word on the twitternets was that it was a similar story. I wondered if DC’s offense could really be as dismal as it seemed, or if they had a run of bad luck with good shots.
The good news: no, DCU probably aren’t as bad as they seem. The bad news: they probably are pretty bad.
Here’s why. DC have converted a dismal 14% of shots on goal this season, by far the worst in the league. See here:
(All data are from the official MLS site, and I use per-game stats everywhere to account for the changing season lengths.)
The line represents the long-term trend of 26.2% of shots on goal going in the net. You can see that DC are far below the trend. Can this low rate be sustained? I speculated that shots on goal are a better measure of a team’s offense than actual goals scored. The Fink Tank has often hinted that shots on goal are important, and DCU’s maddening performance against NYRB prompted me to investigate further.
Of course, goals are a much better measure of, you know, winning the match. But in the long term, better offenses score goals by taking more quality shots. Whether any one shot actually goes in is basically random.
I pulled data on goals and shots on goal from the last five MLS seasons. Take a look at this plot, showing the distribution of shot percentage by team:
Notice anything? Not only does the worst shooting percentage happen this year (yup, that’s DCU), so does the third worst. And the five best! Either this season is seriously wacky, or (more likely) it’s simply the effect of small sample size. For full seasons, shot percentages are pretty tightly clumped in the 20% – 30% range, and we can expect this season to converge as it goes on. Basically, MLS teams put roughly 1 in 4 shots on goal in the net. Contrast the plot of shots on goal per game:
First, the historical distribution is spread out evenly instead of clumped around the middle. Furthermore, the 2010 season does have 3 outliers on the low end, but it looks like a better fit than the shot percentage graph.
We can also look at year-over-year correlations, a trick I copped from Football Outsiders. If shots on goal are a better indicator of a team’s offensive quality, then shots on goal this year should be a better predictor of goals next year than goals this year are. And the data back this up. The correlation of goals to goals next year is 0.165; the correlation of shots on goal to goals next year is 0.307. Correlation is a pretty coarse tool, but this is a fairly solid indicator that shots on goal are a more stable measure of offensive quality.
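For the curious, here is a minimal sketch of how a year-over-year correlation like that could be computed. The team names and per-game numbers are made up purely for illustration (they are not the MLS data), and the helper names `pearson` and `year_over_year_pairs` are my own hypothetical choices.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def year_over_year_pairs(stats, this_year_key, next_year_key):
    """Pair each team-season's stat with the same team's stat one season later.

    `stats` maps (team, season) -> dict of per-game stats.
    Team-seasons with no following season in the data are skipped.
    """
    xs, ys = [], []
    for (team, season), row in stats.items():
        following = stats.get((team, season + 1))
        if following is not None:
            xs.append(row[this_year_key])
            ys.append(following[next_year_key])
    return xs, ys


# Hypothetical per-game numbers, NOT real MLS data.
stats = {
    ("Team A", 2007): {"goals": 1.20, "sog": 4.6},
    ("Team A", 2008): {"goals": 1.45, "sog": 4.9},
    ("Team A", 2009): {"goals": 1.30, "sog": 4.7},
    ("Team B", 2007): {"goals": 1.60, "sog": 4.1},
    ("Team B", 2008): {"goals": 1.05, "sog": 4.0},
    ("Team B", 2009): {"goals": 1.10, "sog": 4.2},
    ("Team C", 2007): {"goals": 0.90, "sog": 3.5},
    ("Team C", 2008): {"goals": 1.00, "sog": 3.8},
    ("Team C", 2009): {"goals": 0.95, "sog": 3.6},
}

goals_x, goals_y = year_over_year_pairs(stats, "goals", "goals")
sog_x, sog_y = year_over_year_pairs(stats, "sog", "goals")
print("goals -> next-year goals:", round(pearson(goals_x, goals_y), 3))
print("sog   -> next-year goals:", round(pearson(sog_x, sog_y), 3))
```

With real data you would load one row per team per season; the comparison of the two printed numbers is the whole trick.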
If we wanted a more rigorous analysis, we could bust out the binomial distribution. And in fact I did so, but the details are longish so I’ll save it for a future post.
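As a rough sketch of what a binomial check could look like (not the full analysis, and with made-up numbers): if every shot on goal goes in independently with the long-term 26.2% probability, how often would a team end up with as few goals as it has? The shot and goal counts below are hypothetical placeholders, and the independence assumption is a simplification.

```python
from math import comb

LONG_TERM_RATE = 0.262  # long-term share of shots on goal that score (from the post)


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_most(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# Hypothetical early-season line, NOT taken from the MLS data.
shots_on_goal = 28
goals = 4

p_low = prob_at_most(goals, shots_on_goal, LONG_TERM_RATE)
print(f"P(at most {goals} goals from {shots_on_goal} shots on goal) = {p_low:.3f}")
```

A small probability would suggest the team really is finishing worse than average; a not-so-small one is consistent with plain bad luck over a short stretch.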
For the moment, let’s suppose that all teams will converge on the magic 26.2% shot percentage. What would the rest of the season look like? We can project the number of goals this season using a couple of extremely simple models:
- The “goals matter” model: assume each team will score the same number of goals per game for the rest of the season.
- The “shots matter” model: assume each team will take the same number of shots on goal per game for the rest of the season and convert on 26.2% of them. Ignore their current shot percentage.
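The two models can be sketched in a few lines. This is my reading of them, under some labeled assumptions: goals already scored are kept and only the remaining games are projected, the season length is passed in as a parameter (30 games here is an assumption), and the sample team line is hypothetical.

```python
LONG_TERM_RATE = 0.262  # long-term share of shots on goal that score (from the post)
SEASON_GAMES = 30       # assumed season length; adjust as needed


def project_goals_matter(games, goals, season_games=SEASON_GAMES):
    """Keep scoring at the current goals-per-game rate for the remaining games."""
    remaining = season_games - games
    return goals + (goals / games) * remaining


def project_shots_matter(games, goals, sog,
                         season_games=SEASON_GAMES, rate=LONG_TERM_RATE):
    """Keep the current shots-on-goal-per-game rate, but convert the remaining
    shots at the long-term rate, ignoring the current shot percentage."""
    remaining = season_games - games
    return goals + (sog / games) * rate * remaining


# Hypothetical team line, NOT taken from the results table.
games, goals, sog = 7, 4, 28
by_goals = project_goals_matter(games, goals)
by_shots = project_shots_matter(games, goals, sog)
print(f"goals matter: {by_goals:.1f}")
print(f"shots matter: {by_shots:.1f}")
print(f"diff:         {by_shots - by_goals:+.1f}")
```

A positive diff means the shots model expects the team to score more than its current pace; a negative diff means it expects a slowdown.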
Here are the results:
| club | games | goals | sog | shots matter | goals matter | diff |
|---|---|---|---|---|---|---|
So DCU are on a 17-goal pace, which is so bad there’s nothing in my data set even close. The worst offense of the past 5 years was TFC’s dismal 2007 season (25 goals). Based on shots, though, DC look more like a 28-goal team, which is, uh, still pretty bad. But just regular bad, like Real Salt Lake 2005 or the Pink Cows from last year. On the flip side, the Galaxy are on a 56-goal pace, which would match DCU’s 2007 Supporters’ Shield season (the best in my data set). But shots indicate more like 43 goals, which is still quite solid, on par with Houston’s MLS Cup season.
So this model would suggest that Columbus, LA, Real Salt Lake, Houston, Toronto, and San Jose are all overperforming on offense and will probably slow down some. DC and Kansas City are way below trend and are likely to improve. In Kansas City’s case, they’re already at a reasonable 1 goal per game pace and could evolve into a pretty scary offense.
Of course, we can only talk about probabilities, not facts. My most confident prediction is that at least one of the above predictions will be wrong. But I’ll follow this over the course of the season and we’ll see how the model fits.