Let’s say you and a hundred friends each have a quarter.  After marveling at your newfound wealth, you each flip your coin ten times and write down how many times heads came up.  A bunch of you end up with five heads, with four and six not far behind.  People with three or seven heads are pretty rare.  And just one person in your crew got heads only once.  She nonchalantly explains that she’s a “clutch flipper” with “ice water in her veins.”

At this point your BS detector should be ringing.  But remember that if your friend really could bias her coin towards tails, the results would look exactly the same; you’d just demand stronger evidence to believe it.

OK, now change the scenario.  Instead of 100 friends, you have 100 MLS teams.  Instead of flipping coins with a 50-50 chance of landing heads, teams take shots, each of which has a 26% chance of going in.  The one other difference is that, instead of 10 flips each, each team takes a different number of shots: between 100 and 200, say.  What would we see?
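If you’d rather see it than imagine it, here is a minimal simulation sketch of that hypothetical league.  The function name, the fixed seed, and the uniform spread of shot counts between 100 and 200 are my own assumptions for illustration; only the 26% figure and the 100 teams come from the scenario above.

```python
import random

def simulate_league(n_teams=100, p_goal=0.26, min_shots=100, max_shots=200, seed=0):
    """Simulate one season in which every shot scores with probability p_goal."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(n_teams):
        shots = rng.randint(min_shots, max_shots)  # assumed: uniform shot counts
        goals = sum(rng.random() < p_goal for _ in range(shots))
        results.append((shots, goals, goals / shots))
    return results

season = simulate_league()
pcts = sorted(pct for _, _, pct in season)
print(f"lowest shooting pct:  {pcts[0]:.1%}")
print(f"median shooting pct:  {pcts[len(pcts) // 2]:.1%}")
print(f"highest shooting pct: {pcts[-1]:.1%}")
```

Every team in this toy league has identical "skill," so whatever gap shows up between the best and worst shooting percentage is pure luck.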

In both of these hypothetical scenarios, the number of successes (heads or goals) follows the binomial distribution, which is the basic statistical tool for this kind of model: a fixed number of independent tries, each with the same chance of success.  What it lets us do is take a real-world event and say exactly how likely that event would be, if our simple model were accurate.
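For the record, this is the textbook binomial formula.  The symbols are my own labels: n is the number of tries, k the number of successes, and p the chance of success on each try.  The first expression gives the chance of exactly k successes, the second the chance of k or fewer.

```latex
P(X = k) = \binom{n}{k} \, p^{k} (1 - p)^{n - k},
\qquad
P(X \le k) = \sum_{i=0}^{k} \binom{n}{i} \, p^{i} (1 - p)^{n - i}
```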

For example: take your friend with nearly all tails as our real-world event.  Our model is that each coin flip has a 50-50 chance of heads.  With our model, we can plug 1 head, 10 tries, and p = 0.5 into the binomial formula and find that there’s only a 1.07% chance of getting one or fewer heads in ten flips.  That’s pretty unlikely.  Does that mean your friend really can bias the coin?  Not at all.  Remember that this was the most extreme of a hundred trials.  You’d expect something with a probability of about 0.01 to happen roughly once in a hundred tries.
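Here is a short sketch of that arithmetic, using only the Python standard library.  The helper names are mine, and the last step assumes the 100 flippers are independent of one another.

```python
from math import comb

def binom_pmf(k, n, p):
    """Chance of exactly k successes in n independent tries."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Chance of k or fewer successes in n independent tries."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# One friend: one or fewer heads in ten fair flips (11 outcomes out of 1024).
p_one_friend = binom_cdf(1, 10, 0.5)

# A hundred friends: chance that at least one of them does this by luck alone.
p_any_friend = 1 - (1 - p_one_friend) ** 100

print(f"one friend:          {p_one_friend:.4f}")  # 0.0107
print(f"at least one of 100: {p_any_friend:.2f}")  # 0.66
```

So a "clutch flipper" turning up somewhere in a group of 100 is closer to the expected outcome than to a surprise.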

This is a really roundabout way of getting back to the shots on goal data from last week.  Take DCU, with 4 goals on 28 shots on goal.  If their true shooting percentage were 26.2%, they’d have a 10.7% chance of scoring four goals or fewer in 28 tries.  Again, that’s low, but the only reason we’re looking at DCU in the first place is that they’re the worst-shooting team of the season.
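The DCU number can be checked the same way.  This is a sketch with a function name of my own choosing; the inputs (4 goals, 28 shots, 26.2%) are the ones quoted above.

```python
from math import comb

def prob_at_most(goals, shots, p):
    """One-sided tail: chance of scoring `goals` or fewer on `shots` tries."""
    return sum(comb(shots, k) * p**k * (1 - p)**(shots - k)
               for k in range(goals + 1))

dcu = prob_at_most(4, 28, 0.262)
print(f"P(4 or fewer goals on 28 shots) = {dcu:.3f}")  # about 0.107
```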

For each MLS team-season since 2005, I calculated the probability of seeing a goal count at least as extreme as that team’s, given the same 26.2% shooting percentage; statisticians call this figure a “p-value.”  Here are the most extreme teams:

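| Year | Club | Goals | Shots on goal | Shot pct | p-value |
|------|------|-------|---------------|----------|---------|
| 2007 | TFC  | 25    | 152           | 16.4%    | 0.0055  |
| 2008 | LA   | 55    | 161           | 34.1%    | 0.0249  |
| 2005 | RSL  | 30    | 157           | 19.1%    | 0.0456  |
| 2010 | Hou  | 11    | 26            | 42.3%    | 0.0737  |
| 2005 | Hou  | 53    | 164           | 32.3%    | 0.0762  |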

(Note: if you’re playing along at home, you may notice that I used a two-tailed test here, whereas the examples above are based on a one-tailed test.)
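There is more than one way to define a two-tailed p-value for a lopsided binomial, and the post doesn’t say which one was used.  The sketch below uses one common convention: add up the probability of every outcome that is no more likely than the one observed.  Treat that choice, and the function names, as my assumptions; the results need not match the table to the last digit.

```python
from math import comb

def binom_pmf(k, n, p):
    """Chance of exactly k successes in n independent tries."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p(k, n, p):
    """Sum the probabilities of all outcomes no more likely than k (assumed convention)."""
    observed = binom_pmf(k, n, p)
    cutoff = observed * (1 + 1e-7)  # small slack so exact ties survive rounding
    total = sum(pr for pr in (binom_pmf(i, n, p) for i in range(n + 1))
                if pr <= cutoff)
    return min(1.0, total)

# The coin example: 0, 1, 9 and 10 heads are all "at least as extreme" as 1 head.
print(f"{two_sided_p(1, 10, 0.5):.4f}")  # 0.0215, twice the one-tailed 0.0107

# A team-season from the table, under the 26.2% model.
print(f"{two_sided_p(25, 152, 0.262):.4f}")
```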

Given that we have 82 team-seasons in our data set, it’s really hard to just look at the most extreme p-values and figure out whether they’re fishy.  Fortunately, you can construct a formula for aggregating these p-values, and even more fortunately, a fellow named Lou Jost did so.  Using Jost’s formula we get an aggregate p-value of 0.843; i.e., there is an 84.3% chance of seeing a set of results like this, or more extreme, given our model.  In other words, in aggregate, MLS shooting percentages are consistent with our model.
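For the curious, here is a sketch of how such an aggregation can work.  I’m assuming the formula in question is the product-based one: multiply the n p-values together to get k, then compute k times the sum of (-ln k)^i / i! for i from 0 to n-1, which gives the same answer as Fisher’s method.  That is my reading rather than something spelled out above, and the function name is mine.  It also assumes the p-values are independent and continuous, which is only approximately true for binomial counts.

```python
from math import exp, log

def combine_p_values(p_values):
    """Combine independent p-values via their product (assumed formula, see lead-in)."""
    n = len(p_values)
    x = -sum(log(p) for p in p_values)  # x = -ln(k), k = product of the p-values
    term, total = 1.0, 0.0
    for i in range(n):
        total += term        # term equals x**i / i! at this point
        term *= x / (i + 1)
    return exp(-x) * total   # k * sum of x**i / i! for i = 0 .. n-1

# Two unremarkable results combine into an unremarkable aggregate.
print(f"{combine_p_values([0.5, 0.5]):.3f}")  # 0.597
```

To mean anything, this has to be fed all 82 p-values, not just the five most extreme ones from the table; combining only the hand-picked extremes would bake in the very selection effect we’re trying to avoid.  A p-value of exactly zero would also break the logarithm.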

What does this mean?  Do I really believe that every on-target shot in MLS has a 26.2% chance of going in?  No.  My claim is that the shot-to-shot variations are small and mostly cancel out over the course of a season.  This model (like all models) is a simplification, but it’s close enough to reality that we can use it as a starting point.