“There are three kinds of lies,” Mark Twain famously wrote, “lies, damned lies, and statistics”.
And yet I think that even Mr. Clemens himself would agree that the importance of statistics in baseball cannot be overstated. Maybe it’s because of the idiosyncratic nature of the game, or its rhythm and pace, which leave ample time to ponder events probabilistically. But whatever the reason, it’s clear that numbers are stitched tightly into the rawhide fabric of the game.
Sometimes, however, statistics can be misleading. Sometimes we can rely on them too heavily, and in so doing, make the ironic mistake of distorting an analytical tool until it resembles the very kind of ignorance it was intended to diminish.
After 14 long years of futility, the 2012 Baltimore Orioles seem poised to contend for a playoff berth and possibly a division title. How are they doing it? With an incredible bullpen, an opportunistic (but not spectacular) offense, improved defense (particularly in the second half), and a great manager. That would be my analysis. When baseball sabermetristas analyze the 2012 Orioles, however, they often propose a very different mechanism for their success.
I would like to argue that this analysis, which typically stems from blithe regurgitation of the “run differential” argument, is unfair. It’s also flawed.
What is Run Differential?
One of the pieces of evidence most frequently cited by “experts” claiming that the Orioles’ 2012 success is unmerited is their negative run differential.
In case you aren’t in the habit of monitoring such statistical minutiae, a team’s run differential is simply the difference between the number of runs it has scored and the number of runs it has allowed in the aggregate. For example, if over the course of a season a team were to score 1000 runs while allowing 500, its run differential would be +500. Conversely, if a team were to score only 500 runs while allowing 1000, its run differential for the season would be −500.
By then using something called the Pythagorean expectation, popularized by baseball luminary Bill James, baseball hipsters fancy that they can accurately predict the number of wins a baseball team is “supposed” to have, once again, on the basis of its run differential. Using this method, the Orioles “should” win only 78 games this season. And yet, given that they already have 77 wins as of today, it’s pretty safe to assume they will exceed this projected target.
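For anyone who wants to see the arithmetic, here is a minimal sketch of both calculations. It assumes James’s original formulation, which uses an exponent of 2 (other exponents are sometimes used), and a 162-game season. The function names are mine, chosen for illustration, and the inputs are the hypothetical 1000-run and 500-run totals from the example above, not Orioles data.

```python
def run_differential(runs_scored, runs_allowed):
    """Runs scored minus runs allowed over some span of games."""
    return runs_scored - runs_allowed


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage under the Pythagorean expectation.

    Assumes the classic exponent of 2; this is an assumption, not a
    statement about which exponent any particular analyst uses.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def expected_wins(runs_scored, runs_allowed, games=162):
    """Scale the expected winning percentage to a full season."""
    return pythagorean_win_pct(runs_scored, runs_allowed) * games


# Hypothetical team from the example: 1000 runs scored, 500 allowed.
print(run_differential(1000, 500))           # 500
print(pythagorean_win_pct(1000, 500))        # 0.8
print(round(expected_wins(1000, 500), 1))    # 129.6
```

Note that the formula takes total runs scored and total runs allowed as its only inputs. It has no idea whether those runs piled up in close games or in a handful of blowouts.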
Run differential can be a handy way to get a very quick overview of a team’s performance. Obviously, if a team is scoring more runs than it gives up by a wide margin, hey, it must be pretty good. The converse is also true.
Unfortunately, some critics get themselves into trouble when they insist on viewing a team entirely through this limited statistical lens. Because the 2012 Orioles currently have a run differential of −20, it has been widely supposed that the Orioles are, in fact, a bad team, who have managed to remain competitive this far into the season through some combination of witchcraft, or divine providence, or both.
The Problem With Run Differential
The problem with run differential is that (like an arithmetic mean) it is distorted by extreme values. If your team loses just a handful of games by a very large margin, your run differential for the season would be negatively skewed. But would that be a fair representation of the team’s overall ability? After all, take away just a couple of major beatdowns, and what looks on the surface like a bad team on the basis of run differential might actually be a pretty good team. Unfortunately, this is precisely the kind of extra effort that most self-proclaimed baseball experts don’t bother to make.
If you follow the Orioles, then you know two things are true about the 2012 team. First, their offense is mostly average and is probably best described as opportunistic. Second, the rotation has undergone nearly constant reshuffling due to injuries and poor performance.
And when you put these things together, what do you get? A team that plays in lots of close games, and that sometimes has a disastrous starting pitching performance from the AAA carousel / disabled list / waiver wire. If you examine the numbers, you’ll notice that the Orioles’ negative run differential can be accounted for entirely by just a few lopsided losses.
For example, if, for the purposes of analysis, we were to disregard games in which the Orioles were outscored by 7 or more runs (all nine of them), their run differential for the season would jump from −20 to +63. If we disregard only games in which they were outscored by 9 or more runs (all six of them), their run differential would jump from −20 to +42. And if we were to disregard only the three games in which they were outscored by 11 or more runs, their negative 2012 run differential would jump from −20 to +15.
In just four of the most lopsided losses, starting pitchers who are no longer even part of the starting staff were alone responsible for 26 earned runs.
| Date | Final Score | Run Differential | Starting Pitcher |
|------|-------------|------------------|------------------|
| 5/7 | 14–3 | −11 | Matusz (8 ER) |
| 5/8 | 10–3 | −7 | Arrieta (6 ER) |
| 6/7 | 7–0 | −7 | Matusz (4 ER) |
| 8/22 | 12–3 | −9 | Hunter (8 ER) |
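The adjustment is simple subtraction, sketched below using only the four games in the table. The season figure of −20 and the per-game numbers come straight from the text above. The variable names are illustrative, and the result covers these four games only, not the full set of lopsided losses discussed earlier.

```python
# Season run differential cited in the article.
season_run_differential = -20

# The four losses from the table:
# (date, run differential for that game, starter's earned runs)
lopsided_losses = [
    ("5/7", -11, 8),   # Matusz
    ("5/8", -7, 6),    # Arrieta
    ("6/7", -7, 4),    # Matusz
    ("8/22", -9, 8),   # Hunter
]

runs_lost = sum(margin for _, margin, _ in lopsided_losses)
earned_runs = sum(er for _, _, er in lopsided_losses)

# Setting a game aside removes its (negative) margin from the season total.
adjusted = season_run_differential - runs_lost

print(runs_lost)     # -34
print(earned_runs)   # 26
print(adjusted)      # 14
```

Set aside just these four games and the season total moves from −20 to +14.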
At a minimum, it would be preferable to use the median runs scored per game and then compare that with median runs allowed per game as the basis for a predictive metric of success. Doing so would help to remove some of the skewing that occurs in the data due to outliers—instances where the Orioles were crushed or crushed others. Even this approach, however, is problematic.
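To see why the median is less sensitive to blowouts, consider the sketch below. The seven-game stretch is invented purely for illustration (it is not Orioles data), with one lopsided loss mixed into otherwise close games.

```python
from statistics import mean, median

# Hypothetical seven-game stretch, invented for illustration only.
runs_scored = [4, 5, 3, 6, 5, 3, 5]
runs_allowed = [3, 4, 2, 5, 3, 14, 4]   # includes one 14-run blowout

# Per-game gap using means: the single blowout drags this below zero.
mean_gap = mean(runs_scored) - mean(runs_allowed)

# Per-game gap using medians: the blowout counts as just one game.
median_gap = median(runs_scored) - median(runs_allowed)

print(f"mean gap:   {mean_gap:+.2f}")
print(f"median gap: {median_gap:+d}")
```

In this made-up stretch the team outscores its opponent in six of seven games, yet the mean-based gap is negative while the median-based gap is positive.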
It Gets Worse
There is another, larger problem with blindly relying on the run differential argument.
What the run differential argument tells us is that two independent variables (runs scored and runs allowed) are sufficient to explain all of the variability in winning percentage from one team to another. And it is often supposed that anything that isn’t explained by these two variables must (of course) be directly attributable to luck.
So is this a good idea? Would this kind of streamlined analysis hold up to real empirical scrutiny? Will it work?
While I shan’t commence to bore you with the vagaries of conducting a multivariate regression to measure the accuracy (and statistical significance) of relying totally on two independent variables to explain 100% of the variability in winning percentage from team to team, let me offer a hypothesis.
Umm, no, it totally won’t.
Are we to believe that there are no other statistically significant independent variables that influence a team’s winning percentage? What about team ERA? Bullpen ERA? Fielding percentage? Previous managerial success? Number of All-Star players? Payroll?
When talking heads who favor the run differential argument claim that whatever difference is not attributable to runs scored and runs allowed must be attributable to luck, they’re also implicitly stating that no other independent variables deserve consideration: not payroll, not fielding percentage, nothing.
They’re also implicitly suggesting that no other unquantifiable intangibles are important—not leadership, or team chemistry, or strategic approach.
But let’s suppose for the sake of argument that these two independent variables were sufficient to explain nearly all of the variability in winning percentage from team to team. There still wouldn’t be any way that we could quantify to what degree luck, or belief in Santa Claus, or any other combination of non-quantifiable variables contributed to the observed differences in team success.
I’m not denying that there is an element of luck in nearly any kind of competition. Of course there is. But to posture yourself as amazed when real data fail to align perfectly with two (two!) explanatory variables, and then carelessly attribute the remainder to luck, is, at best, an oversimplification.
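The point about the unexplained remainder can be made concrete with a small sketch. The six-team league below is entirely invented (these are not real teams or real records), and for brevity it fits winning percentage against a single predictor, run differential, rather than running a full multivariate regression. The helper name `r_squared` is my own.

```python
from statistics import mean

# Hypothetical six-team league, invented for illustration only.
run_diff = [120, 60, 15, -25, -70, -105]
win_pct = [0.580, 0.555, 0.500, 0.540, 0.450, 0.425]


def r_squared(x, y):
    """Share of the variance in y explained by a straight-line fit on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


explained = r_squared(run_diff, win_pct)
unexplained = 1 - explained

print(f"explained by run differential: {explained:.1%}")
print(f"left over:                     {unexplained:.1%}")
```

The calculation reports how much variability is left over. It says nothing about what the leftover is made of, whether luck, bullpen quality, managing, or anything else.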
A Referendum on Rooting Interest
Can we not all agree that we esteem competition because it obviates the need for esoteric arguments about which teams are best on paper? And if so, then what’s the objection to the Orioles’ success?
I understand that it can be tempting to say that one team has less talent than another, but to whatever extent we insist that a team’s statistical profile belies its worthiness as a competitor, we call into question the respectability of every team—even the ones our own biases predispose us to favor.
To those among us who would fail to recognize the legitimacy of the Orioles’ success, I would ask this: are you sure it’s even baseball that you actually care about? It seems to me that you can accept baseball as a conduit for real data that doesn’t denigrate improbability, or you can embrace mathematical analysis unconditionally and often find yourself on the wrong side of reality. But you can’t do both.
Perhaps, then, instead of condescending to teams that “overperform” we should just acknowledge that statistics, while incredibly useful, are not perfect. Or perhaps we should take care not to dismiss teams that fail to meet every preconceived notion of merit. After all, as I write this now, the Orioles’ −20 run differential is the same as that of the 1987 Twins, a team that did manage to make the playoffs. And won the World Series.
But whatever we do, let’s not insist that we can predetermine success definitively with one equation and two variables.
Baseball, and the fandom it inspires, is far more complicated than that.