David Cameron is a good baseball analyst (or I wouldn't bother critiquing him), but I think he's dead wrong about this.
There always seems to be a crowd that is skeptical that anything could be quantified, just because the possibility for error exists. With defensive stats, the question always comes up of whether one man's line drive is another man's fliner, or whether someone collecting data will record a ball as hit deep in the infield when it really landed in the shallow outfield. And I'm quite certain that people collecting the data make mistakes from time to time. The question that no one bothers to answer is how often those people make mistakes and how much those mistakes matter. There's a good reason no one answers these questions--we don't get to watch the play-by-play scorers from Baseball Info Solutions (or whichever your favorite service is) in action, so we can't really say one way or another how well they are doing.
My major beef, though, is with the idea that no human error is involved in the collection of offensive statistics.
When we talk about something like on base percentage, it is a statistic based on indisputable factual results - Player X reached base Y times in Z plate appearances. There’s no gray area - it happened, it was recorded, and no one disagrees.
Emphasis mine. Anyone who has ever watched a baseball game (including David Cameron) knows that this is incorrect, whether or not they realize it. Nearly every game, an umpire makes a call that someone doesn't like. Hell, umpires made enough bad calls on HR/not HR that we've decided to institute a replay system whose very existence proves that people doubt the supposedly "indisputable" nature of offensive statistics. In the case of nationally televised games, not only do some people disagree with the recorded result, but sometimes millions of people disagree with the recorded outcome.
And if we talk about hits, which every offensive valuation system known to man includes, we have to start worrying about whether or not a scorekeeper decided that a batted ball should be a hit or an error. Even just limiting the jury to sports broadcasters, it is clear that not everyone agrees with the decisions that scorekeepers make.
The only reason that anyone actually believes that these are the "incontrovertible facts" is that we've all agreed ahead of time who gets to make the decisions. This is tantamount to saying that if MLB were to choose Baseball Info Solutions as its official batted ball judgment team, defensive statistics would suddenly become "incontrovertible facts."
So, there is human error in collecting the data behind both defensive and offensive stats, but guess what? THAT'S OKAY. The fact that some human error exists in collecting the data does not inherently make it the most important source of variability in the statistics that we collect.
There are a few reasons that I am really comfortable in saying that the variability in defensive statistics is not mainly due to human error in the data collection:
1. Yes, some defensive stats disagree on which defenders are good and which are bad, but these differences exist even for stats that use the same exact set of data, whether it is from BIS or whoever. My intuition is that because none of these stats are really open source, and some are highly proprietary, we're not really advancing towards a consensus on how to value the various bits of information that we are given. (From a personal standpoint, I kind of get this--it's much more fun to tinker around with the formulas to improve them than it is to sit down, spell everything out (anyone who has had to write documentation knows that it is not fun), cut through all of the erroneous criticisms to get to the real criticisms, and make the hard choices on where you were wrong and should incorporate someone else's viewpoint.)
2. Not all of the all-inclusive offensive stats give us the same picture of how valuable a hitter is. If our offensive stats, supposedly built on infallible data, can disagree, it seems as though we are holding defensive stats to an unfair standard.
3. Sample size.
4. Sample size.
5. Sample size.
6. Sample size.
7. Sample size.
8. Sample size.
9. Sample size.
10. Sample size.
Let me expound a bit, using Justin Morneau, everyone's favorite Kent Hrbek clone. RZR is one of the most straightforward defensive stats that I consider to be reasonable (though it's not really as good as +/- or UZR). If you track each kind of batted ball for a season, you can figure out which types of batted balls are fielded by a particular fielder over 50% of the time. RZR defines those balls as the fielder's zone. RZR is then the number of successful plays made (Plays) on balls in the fielder's zone divided by the total number of balls in that zone (BIZ). Separately, the Hardball Times also reports plays made out of the defender's zone (OOZ). For Morneau over the last five years, we have:
| Year | BIZ | Plays | RZR | OOZ |
| 2004 | 60 | 46 | .767 | 20 |
| 2005 | 132 | 120 | .909 | 42 |
| 2006 | 128 | 98 | .766 | 71 |
| 2007 | 223 | 171 | .767 | 18 |
| 2008 | 183 | 128 | .699 | 22 |
The average RZR in 2008 for all first basemen was .739. Overall, Morneau looks pretty good compared to that average, coming in at a five-year RZR of .775 (563 plays on 726 balls in zone) and beating the average in all but last year. Three of the five RZRs are actually remarkably similar to one another.
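For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes each season's RZR and the pooled five-year figure from the BIZ and Plays columns above. The variable and function names are mine, not anything from the Hardball Times.

```python
# Morneau's first-base numbers, copied from the table above: year -> (BIZ, Plays).
seasons = {
    2004: (60, 46),
    2005: (132, 120),
    2006: (128, 98),
    2007: (223, 171),
    2008: (183, 128),
}

def rzr(biz, plays):
    """Plays made on balls in zone divided by balls in zone."""
    return plays / biz

# Season-by-season RZR, rounded to three places like the table.
per_season = {year: round(rzr(biz, plays), 3) for year, (biz, plays) in seasons.items()}

# Pooled RZR: total plays over total balls in zone, not the mean of the yearly rates.
total_biz = sum(biz for biz, _ in seasons.values())
total_plays = sum(plays for _, plays in seasons.values())
pooled = round(rzr(total_biz, total_plays), 3)

for year, value in per_season.items():
    print(year, f"{value:.3f}")
print("pooled", f"{pooled:.3f}", "on", total_biz, "BIZ")
```

Pooling weights each season by its opportunities, so the 60-BIZ season in 2004 counts for far less than the 223-BIZ season in 2007.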
But what I want to focus on here is his total opportunities. Over five seasons, where he compiled over 2,500 cumulative at-bats, Morneau had a mere 726 balls in his zone--726 plays where an average first baseman had a better-than-even shot at making the play. That's only 145 chances per season.
Now let's look at Morneau's hits and extra-base hits from 2007-8 if we divide his at-bats into chunks close to the 60/132/128/223/183 BIZ breakdown. (Span is the span of games I chose. 463-496, for instance, means Morneau's 463rd through 496th major league career games played.)
| AB | H | AVG | XBH | Span |
| 60 | 18 | .300 | 7 | (413-428) |
| 134 | 37 | .276 | 18 | (439-462) |
| 130 | 41 | .315 | 17 | (463-496) |
| 222 | 56 | .252 | 22 | (497-555) |
| 181 | 48 | .265 | 12 | (556-606) |
Now here we have Morneau's average split up into sample sizes comparable to his defensive data, and all of a sudden his incontrovertible batting average is jumping all over the place. Is he an average hitter or a superstar? Apparently batting average--really at the heart of all offensive stats out there--is a completely useless statistic that has been completely soiled by the human error of umpires misjudging safe/out calls.
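To put a rough number on how much of that jumping around is plain sampling noise, here is a minimal sketch that treats every at-bat as an independent coin flip with the same hit probability. That is a simplifying assumption of mine (real at-bats are not identical trials), and the function name is made up for illustration.

```python
import math

def binomial_se(p, n):
    """Standard error of an observed rate over n independent trials with success rate p."""
    return math.sqrt(p * (1 - p) / n)

# (AB, H) for each chunk in the table above.
chunks = [(60, 18), (134, 37), (130, 41), (222, 56), (181, 48)]

total_ab = sum(ab for ab, _ in chunks)
total_h = sum(h for _, h in chunks)
overall = total_h / total_ab  # batting average across all five chunks

z_scores = []
for ab, h in chunks:
    avg = h / ab
    se = binomial_se(overall, ab)   # noise expected at this chunk's sample size
    z = (avg - overall) / se        # distance from the overall average, in standard errors
    z_scores.append(z)
    print(f"{ab:4d} AB  AVG {avg:.3f}  SE {se:.3f}  z {z:+.2f}")
```

Under that assumption, every chunk sits well within two standard errors of the overall average, so a spread from .252 to .315 is about what chance alone produces at these sample sizes.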
Certainly I can find positions with larger sample sizes. [Okay, it looks like first base is the lowest, which I should have suspected because that's where everyone puts their worst defender, the Cardinals notwithstanding.] Center fielders and middle infielders tend to get more opportunities than anyone else (which is at the heart of why they are the most important positions, but says nothing in and of itself about how difficult they are to play), but even then, the effective sample sizes are pretty small.
If we look at MLB as a whole at each position, here is the number of BIZ per 140 games played (estimated as 8.5 innings per game).
BIZ/140G -- Position
352 -- 2B
350 -- SS
292 -- 3B
288 -- CF
241 -- RF
228 -- LF
180 -- 1B
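The scaling behind those numbers is just a rate conversion. A minimal sketch, assuming you have a raw BIZ count and the innings over which it was accumulated; the player in the example is hypothetical and only there to show the arithmetic.

```python
INNINGS_PER_GAME = 8.5  # the estimate used in this post
GAMES = 140

def biz_per_140(biz, innings):
    """Scale a raw balls-in-zone count to a 140-game workload."""
    return biz * INNINGS_PER_GAME * GAMES / innings

# Hypothetical fielder: 150 BIZ in 595 innings, i.e. half of a 1,190-inning workload.
print(biz_per_140(150, 595.0))
```

A fielder who played exactly 1,190 innings (140 games at 8.5 innings each) keeps his raw BIZ total unchanged.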
I nearly went into a huge RZR tangent, but I'm here to talk sample size. At best, we're looking at about 352 data points*, and some positions are clearly going to be more problematic than others. For fairness, I'll re-run the Morneau analysis, but with Torii Hunter.
*My tangent would have involved talking about how at some positions there are more "gimme" plays than there are at others, so not every BIZ is equally useful to us.
| Year | BIZ | Plays | RZR | OOZ |
| 2004 | 287 | 236 | .822 | 65 |
| 2005 | 222 | 185 | .833 | 33 |
| 2006 | 330 | 295 | .894 | 48 |
| 2007 | 384 | 342 | .891 | 47 |
| 2008 | 289 | 257 | .889 | 93 |
Note that the average RZR for center fielders in 2008 was .922, and they had on average 80 OOZ plays per 140 games (with games estimated as 8.5 innings per game). Now if we take approximately the first 287/222/330/384/289 ABs from those seasons, we get:
| AB | H | AVG | XBH | Span |
| 288 | 78 | .271 | 34 | (692-768) |
| 220 | 61 | .277 | 26 | (830-886) |
| 329 | 90 | .274 | 28 | (928-1016) |
| 382 | 110 | .288 | 50 | (1075-1178) |
| 287 | 81 | .282 | 32 | (1235-1309) |
Torii's actually pretty consistent with these endpoints, basically as consistent as his season totals. Then again, his RZR is fairly consistent, too, if you accept that something changed between '05 and '06. I'd like to use his injury history to explain that away, but '06 was the season where he probably played the most games while clearly hobbled with an injury. (Though in general, I think his play in the field suffered from him playing through injuries.)
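One rough way to check whether the '05-to-'06 jump in Hunter's RZR is bigger than sampling noise is a pooled two-proportion z-test on the BIZ and Plays columns above. This is a sketch under the assumption that every ball in zone is an independent, equally informative trial, which the "gimme" plays make doubtful, so treat the result as a ballpark figure. The helper names are mine.

```python
import math

# (BIZ, Plays) from Hunter's table above, split at the apparent change point.
early = [(287, 236), (222, 185)]             # 2004-2005
late = [(330, 295), (384, 342), (289, 257)]  # 2006-2008

def pool(seasons):
    """Total balls in zone and total plays made across several seasons."""
    biz = sum(b for b, _ in seasons)
    plays = sum(p for _, p in seasons)
    return biz, plays

def two_proportion_z(n1, x1, n2, x2):
    """Pooled two-proportion z statistic for the difference x2/n2 - x1/n1."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # common rate under the no-change hypothesis
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

n1, x1 = pool(early)
n2, x2 = pool(late)
z = two_proportion_z(n1, x1, n2, x2)
print(f"early RZR {x1 / n1:.3f} on {n1} BIZ, late RZR {x2 / n2:.3f} on {n2} BIZ, z = {z:.2f}")
```

With these totals the statistic comes out above 3, which fits the reading that something really did change. It says nothing about what changed, and splitting the seasons after looking at the data flatters the result.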
At any rate, in the very best case scenario, you're looking at about half the sample size for defense that you get on offense, and on top of that, I think there are a lot of plays out there (especially in the outfield) that don't inform us as much as a typical at-bat. We can sit back and blame STATS or BIS all day, but ultimately I see no reason to blame them for the variability--we just have less data to work with and we need to figure out how to work under that limitation.
Last but not least: there is still time to let Casey Blake walk, Bill Smith. Back out while you can! I know it stings when you let one get away, but remember the sage words of The Hold Steady: There's always other boys, and you can make them like you!
