In Defense Of Human Error

David Cameron is a good baseball analyst (or I wouldn't bother critiquing him), but I think he's dead wrong about this.

There always seems to be a crowd that is skeptical that anything could be quantified, just because the possibility for error exists. With defensive stats, the question always comes up about whether or not one man's line drive is another man's fliner or if someone collecting data is going to say a ball was hit deep in the infield instead of in the shallow outfield. And I'm quite certain that people collecting the data make mistakes from time to time. The question that no one bothers to answer is how often those people make mistakes and how much those mistakes matter. There's a good reason no one answers these questions--we don't get to watch the play-by-play scorers from Baseball Info Solutions (or whichever your favorite service is) in action, so we can't really say one way or another how well they are doing.

My major beef, though, is with the idea that no human error is involved in the collection of offensive statistics.

When we talk about something like on base percentage, it is a statistic based on indisputable factual results - Player X reached base Y times in Z plate appearances. There’s no gray area - it happened, it was recorded, and no one disagrees.

Emphasis mine. Anyone who has ever watched a baseball game (including David Cameron) knows that this is incorrect, whether or not they realize it. Nearly every game, an umpire makes a call that someone doesn't like. Hell, umpires made enough bad calls on HR/not HR that we've decided to institute a replay system whose very existence proves that people doubt the supposedly "indisputable" nature of offensive statistics. In the case of nationally televised games, not only do some people disagree with the recorded result, but sometimes millions of people disagree with the recorded outcome.

And if we talk about hits, which every offensive valuation system known to man includes, we have to start worrying about whether or not a scorekeeper decided that a batted ball should be a hit or an error. Even just limiting the jury to sports broadcasters, it is clear that not everyone agrees with the decisions that scorekeepers make.

The only reason that anyone actually believes that these are the "incontrovertible facts" is that we've all agreed ahead of time who gets to make the decisions. This is tantamount to saying that if MLB were to choose Baseball Info Solutions as its official batted ball judgment team that suddenly defensive statistics would become "incontrovertible facts."

So, there is human error in taking the data that records defensive and offensive stats, but guess what? THAT'S OKAY. That some human error exists in collecting the data does not inherently make it the most important source of variability in the statistics that we collect.

There are a few reasons that I am really comfortable in saying that the variability in defensive statistics is not mainly due to human error in the data collection:

1. Yes, some defensive stats disagree on which defenders are good and which are bad, but these differences exist even for stats that use the same exact set of data, whether it is from BIS or whoever. My intuition is that because none of these stats are really open source, and some are highly proprietary, we're not really advancing towards a consensus on how to value the various bits of information that we are given. (From a personal standpoint, I kind of get this--it's much more fun to tinker around with the formulas to improve them than it is to sit down, spell everything out (anyone who has had to write documentation knows that it is not fun), cut through all of the erroneous criticisms to get to the real criticisms, and make the hard choices on where you were wrong and should incorporate someone else's viewpoint.)

2. Not all of the all-inclusive offensive stats give us the same picture of how valuable a hitter is. If our offensive stats, supposedly built on infallible data, can disagree, it seems as though we are holding defensive stats to an unfair standard.

3. Sample size.

4. Sample size.

5. Sample size.

6. Sample size.

7. Sample size.

8. Sample size.

9. Sample size.

10. Sample size.

Let me expound a bit, using Justin Morneau, everyone's favorite Kent Hrbek clone. RZR is one of the most straightforward defensive stats that I consider to be reasonable (though it's not really as good as +/- or UZR.) If you track each kind of batted ball for a season, you can figure out which types of batted balls are fielded by a particular fielder over 50% of the time. RZR then defines that as the fielder's zone. RZR is the number of successful plays made (Plays) on balls in the fielder's zone divided by the total number of balls in that zone (BIZ). After that, the Hardball Times also reports plays made out of the defender's zone (OOZ). For Morneau over the last five years, we have:

Year BIZ Plays RZR OOZ
2004 60 46 .767 20
2005 132 120 .909 42
2006 128 98 .766 71
2007 223 171 .767 18
2008 183 128 .699 22

The average RZR in 2008 for all first basemen was .739. Overall, Morneau looks pretty good compared to that average, coming in at a five-year average of .775 (563 plays on 726 BIZ) and beating the average in all but last year. Three of the five RZRs are actually remarkably similar to one another.
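For anyone who wants to check the arithmetic, here is a minimal sketch of the RZR calculation using the Morneau numbers from the table. The function names are my own, and pooling the seasons by summing plays and BIZ is an assumption about what "five-year average" means here, not something the Hardball Times specifies.

```python
# Morneau's RZR inputs, copied from the table: year -> (BIZ, plays)
MORNEAU = {
    2004: (60, 46),
    2005: (132, 120),
    2006: (128, 98),
    2007: (223, 171),
    2008: (183, 128),
}


def rzr(plays, biz):
    """Plays made on balls in zone, divided by balls in zone."""
    return plays / biz


def pooled_rzr(seasons):
    """RZR across several seasons, weighting each season by its BIZ.

    Assumption: seasons are pooled by summing plays and BIZ rather than
    averaging the five yearly rates.
    """
    total_biz = sum(biz for biz, _ in seasons.values())
    total_plays = sum(plays for _, plays in seasons.values())
    return rzr(total_plays, total_biz)


for year, (biz, plays) in MORNEAU.items():
    print(year, f"{rzr(plays, biz):.3f}")
print("pooled", f"{pooled_rzr(MORNEAU):.3f}")
```

Pooling weights the 223-BIZ season more heavily than the 60-BIZ season, which seems like the right call when the whole point is that small samples are noisy.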

But what I want to focus on here is his total opportunities. Over five seasons, where he compiled over 2,500 cumulative at-bats, Morneau had a mere 726 balls in his zone--726 plays where an average first baseman had a better-than-even shot at making the play. That's only 145 chances per season.

Now let's look at Morneau's hits and extra-base hits from 2007-8 if we divide his at-bats to somewhere close to a 60/132/128/223/183 breakdown. (Span is the span of games I chose. 463-496, for instance, means Morneau's 463rd through 496th major league career games played.)

AB H AVG XBH Span
60 18 .300 7 (413-428)
134 37 .276 18 (439-462)
130 41 .315 17 (463-496)
222 56 .252 22 (497-555)
181 48 .265 12 (556-606)

Now here we have Morneau's average split up into comparable sample sizes to his defensive data, and all of a sudden his incontrovertible batting average is jumping all over the place. Is he an average hitter or a superstar? Apparently batting average--really at the heart of all offensive stats out there--is a completely useless statistic that has been completely soiled by the human error of umpires misjudging safe/out calls.
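To put a rough number on how much bouncing around we should expect from sample size alone, here is a small sketch that treats each at-bat as an independent coin flip with a fixed hit probability. That is a simplifying assumption of mine (real at-bats are not independent or identically distributed), and the function names are hypothetical, but it gives a feel for the scale of the noise.

```python
import math

# (at-bats, hits) for the five Morneau chunks in the table
CHUNKS = [(60, 18), (134, 37), (130, 41), (222, 56), (181, 48)]


def standard_error(rate, n):
    """Binomial standard error of an observed rate over n independent trials."""
    return math.sqrt(rate * (1.0 - rate) / n)


def z_scores(chunks):
    """How many standard errors each chunk's average sits from the pooled average."""
    total_ab = sum(ab for ab, _ in chunks)
    total_hits = sum(hits for _, hits in chunks)
    overall = total_hits / total_ab
    return [
        (hits / ab - overall) / standard_error(overall, ab)
        for ab, hits in chunks
    ]


for (ab, hits), z in zip(CHUNKS, z_scores(CHUNKS)):
    print(f"{ab:4d} AB  avg {hits / ab:.3f}  z {z:+.2f}")
```

Under that coin-flip assumption, every one of the five chunks lands within two standard errors of the pooled average, so swings from .252 to .315 are about what plain sampling noise produces at these sample sizes.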

Certainly I can find positions with larger sample sizes. [Okay, it looks like first base is the lowest, which I should have suspected because that's where everyone puts their worst defender, the Cardinals notwithstanding.] Center fielders and middle infielders tend to get more opportunities than anyone else (which is at the heart of why they are the most important positions, but says nothing in and of itself about how difficult they are to play), but even then, the effective sample sizes are pretty small.

If we look at MLB as a whole at each position, here is the number of BIZ per 140 games played (estimated as 8.5 innings per game.)

BIZ/140G -- Position
352 -- 2B
350 -- SS
292 -- 3B
288 -- CF
241 -- RF
228 -- LF
180 -- 1B
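The per-140-games scaling is simple enough to sketch. The 8.5 innings per game and the 140 games come from the description above; the function name and the sample inputs are hypothetical, since the raw league BIZ and innings totals are not listed here.

```python
INNINGS_PER_GAME = 8.5  # the estimate used in this post
GAMES = 140


def biz_per_140_games(biz, innings):
    """Scale a raw BIZ count to 140 games of 8.5 innings each."""
    return biz / innings * INNINGS_PER_GAME * GAMES


# Hypothetical input: 150 BIZ in 595 innings (half of 140 games' worth)
# scales up to roughly twice as many BIZ over the full 140 games.
print(round(biz_per_140_games(150, 595.0)))
```

Putting every position on the same innings base is what makes the counts in the list comparable to one another.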

I nearly went into a huge RZR tangent, but I'm here to talk sample size. At best, we're looking at about 352 data points*, and some positions are clearly going to be more problematic than others. For fairness, I'll re-run the Morneau analysis, but with Torii Hunter.

*My tangent would have involved talking about how at some positions there are more "gimme" plays than there are at others, so not every BIZ is equally useful to us.

Year BIZ Plays RZR OOZ
2004 287 236 .822 65
2005 222 185 .833 33
2006 330 295 .894 48
2007 384 342 .891 47
2008 289 257 .889 93

Note that the average RZR for centerfielders in 2008 was .922, and they had on average 80 OOZ plays per 140 games (with games estimated as 8.5 innings per game.) Now if we take the first approximately 287/222/330/384/289 AB's from those seasons, we get:

AB H AVG XBH Span
288 78 .271 34 (692-768)
220 61 .277 26 (830-886)
329 90 .274 28 (928-1016)
382 110 .288 50 (1075-1178)
287 81 .282 32 (1235-1309)

Torii's actually pretty consistent with these endpoints, basically as consistent as his season totals. Then again, his RZR is fairly consistent, too, if you accept that something changed between '05 and '06. I'd like to use his injury history to explain that away, but '06 was the season where he probably played the most games while clearly hobbled with an injury. (Though in general, I think his play in the field suffered from him playing through injuries.)

At any rate, in the very best case scenario, you're looking at about half the sample size for defense that you get on offense, and on top of that, I think there are a lot of plays out there (especially in the outfield) that don't inform us as much as a typical at-bat. We can sit back and blame STATS or BIS all day, but ultimately I see no reason to blame them for the variability--we just have less data to work with and we need to figure out how to work under that limitation.

Last but not least---There is still time to let Casey Blake walk, Bill Smith. Back out while you can! I know it stings when you let one get away, but remember the sage words of The Hold Steady: There's always other boys, and you can make them like you!

Alexi Casilla Is An Awful Fielder?/Expect Team Defense To Improve

The question mark is there because I'm a little surprised--though not completely surprised. (Also, I know it kind of seems like I'm rubbing salt in his finger or something, but sometimes you can't help but notice things at the wrong time.)

I was going to write about Adam Everett, since I advocated acquiring him in the offseason and so far he's given the Twins jack squat. And I'll get to that in a second.

But what I really wanted to emphasize was that defense has been a problem for the Twins (and I'm hardly the first person to point this out around here.) I thought that I would probably wind up arguing that Casilla was an average defender, so it would be tough for the Twins to improve much by replacing him defensively, but then the silly facts got in the way of my thoughts. Here's a list of RZR for qualified second basemen this year:

.903 -- Mark Ellis
.835 -- Robinson Cano
.833 -- Placido Polanco
.831 -- Grudz
.823 -- Pedroia
.815 -- Roberts
.800 -- Lopez
.799 -- Kinsler
.799 -- Iwamura

Now, the first thing that stands out to me in that list is that Mark Ellis is lapping everyone else at second base in the AL. Just smoking them. It's almost like when Tiger Woods won the 2000 U.S. Open by shooting 12 under par while the runners-up shot 3 over par. That's just a huge, huge gap.

Everyone else is pretty closely bunched around .815. (If you include the NL, almost the only difference is that Brandon Phillips is pretty good at .863, but otherwise you still basically have a clump of guys from .800 to .830.) Now, let's take a look at Twins second basemen who have had at least 150 balls-in-zone in some season from 2004-8:

.847 -- Punto, 2005
.831 -- Castillo, 2006
.801 -- Castillo, 2007
.789 -- Rivas, 2004
.768 -- Casilla, 2008

And, well, that doesn't look so great for Casilla, does it? Now, that's only 151 BIZ, which is really a very small sample size, so that's another reason for the question mark in the title of this post. But it's not exactly making me feel good about his defense. Also, in 2007, Casilla had a .784 RZR with 134 BIZ. Somewhat better, but still bad.

RZR isn't everything, but it's just about all the fielding information I know of for Casilla. The only other bit that I've run across is Dan Fox's* Simple Fielding Runs--a stat he devised while he was at Baseball Prospectus. SFR had Casilla down as 4.5 runs below average (on ~150 plays) at second base in Rochester last year, and 5.1 runs below average (on ~150 plays) at shortstop. For reference, Luis Rivas was at 5.4 runs below average (on ~190 plays) at second base and 6.6 runs below average (on ~250 plays) at shortstop.

*Dan now has a position as a sort of data consultant for the Pirates' new front office, if memory serves.

Also potentially of note, SFR had Deibinson Romero as the best infielder in the 2007 Appy league, at 13 runs above average on about 185 plays.

Anyway, all of the data I have points to Casilla being below average as a defender. When I watch him play, I am unimpressed. Yes, he's got some speed, but I don't feel like he gets great jumps on the ball, or plays deep enough to get to a lot of stuff up the middle or in the hole. He also doesn't seem to have great hands and has a decent but not great arm. Plus, there are the mental lapses. As much as we might think that they will go away with more experience, I don't think that that is necessarily a good assumption.

The Hardball Times has the Twins at -16 runs on defense overall, which puts them about even with the Tigers, but ahead of Seattle, KC, and Texas. (Holy crap do KC and Texas have awful fielding numbers.) When you break it down between infield and outfield contributions, the outfield is below average (.829) at an .821 RZR, but not by a ton, and their out-of-zone plays are pretty close to average. The infield, though, has the worst RZR in the AL, and the second-fewest out-of-zone plays. So if we're going to point to a defensive weakness, it's gotta be in the infield somewhere.

Certainly, I can't imagine that Brendan Harris has been helping matters whatsoever. SSS RZR at 2B notwithstanding, he's been a defensive disaster basically everywhere he's gone. Mike Lamb was pretty awful at 3B (just as bad as Batista in 2006 by most measures I can find), so getting him out of the mix helps. Buscher has been better, but it's been an improvement from awful to bad--Buscher has a .702 RZR compared to Lamb's nearly unfathomable .602, but .702 puts him at the back of the pack in the AL, with guys like Alex Gordon and Casey Blake. (Adrian Beltre's .707 RZR makes a bad first impression, but his 55 out-of-zone plays are far and away the best in the league. He's not the best fielding 3B in the AL, but he's above average.)

So that brings us to shortstop. I remember not many people being impressed with Adam Everett when he came up, and certainly his shoulder injury made his fielding look horrific at times. But, let's check the record:

.861 RZR, 79 BIZ -- Everett
.860 RZR, 86 BIZ -- Punto
.790 RZR, 119 BIZ -- Harris

At least going by RZR, Everett was as effective as Punto, and that was with a limp noodle for an arm. I have no idea what kind of shape his arm is in now, but I imagine that we will find out. .861 is pretty decent, though: tops in the AL is .869 (from Captain Maybe I'll Actually Position Myself Where The Coaches Suggest) and second is Orlando Cabrera at .863.

My main impression of Adam Everett's fielding, though, is from Bill James' essay in John Dewan's The Fielding Bible on Jeter vs. Everett. If you can still find a cheap copy, I recommend it. Here's the part that seems most relevant to this discussion, where James is discussing the video he was given of Jeter's and Everett's 20 best and 20 worst plays:

That being said, watching Derek Jeter make 40 defensive plays and then watching Adam Everett make 40 defensive plays at the same position is sort of like watching video of Barbara Bush dancing at the White House, and then watching Demi Moore dancing in Striptease. The two men could not possibly be more different in the style and manner in which they run the office. Jeter, in 40 plays, had maybe three plays in which he threw with his feet set. He threw on the run about 20-25 times; he jumped and threw about 10-15 times, he threw from his knees once. He threw from a stable position only when the ball, by the way it was hit, pinned him back on his heels.

That you probably didn't need any special video sessions to realize--Jeter's a total drama queen out there.

Everett set his feet with almost unbelievable quickness and reliability, and threw off of his back foot on almost every play, good or bad. Jeter played much, much more shallow than Everett, cheated to his left more, and shifted his position from left to right much, much more than Everett did (with the exception of three plays on which Everett was shifted over behind second in a Ted Williams shift. Jeter had none of these.)

This sounds like something that's going to generally be tough to pick up by watching someone make 3-4 plays per game, but would be easier to notice watching 40 consecutive plays that Everett makes.

Jeter gambled constantly on forceouts, leading to good plays when he beat the runner, bad plays when he didn't. Everett gambled on a forceout only a couple of times, taking the out at first base unless the forceout was a safe play.

This makes it seem as though Everett is pretty conservative out there, though given how much he has dominated the defensive metrics in the past, making the sure out might not be sexy, but could be pretty good regardless.

Many or most of the good plays made by Jeter were plays made in the infield grass, slow rollers that could easily have died in the infield, but plays on which Jeter, playing shallow and charging the ball aggressively, was able to get the man at first. These were plays that would have been infield hits with most shortstops, and which almost certainly would have been infield hits with Adam Everett at short.

For Everett, those type of plays were the bad plays, the plays he failed to make. The good plays for Everett were mostly hard hit groundballs in the hole or behind second base, on which Everett, playing deep and firing rockets, was able to make an out. These, conversely, were the bad plays for Jeter--hard-hit or not-too-hard-hit groundballs fairly near the shortstop's home base which Jeter, playing shallow and often positioning himself near second, was unable to convert.

James goes on to talk about how he's merely described a difference in style here, and from these observations alone, he can't tell anything about which is good or bad. Then he goes on to quote some data that indicate Everett was very effective with his style, while Jeter was not even as effective as a normal SS.

Why am I going to all this trouble here? Well, it seems to me that Everett's arm is a key part of what has made him a special defensive shortstop. If he can't make really strong throws, he can't position himself as far back, and he won't get to as much in the hole or up the middle. However, as we saw above, even without a strong arm, Everett looks like an above average defender at shortstop. A weak arm will almost certainly keep him from being an elite defender, though, which he has been in the past.

I tend to think that even if Everett gets the job done as well as Punto, because of his style, he's simply not going to receive as much praise. He's the guy in the back of the class who doesn't talk to anyone but still gets good grades, whereas Punto's the guy who sits in the front row, volunteers to answer every question and gives the teacher an apple every day. Also, given the way that TV broadcasts focus so much on showing us the batter and pitcher between pitches, or close-up shots of whatever, it's going to be tough to notice a guy whose calling card is that he positions himself a little differently and doesn't do anything flashy.

In closing, I'd like to emphasize that this is much more a description of how things have been in the past than a prediction of how things will be in the future. As I noted, I have no idea what condition Everett's arm is in compared to earlier this year, let alone compared to how it was last year or earlier in his career. Still, I would expect Everett-Punto-Harris(3B)-Span to be a lot better than Harris(SS)-Casilla-Lamb/Buscher-Cuddyer/Kubel/Monroe. So in the near term, the Twins will probably move towards playing more low-scoring games than they did early in the season.