Earlier tonight, I was debating with a friend over the existence of clutch hitting (I was of course arguing against it). I stumbled upon this article, which makes a pretty good argument that clutch hitting does indeed exist. (Note: This article is a few years old. I'm not sure if anybody has done any good follow-up studies since then.)
The main point of the article is that if you look at the data and assume a "zero clutch" hypothesis, the data fit very poorly. In other words, what has actually happened doesn't seem to match what would happen if clutch hitting didn't exist.
One of the more interesting parts of the article to me was the section on clutch hitters and chokers. On the list of chokers is none other than Reggie Jackson. That's right. "Mr. October"--the man who once hit 3 home runs in the deciding game of the World Series--is the antithesis of a clutch hitter.
I have a hard time knowing exactly what to think of this. There are two problems with people evaluating hitters as being clutch or not. First is that they tend to base their assumptions on extremely small sample sizes (according to the article, it takes about 7 years as a regular before getting the number of PAs required). The second is that most people use an extremely subjective definition of clutch. Most frequently, the two are combined in order to allow people to make whatever conclusions they want about any player.

Some thoughts on the linked article:
- I don't like concluding that ROE are "unsuccessful" outcomes. Surely, many ROE are on balls-in-play (BIP) that are easy to field, but sometimes hitters get hits on BIP that should be fielded. And if "hustle" is part of clutch performance, it could be argued that a runner hustling to first base would cause more errors than one jogging at half speed. In the end, though, this probably doesn't cause a significant systematic bias, so whatever. (This is mainly just part of my ongoing rant against the error--I dislike it for judging offense, and I dislike it for judging defense/pitching.)
"To do this, I ran several thousand trials in which 612 players were created at random using the observed OBP talent distribution (0.329 average, 0.026 rms) and given batting stats based on their talent."
- This part confuses me a bit. Is he assuming a normal distribution around .329? It seems like he is, and given the number of data points (612), that seems reasonable, but because he's selected a group of hitters amongst the best in baseball (as judged by the teams allowing them to accumulate the most PA), he's presumably selecting from one side of the larger distribution of players overall. A simple plot of the distribution would more or less settle this issue, which brings me to...
- No plots! I would dearly love to see a plot with non-clutch performance on the x-axis, clutch performance on the y-axis, and the error bars associated with each point.
- I'm not any kind of fanatical Bayesian, but not assuming any kind of prior (as described in the appendix) rubs me the wrong way. I've seen enough baseball that I don't believe anyone who gets 1000 or more plate appearances is (conservatively) worse than a .100 hitter or better than a .500 hitter.
- He throws around the term "significant" a lot, but--and maybe it's just because it's 1am right now--I'm not seeing what his test for significance is. (Basically, as described, I'm not sure I could reproduce his work.)
So I guess those are my complaints from a technical standpoint. Ultimately, they might not make any difference, but they sit in the back of my head and create doubt.
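For what it's worth, here is a rough sketch of what a "zero clutch" simulation like the one quoted above might look like. It is not the article's code. It assumes a normal talent distribution (which is exactly the assumption questioned above), and the per-player PA counts (`clutch_pa`, `other_pa`) and the function name are made up for illustration.

```python
import random
import statistics

def simulate_zero_clutch(n_players=612, mean_obp=0.329, sd_obp=0.026,
                         clutch_pa=500, other_pa=2000, seed=0):
    """Return each simulated player's (clutch OBP - non-clutch OBP).

    Every player has ONE true OBP talent used in both situations, so any
    clutch/non-clutch gap that shows up is pure binomial luck.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_players):
        # ASSUMPTION: talent is normally distributed; clamp to [0, 1].
        talent = min(max(rng.gauss(mean_obp, sd_obp), 0.0), 1.0)
        clutch_on = sum(rng.random() < talent for _ in range(clutch_pa))
        other_on = sum(rng.random() < talent for _ in range(other_pa))
        diffs.append(clutch_on / clutch_pa - other_on / other_pa)
    return diffs

diffs = simulate_zero_clutch()
spread = statistics.pstdev(diffs)
print(f"spread of clutch minus non-clutch OBP under the null: {spread:.4f}")
```

The idea would then be to compare the spread of the real clutch/non-clutch gaps against this null spread; a noticeably wider real spread is what would point toward a clutch effect.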
Okay, now onto the baseball. From his conclusions:
I'm perfectly willing to accept that people perform differently under varying degrees of pressure. I also tend to think that professional athletes handle pressure better than a typical person--or they wouldn't have survived the weeding out process. So the difference in clutch abilities between me and Derek Jeter is going to be a lot different than the difference in clutch abilities between Alex Rodriguez and Derek Jeter. But if no one can tell me who the clutch hitters are, who cares?
Elsewhere, Dolphin states:
To which I say poppycock. I can tell you with a large degree of confidence who the best hitters in the league are. You can't tell me with any degree of confidence who the clutchiest hitters are. Obviously, I would love to have the best AND the most clutch hitters, but if I'm doling out contracts at the beginning of the season, it's pretty easy to decide which to invest in. Just because I can't identify clutch players, and because there is a clutch effect, doesn't mean that I should stop caring about which hitters I thrust into critical situations.
(Put in other words: given these numbers, you could say it's better to be lucky than it is to be good. I would still rather be lucky and good than lucky and average, though. That luck exists as a factor is no reason to stop trying to improve your odds.)
Furthermore, if a single player gets 150 clutch PA over the course of a season, then a team of players gets around 1350 clutch PA over the course of a season. Given his standard deviation of 5.7 over 150 PA, that gives us a standard deviation of 17.1 over 1350 PA. If I have 5 above average hitters and 4 average hitters, then first of all, I'm not Terry Ryan, but second of all, that gives me a 19.5 success advantage over a team of average hitters, so my edge in talent is larger than the size of the random variation.
I guess the succinct way of making my last point is that while the luck tends to even out amongst the nine players, the talent adds up. (On top of which, it's good to have talented hitters because they can help create blowouts where their clutch abilities don't matter whatsoever, and they can create clutch situations when the pitchers allow a lot of runs early in the game.)
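The arithmetic above can be checked in a few lines. One assumption here that isn't spelled out in the comment: the 19.5 figure works out exactly if an "above average" hitter means one whose OBP talent is one rms (0.026) above the mean, so that is how it is read below.

```python
import math

PA_PER_HITTER = 150      # clutch PA per player per season (from the comment)
SD_PER_HITTER = 5.7      # luck, in successes per 150 PA (from the comment)
LINEUP_SIZE = 9
TALENT_RMS = 0.026       # OBP talent spread quoted from the article
ABOVE_AVG_HITTERS = 5    # ASSUMPTION: each is one rms above the mean

team_pa = PA_PER_HITTER * LINEUP_SIZE                # 1350 clutch PA
# Independent luck adds in quadrature, so it grows like sqrt(n).
team_sd = SD_PER_HITTER * math.sqrt(LINEUP_SIZE)     # about 17.1
# Talent adds linearly: 0.026 * 150 = 3.9 extra successes per good hitter.
edge_per_hitter = TALENT_RMS * PA_PER_HITTER
team_edge = ABOVE_AVG_HITTERS * edge_per_hitter      # about 19.5

# Sanity check: the binomial sd for a .329 hitter over 150 PA is roughly 5.75,
# in line with the 5.7 used above.
binomial_sd = math.sqrt(PA_PER_HITTER * 0.329 * (1 - 0.329))

print(team_pa, f"{team_sd:.1f}", f"{team_edge:.1f}", f"{binomial_sd:.2f}")
```

Luck scales with the square root of the number of hitters while talent scales with the number itself, which is why the edge overtakes the noise once you add up a lineup.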
I tend to agree with most of what you've said here, but I don't see any reason we couldn't figure out who the most clutch hitters are. All you have to do is take any metric that you can apply to normal (non-clutch) situations and apply it to clutch situations. Compare this to how the hitter hits during normal situations and there you have it.
There are however a few problems with this approach. First, by the time a player gets enough PAs to declare him a good clutch hitter his career may well be on the downswing. Second is the fact that clutch here is defined relative to the person's normal performance. The result is that a good hitter who is non-clutch is probably still better than a bad player who is clutch. I will take Reggie Jackson over Ruppert Jones any day of the week.
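As a rough sketch of the comparison described above, one option is a plain two-proportion z-score on OBP in clutch versus non-clutch plate appearances. This is just one reasonable choice of test, not anything taken from the article, and the player totals below are invented for illustration.

```python
import math

def clutch_z_score(clutch_on, clutch_pa, other_on, other_pa):
    """Two-proportion z-score for clutch OBP minus non-clutch OBP."""
    p_clutch = clutch_on / clutch_pa
    p_other = other_on / other_pa
    # Pooled rate under the hypothesis that the hitter is the same in both.
    pooled = (clutch_on + other_on) / (clutch_pa + other_pa)
    se = math.sqrt(pooled * (1 - pooled) * (1 / clutch_pa + 1 / other_pa))
    return (p_clutch - p_other) / se

# HYPOTHETICAL career totals: about 7 seasons of 150 clutch PA each.
z = clutch_z_score(clutch_on=380, clutch_pa=1050, other_on=1320, other_pa=4000)
print(f"z = {z:.2f}")
```

Even a 30-point gap over a full career only lands near the usual significance threshold, which illustrates the sample-size problem. With several hundred players being tested at once, a handful will clear that bar by luck alone.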
My position is that those problems are too large to make identifying clutch players useful.
Also, a bigger problem that you didn't mention: different definitions of clutch situations give you different results for who is clutch and who is not. For instance, this year, by THT's clutch, Torii Hunter's clutch value has been around 5 runs. FanGraphs' "clutchiness" has Hunter as below average by 0.69 wins in clutch value this year.
A better way to do things, once you've decided which situations are the most important and how to weight them, would be to look at a player's batted-ball data rather than his results to get an idea of how hard he's hitting the ball in these situations. That will take a little of the "luck" out of the equation in that, if the hitter happens to run into really good defense in his clutch appearances, he won't be punished for it. That would probably give you a better idea of who is likely to be clutch going forward than just looking at the results. (Then again, you'd have trouble convincing the most staunch clutch performance supporters that this is a good approach since all they care about is whether or not the hitter performed.)
This is one of my big problems in general with people talking about "clutch hitting". Most people simply use clutch hitting as a way to add or remove value from a player without any basis whatsoever (see Arod vs. Magglio discussion here). An attempt to quantify clutch somehow is at least a step in the right direction, but nobody will ever be able to agree on exactly what it is.
In poker we have a term for this. It is called results-oriented thinking. People who play a hand poorly and win will be likely to play it that way again in the future, and people who play a hand well but lose will be less likely to play that hand the same way, despite the fact that playing the hand that way will win them more money (or at least lose them less) than playing it in suboptimal fashion. In order to win money at poker, you have to learn how to ignore how much you are winning or losing as you are playing. Instead, you have to focus on the process and make sure that you are making good decisions. This concept seems almost absurd to people who don't understand poker.
Baseball is a lot like poker in this regard. Your goal as a hitter should be to have good plate appearances and smash the shit out of the ball when you hit it. If you can do this regularly, you will become an amazing hitter. A hitter who works into a 3-1 count and then lines into a double play has in a sense had a more successful at bat than one that hits a bloop single on the first pitch. Despite the outcome, the first hitter will be way more successful in the long run if both players continue to hit in a similar manner.
"This concept seems almost absurd to people who don't understand poker."
I can't necessarily say I "understand" poker, but I completely understand this philosophy. In many things, not all good decisions are rewarded and not all bad decisions are punished. If you have information other than your specific outcomes that informs whether or not a decision was bad, you ought to be using that information.
Incidentally, I think that scouts (especially at the lower levels) understand this pretty well and better than many stats folks. (Personally, I think I didn't really even think about this about baseball until post-DIPS.) When they are scouting high schoolers, and they have only a day or two to look at them, they need to judge them on things like their athleticism, swing, etc., because whether they go 3-4 or 1-4 means relatively little in the long run.
Where this can sometimes lead you astray is if you have, say, a hitter who "looks like a ballplayer," is really toolsy, and by all rights should be a good hitter, but after thousands of at bats in the minors just never hits, then you eventually have to defer to the results, or at least be willing to concede that something beyond your grasp is going on. (The Twins made this mistake with Luis Rivas, for instance.)
well, isn't this a difference between Bayesian and Frequentist approaches to the real world?
In poker, all the probabilities are known and fixed a priori, assuming truly random shuffles. What you call "results-oriented thinking" thus is sub-optimal. In life, we don't really know the true distributions for most things, so we are trying to fit new bits of information about the relevant data-generating processes (DGPs) on the fly. The human comparative advantage is in adapting to dynamic circumstances. The species has selected for effective, instinctive updating from experience ("results-oriented thinking") for thousands of years.
and, of course, real players try to take advantage of these instinctive tendencies. Otherwise, it would be impossible to "set up" a batter via a particular sequence of pitches.
"well, isn't this a difference between Bayesian and Frequentist approaches to the real world?"
Sure, but like I mentioned in my initial comment, we know certain things about the distribution. For instance, no one's a .100 (or worse) hitter and no one's a .500 (or better) hitter. Really, we have a lot of information to help us get close to what the probabilities involved are.
Also, there are certain things that you can measure that help you determine things on the baseball field. For instance, I don't need to watch Matthew LeCroy get caught stealing 95% of the time to know that it's a bad idea to let him steal--I can just time him to see that he is mind-numbingly slow.
I don't know, maybe this just means I don't think the Frequentist approach works here, or that I'm really a Bayesian and don't know it, but if I have useful information in my hands, it seems like it'd be a good idea to put it to use.
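One hedged way to "put it to use" is to treat the league talent distribution as a prior and shrink a small-sample OBP toward it. The sketch below assumes a Beta prior matched to the 0.329 average and 0.026 rms quoted from the article. The Beta form, the function names, and the 60-for-150 sample are all choices made here for illustration, not something from the article.

```python
def beta_prior_from_moments(mean, sd):
    """Return (alpha, beta) of a Beta distribution with this mean and sd."""
    total = mean * (1 - mean) / sd ** 2 - 1   # alpha + beta
    return mean * total, (1 - mean) * total

def posterior_mean_obp(on_base, pa, alpha, beta):
    """Posterior mean OBP after observing on_base successes in pa trials."""
    return (alpha + on_base) / (alpha + beta + pa)

# Prior built from the talent distribution quoted from the article.
alpha, beta = beta_prior_from_moments(0.329, 0.026)

# HYPOTHETICAL sample: a hitter reaches base 60 times in 150 clutch PA (.400).
raw_obp = 60 / 150
shrunk_obp = posterior_mean_obp(60, 150, alpha, beta)
print(f"raw {raw_obp:.3f} -> shrunk {shrunk_obp:.3f}")
```

With a prior this tight, 150 PA moves the estimate only part of the way from .329 toward .400, and estimates like .100 or .500 are effectively ruled out without having to say so explicitly.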
Human instinct is best suited for responding when the difference between two options is "relatively obvious". A grandmaster at chess can look at the board and immediately come up with a few candidate moves. A computer, on the other hand, would have to look at a bunch of obscure moves in order to determine that they aren't the best. In terms of baseball, this is like a scout looking at a player, determining whether he has at least some chance of being able to succeed in the majors, and letting him onto a minor league roster.
When it comes to making very fine decisions, the human mind does a very poor job. Without looking at the numbers, it would be essentially impossible to tell whether Jason Kubel or Michael Cuddyer is better. The problem is that our minds will become subject to observational bias. You will almost certainly say that the better player is whoever has played best in the past couple weeks and your mind won't accurately process stuff from a long time ago. When it comes to making these judgments, we must hence defer to whatever data is available.
"I don't know, maybe this just means I don't think the Frequentist approach works here, or that I'm really a Bayesian and don't know it, but if I have useful information in my hands, it seems like it'd be a good idea to put it to use."
My Bayesian buddy likes to say that everyone is Bayesian; it's just that the Frequentists don't know it (or are lying to themselves).
"When it comes to making very fine decisions, the human mind does a very poor job."
Well, that depends on what you mean by "very fine decisions." Consider the Newtonian mechanics of billiards. I find it pretty damned impressive that so many half-drunk 20-somethings can make so many fine-grained decisions so quickly without the aid of calculator, protractor or any other equipment besides eyeball and brain.
The species has had tens of thousands of years (thousands of generations) of selection pressure to produce a set of survivors who are really, really good at making snap judgments when survival (or at least marginal, reproductive fitness, such as winning a game of 8-ball with $20 on the table) is at stake.
I tend to believe that there is probably more evidence in people choking than people stepping up their game...
I tend to believe that too, but it would also stand to reason that a hitter who was not unclutch (clutch neutral, if you will) would then wind up doing better against choking pitchers than when you have a choking hitter vs. choking pitcher matchup. (The classic movable object meets resistible force problem.) So it's kind of unclear to me exactly how that would show up in the data.
apparently, ubelmann doesn't think that Pierson's Puppeteers are secretly breeding for lucky humans.
I dunno about "clutch," but there seems to be a clear difference in outcomes between, for example, plate appearances with RISP and bases empty.
In 2007, the AL OBP with RISP is 356, with bases empty, 325 (with about 22 pts difference accounted for by the difference in IBBs)
In 2007, the NL OBP with RISP is 355, with bases empty, 321 (with about 29 pts difference accounted for by the difference in IBBs)
In 2006, the AL OBP with RISP was 357, with bases empty, 328
In 2006, the NL OBP with RISP was 355, with bases empty, 323
In 2005, the AL OBP with RISP was 348, with bases empty, 319
In 2005, the NL OBP with RISP was 352, with bases empty, 318
A chunk of that persistent difference can be accounted for with IBBs, but not all of it. But is that "clutch" hitting, or a change in pitching/defensive strategy or performance?
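To make the size of the gap explicit, here is a small sketch that tabulates the splits listed above in OBP points. It reads the parenthetical IBB figures as "this many points of the gap are explained by IBBs", which is an interpretation of the comment, not something it states outright.

```python
# (year, league, OBP with RISP, OBP with bases empty, points explained by IBB)
# OBP values are in points (356 means .356), exactly as listed above.
# ASSUMPTION: the IBB column is the part of the gap explained by IBBs.
splits = [
    (2007, "AL", 356, 325, 22),
    (2007, "NL", 355, 321, 29),
    (2006, "AL", 357, 328, None),
    (2006, "NL", 355, 323, None),
    (2005, "AL", 348, 319, None),
    (2005, "NL", 352, 318, None),
]

gaps = []
residuals = {}
for year, league, risp, empty, ibb_points in splits:
    gap = risp - empty
    gaps.append(gap)
    line = f"{year} {league}: RISP minus empty = {gap} points"
    if ibb_points is not None:
        residuals[(year, league)] = gap - ibb_points
        line += f", {gap - ibb_points} left after IBBs"
    print(line)

# gaps -> [31, 34, 29, 32, 29, 34]
# residuals -> {(2007, 'AL'): 9, (2007, 'NL'): 5}
```

Under that reading, the 2007 gap left over after IBBs is only 9 points in the AL and 5 in the NL, which is the part that still needs explaining.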
*sorry about the first draft. I had BAs for 2006 and 2005*
(Crawls out from under rock...)
If anyone asks, I wasn't here. I'm s'posed to be working hard on my actuarial project.
I betcha a lot of that persistence could be because RISP is not independent of pitching ability.
Bad pitcher, or pitcher having bad day implies more RISP and greater OBP against.
Relievers would mitigate that some, but not enough to remove the effect.
hmmm. good point. I'll have to think about testing that....
pitching in the stretch v. full windup?
I would say that's probably the biggest contributor, though you can sort of check that by what the splits are with just a runner on first against all the other situations with runners on base. Also, you're averaging in a slightly strange way. Per plate appearance, the quality of pitcher with the bases empty is probably going to be better than the quality of pitcher with someone on base simply because good pitchers allow fewer baserunners.
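The averaging issue can be shown with a toy example. Assume two pitchers who each allow exactly the same OBP with the bases empty as with runners on, so there is no clutch effect at all. All of the numbers below are hypothetical; the only thing that matters is that the worse pitcher faces a bigger share of his batters with men on.

```python
# HYPOTHETICAL pitchers: "obp" is allowed in EVERY situation, so there is
# no clutch effect by construction. Only the PA mix differs.
pitchers = [
    {"name": "good", "obp": 0.300, "empty": 600, "runners_on": 400},
    {"name": "bad",  "obp": 0.360, "empty": 500, "runners_on": 500},
]

def pooled_obp(pitchers, situation):
    """League-style OBP for one situation: total times on base / total PA."""
    times_on = sum(p["obp"] * p[situation] for p in pitchers)
    pa = sum(p[situation] for p in pitchers)
    return times_on / pa

obp_empty = pooled_obp(pitchers, "empty")
obp_on = pooled_obp(pitchers, "runners_on")
print(f"bases empty {obp_empty:.3f}, runners on {obp_on:.3f}")
```

The pooled runners-on OBP comes out higher even though no individual pitcher or hitter changed at all, purely because the bad pitcher is over-represented in the runners-on sample.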
"Also, you're averaging in a slightly strange way."
Which is what AMR was saying. Sheesh.
"pitching in the stretch v. full windup?"
I think that would fall under the rubric of "a change in pitching/defensive strategy or performance"
which was really a key part of my point. You can't draw conclusions about "clutchness" without accounting for both sides of the equation (hitters and pitchers/defense).
Personally, my inclination is that pitchers have most control over their clutchiness. As the ones who initiate the action, it seems intuitive to me that they have the most control over the situation. Mainly, they control which pitch to throw, how hard to throw it, and where it is going. For instance, if Santana has a 10-run, okay, a 6-run, err, a 4-run lead, then he can shelve the slider (which probably doesn't go for a strike as often) and go with more fastballs, which will probably lead to more balls in play but will generally be less effort for him. He'll be more likely to give up runs, but with a big lead it becomes more of a goal to "eat innings." With the bases loaded with one out in a tie game, he's going to use his entire arsenal to deal with the hitters, and if it takes him 10 pitches, it takes him 10 pitches.
In which case, really, we're barking up the wrong tree. We should be looking to find which pitchers are most clutch so that we can adjust the batters' numbers accordingly. (Rather than assuming all of the batters are facing the same cross section of pitchers in their clutch appearances--something that tends to be true in the long run, but is usually less true than we would like it to be.)
"If I have 5 above average hitters and 4 average hitters, then first of all, I'm not Terry Ryan,"
WaaaaAA!
+19.5. LOL.
Or if I have them, I'm not playing them in favor of some scrub who gets his uni dirty sliding into first.
-Ron Gardenhire
At the tail end of Sunday Sports talk with Patrick and your boy TJ, a couple interesting nuggets regarding Casilla (and Punto):
a.) Patrick seemed to intimate that factions of the organization feel that part of Casilla's mental problems at 2nd base are a result of him being a more natural shortstop and he's not developed enough "feel for the position" at the 'major' league level. I did a little bit of research on this and it does appear that he has moved around the infield some. He also said something about Casilla needing one more full season at Rochester.
b.) On why Punto is playing so much...it's all about the contract and knowing that he will be here in '08 and is probably the odds-on favorite to be your starting 2nd baseman, although there will be a 3-way competition between Casilla, Punto, and Tolbert. He also said something to the effect that they would have called up Tolbert before Casilla to replace Castillo, but it coincided with him (Tolbert) going 'into the tank'.
c.) He repeated some often-forgotten Terry Ryan declarations from the "Butch Huskey" era where Ryan went on record saying something like - 'we're not going to do that anymore'. Obviously some self-directed advice which he has failed to follow.
it's "all about the contract", huh?
*angry torch-bearing mob turns and instead heads off towards TR's castle on the hill*
I heard the show. It seems that Casilla will be in AAA in 2008 and TJ was more than happy with that idea. They said that he needed someone to coach him not to make the mental mistakes that he's been making. In an ironic twist (lost on both of them, I think), they indicated that he got away with making a bunch of mistakes at AAA this year. (So, how is it that a guy like Bartlett can go down there and learn leadership if there's no real coaching going on?)
Yeah, the specific comment was that Tolbert was playing like Casilla. His non-callup is obviously a punishment for something or other. He needs to learn leadership!
Yeah, Reusse made the point that Ramon Ortiz was a stupid idea.
Although imperfect, it's still among the best local baseball talk by talking heads - even if you hate both guys. Both guys are around the club, have a well-cultivated list of sources and clearly are pretty passionate fans.
Did it seem to anyone that Casilla was a defensive liability?
Casilla didn't blow my socks off, but I think he'll be fine, depending on how much he can cut out the silly mistakes. I think his range was at least as good as Castillo's, he turned the DP pretty well, and he's got a strong arm. If he's still adjusting, I'm willing to keep an open mind. If he makes enough mistakes, though, that could negate his range and arm strength.
Also, I'm not going to be especially upset if Casilla gets sent to Rochester to start next year. His performance there this year wasn't really overwhelming, and it definitely seems like there are things for him to work on, especially if he is uncomfortable at second base. I'll still wish that the Twins were able to get a better placeholder than Punto, but Casilla is different from Bartlett in this regard, since going into, say, 2006, Bartlett had nothing left to prove/learn in AAA.
Was wondering, I keep seeing that Alexi's got an awful batting line, but I seem to remember him being the only successful batter in many games.
Made me think about portfolio theory. Has anyone looked at whether any batters perform in a way that is anti-correlated with the rest of the team? That is, when the rest of the team is struggling, he does better, but then he'll also make three outs over two innings when the rest of the team bats around.
Seems to me that if such players exist, they could be a valuable "asset to hold," giving your team a chance if the other 8 batters are baffled. Games that may have been lost 0-1 with moderately correlated batters may then be won 2-1 or at least go to extra innings.
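A first pass at checking this could be a plain correlation between one batter's game-by-game times on base and the rest of the lineup's. The sketch below uses a hand-rolled Pearson correlation and a made-up six-game log; real game logs would be needed to say anything about an actual player.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# HYPOTHETICAL game log: times on base per game.
batter = [3, 0, 2, 0, 3, 1]
rest_of_team = [4, 12, 5, 11, 3, 9]

r = pearson(batter, rest_of_team)
print(f"correlation with the rest of the lineup: {r:.2f}")
```

A clearly negative `r` over a large number of games would be the "hedge" being described. Over a handful of games it would mostly be noise, and facing the same pitcher tends to push everyone's results in the same direction anyway.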
Man, I'll be happy when I'm done with all this actuarial educationing/exam-ing/project-completing...
I'm guessing that this is a "feature, not a bug" of a Piranha-dominated offense, right, Greek House?
not negatively correlated performances, but rather, there will be plenty o' instances where one guy gets 3-4 hits and everyone is pretty much silent. Just as there will be instances where the hits come in bunches for the team.