Wednesday, August 18, 2010

Confounding the Defensive Metrics

By Bill

As you probably know if you've read my other blog at all, I've been a big fan of the advanced defensive metrics based on play-by-play data. UZR has always been my favorite, though I'm not really sure why, and I pretty freely refer to plus/minus defensive runs saved and Total Zone as well. The idea (oversimplified) sounds awfully good: every batted ball is categorized according to the hit type and the area into which it is hit, and the fielder gets credited (or not) according to whether or not he made the play and whether an average fielder at that position would be expected to make the same play.
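
To make that (oversimplified) idea concrete, here is a minimal sketch in Python. It is not how UZR, DRS, or Total Zone is actually computed. The bucket names, the league out rates, and the flat run value per play are all made-up assumptions for illustration.

```python
# Hypothetical league-wide out rates for one position, keyed by
# (hit_type, zone). Buckets and rates are invented for illustration.
LEAGUE_OUT_RATE = {
    ("grounder", "hole"): 0.30,
    ("grounder", "at_him"): 0.95,
    ("liner", "up_middle"): 0.10,
}

# Assumed flat run value of turning one would-be hit into an out.
RUNS_PER_PLAY = 0.75


def plays_above_average(batted_balls):
    """Sum (actual - expected) outs over every ball hit into the fielder's area.

    Each batted ball is a (hit_type, zone, made_play) tuple. Making a play an
    average fielder rarely makes earns a big credit; missing a routine one
    costs a big debit.
    """
    total = 0.0
    for hit_type, zone, made_play in batted_balls:
        expected = LEAGUE_OUT_RATE[(hit_type, zone)]
        total += (1.0 if made_play else 0.0) - expected
    return total


def runs_saved(batted_balls):
    """Convert plays above average into runs with the assumed flat value."""
    return plays_above_average(batted_balls) * RUNS_PER_PLAY


sample = [
    ("grounder", "hole", True),     # tough play made:    +0.70
    ("grounder", "at_him", True),   # routine play made:  +0.05
    ("grounder", "at_him", False),  # routine play missed: -0.95
    ("liner", "up_middle", False),  # tough play missed:  -0.10
]

if __name__ == "__main__":
    print(round(plays_above_average(sample), 2))
    print(round(runs_saved(sample), 3))
```

Even in this toy version, the answer depends on how the buckets are drawn and what the league rates are taken to be, which is one place where systems built on similar data can drift apart.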

I'm still a fan. I think. The thing is that, since all three of those statistics use basically the same data and appear to use similar methods, you'd think we'd see a lot of agreement among them. And my impression (from staring at these numbers obsessively for a couple years) is that they generally do agree; there might be differences in how the actual runs saved are calculated, because that's just never likely to be an exact science, but generally, a good fielder does well under all three metrics, and a poor fielder ends up looking bad in all three.

This season, it seems to me, there are a lot more cases of crazily broad variance among the three methods. Maybe relatedly, two excellent pieces have been written recently documenting some pretty serious problems with current defensive metrics: one by Colin Wyers of Baseball Prospectus and the other by Tim Marchman of SI. Rather than just kind of rehash what they've said, I thought it would be fun to look at some of the most perplexing cases in 2010 (numbers are listed in the order UZR, DRS, TZ, where DRS is the plus-minus runs saved figure):
  • Starlin Castro (-0.5, +5, -12). I made brief mention of this one the other day. By UZR, Castro is the definition of an average shortstop; by plus-minus, he's awfully good, saving five runs in just 87 games; and by Total Zone, in those same 87 games, he's the worst shortstop in the league. So which is it? He's made a ton of errors, but he's seemed to be able to get to just about everything the few times I've watched him, so I'd lean toward "average" (at least). The frustrating thing is that we'll have to wait at least a couple more years to really know (though scouting reports from you are welcome).

  • Carl Crawford (+22.1, +12, +7). UZR says that the perennial Gold Glove snub has saved more runs, ignoring positional adjustments, than any other player in the league, and by more than 50% over the #2 guy, Tony Gwynn Jr. By plus-minus, he's at a slightly less impressive 14th among all positions. By TZ, he's at an almost pedestrian +7, just sixth among big league left fielders. Have to figure that one is a blip in Total Zone's formula.

  • Ryan Braun (-14.2, -1, -2). Just a tick below average by the other two systems, but UZR sees him as the fourth worst defensive player in baseball relative to his position (only Carlos Quentin, Matt Kemp and Andre Ethier have cost their teams more runs). Plus-minus and UZR were about as down on him last year as UZR is this year, but he's an excellent athlete with a strong arm, so who knows? This is a frustrating one.

  • Adam Dunn (-0.8, -7, -2). Actually, each of these numbers represents an improvement for Dunn, a famously awful fielder -- under any metric and regardless of whether he's playing first or left -- for the past five years. But while UZR and TZ agree with the fans that he's noticeably improved his defense (almost all the way up to average!), he's still the third-worst 1B in the majors by plus-minus.

  • Jason Kubel (-6.8, -6, +1). Late add because I can't believe I missed him. Kubel started the season as the Twins' primary DH, so he has played less than half a season's worth of innings in the outfield. Given that, those UZR and plus-minus numbers are close to bottom-of-the-league territory (-17.1 UZR/150). And, as one who has watched the vast majority of those innings, this is where my brain puts him; Kubel is quite slow, appears to have a poor reaction time, makes poor decisions, and doesn't have much of an arm. He's made some sparkling plays lately -- including a sprawling, diving-backward catch last night that saved at least a double -- that have convinced a lot of Twins fans that he's good out there, but it seemed to me that on most of them (including last night), a better and faster outfielder, like Jason Repko, would have gotten there in plenty of time without diving. So, it's odd to me that Total Zone has him a little above average. There's got to be something off with that one.

And, last and the opposite of least:
  • Cleveland Indians (-37.2, +66, +7). This is what got me thinking about this. The other day, plus-minus mastermind John Dewan tweeted: "Did you know the Indians lead MLB with 58 Defensive Runs Saved?" That floored me, and apparently it would floor the other metrics, too. By UZR, Cleveland is second from last in MLB, just two and a half runs ahead of the Orioles. And then by Total Zone, fittingly, they're smack in the middle, at 15th of 30. So are they great, awful, or average? Who can say? For whatever it's worth, the biggest differences seem to be on Trevor Crowe and Asdrubal Cabrera (who looks to the observer a lot more like a +4 than a -4 out there, but that's neither here nor there).

It's worth noting that it's become clear as I've gone through this that there is still a lot of general correlation among the various systems; if a player has a high UZR, he likely has a pretty high DRS and TZ, too. But what can we do with all this when three systems approach a problem more or less the same way and come up with answers that can be 15 or more runs apart for a single player (in an area where 15 runs saved in a year is usually enough to lead the league at most positions)?
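
For a rough sense of how big those gaps are, here is a small Python sketch that takes the six (UZR, DRS, TZ) lines exactly as quoted above and computes the distance between the most and least generous system for each. The names `CASES`, `spread`, and `spreads` are just labels for this example.

```python
# (UZR, DRS, TZ) as quoted in the list above; nothing else is assumed.
CASES = {
    "Starlin Castro": (-0.5, 5, -12),
    "Carl Crawford": (22.1, 12, 7),
    "Ryan Braun": (-14.2, -1, -2),
    "Adam Dunn": (-0.8, -7, -2),
    "Jason Kubel": (-6.8, -6, 1),
    "Cleveland Indians": (-37.2, 66, 7),
}


def spread(values):
    """Gap in runs between the highest and lowest of the three systems."""
    return round(max(values) - min(values), 1)


def spreads(cases):
    """Map each name to its spread across the three systems."""
    return {name: spread(vals) for name, vals in cases.items()}


if __name__ == "__main__":
    # Print the cases from widest disagreement to narrowest.
    for name, gap in sorted(spreads(CASES).items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {gap:6.1f}")
```

On these numbers, Castro's three ratings are 17 runs apart and Crawford's about 15, while the team-level Indians gap is a little over 100. A max-minus-min spread is a blunt tool, but it is a quick check on whether one system is far out of line with the other two.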

I think that until someone gets a better handle on this, you'd better just keep looking at all three. Before drawing a conclusion based on your favorite metric, you'd better make sure it's not wildly out of line with the others (or with past years', in some cases). Put differently, while I don't think I agree with all of the conclusions in Marchman's article, this line seems to get it just about right: "the point is to take these measures as a flashlight in a dark room. There isn't enough light in the room to tell if an object is seven or nine feet high, but there is enough to tell it's tall."


James K. said...

Looking at DRS in season seems worthless.

Add up UZR across MLB and the sum is -0.8 (about where you'd expect a stat that is measured against average to be).

Add up DRS across MLB and the sum is +416 (what?). Apparently adjustments need to be made, possibly after the season, to DRS before it is useful as a stat.

BenJ said...

Defensive Runs Saved admittedly doesn't add up to zero, but that doesn't make it worthless. -10 Runs Saved in CF isn't quite the same as a -10 UZR, and it's not quite the same as a -10 at third base, either. Comparing teams or players within a position is perfectly valid.

Also, the large Indians discrepancy is mostly due to their pitchers. Defensive Runs Saved has pitching and catching defensive components, unlike some other defensive stats. As it turns out, the Indians pitching staff has been fantastic at fielding their position, according to Plus/Minus Runs Saved.

Zach Sanders said...

It all comes down to sample size. If you took the players you listed above and got a 3-year sample for each, would the results be more uniform? I bet they would be.

Bill said...

Well, probably, yeah. But that was part of Marchman's point. A sample that requires three years to be meaningful is of pretty limited usefulness. By the time three years pass, the player will often be a different one than he is now anyway.