I don’t want to dwell on this too much right now, because it will take a lot more work to tease out more meaningful results. Lots of work has been done elsewhere (this is a good place to start) that is far more comprehensive than what I’m taking on here.
But there is a lot of talk about FIP, xFIP, tERA and SIERA, all of which take a look at component pitching stats and via their formulas come up with an estimate of how well a pitcher actually pitched, without the obscuring cloud of good and bad luck. The implication is that these numbers are a better measure of a pitcher’s ability, which seems to be borne out by SIERA’s effectiveness at projecting future performance. This would seem to mean that they’re leading indicators.
The claim, in other words, is that they know more than ERA alone does.
There is a lot of work out there demonstrating this, and I’m not trying to impugn it. They may be right, they probably are right, but my experience is a little different. When I look at those numbers I want them to speak to me on some level, to present a better reality of what actually happened, and to point at what will happen next. I’m not sure how much they do that.
Maybe it’s because there are all these different measures that do different things, and I’m less and less confident accepting any of them as useful. So I thought it would be a good time to look at some collected stats to see if they told us anything.
This chart is a rough thing indeed. I took all the qualified 2012 pitchers (those with 162 IP or more). There were 87 of them. I broke them down into quintiles of 17 apiece, ranked from the largest difference between the AVG component ERA and the actual 2012 ERA. The middle quintile took the remainder and had 19. And then I used the broadest of measures. I averaged.
AVG is the average of the four component stats (FIP, xFIP, tERA and SIERA) for 2012. DIFF is AVG minus the 2012 ERA.
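For anyone who wants to redo the grouping, here is a rough sketch of how AVG, DIFF and the five slices could be computed. It is not the spreadsheet I actually used. The field names (`fip`, `xfip`, `tera`, `siera`, `era`) and both function names are made up for illustration, and sorting with the largest DIFF first is an assumption about the ranking direction.

```python
from statistics import mean

# The four component stats; the key names are hypothetical.
COMPONENTS = ("fip", "xfip", "tera", "siera")


def add_avg_and_diff(pitcher):
    """Return a copy of the pitcher record with AVG and DIFF added."""
    avg = mean(pitcher[c] for c in COMPONENTS)
    # DIFF is AVG minus the actual ERA for the same season.
    return {**pitcher, "avg": avg, "diff": avg - pitcher["era"]}


def split_into_groups(pitchers, n_groups=5):
    """Sort by DIFF (largest first, an assumption) and cut into slices.

    Every slice gets len(pitchers) // n_groups pitchers, and the middle
    slice also takes the remainder (87 pitchers -> 17, 17, 19, 17, 17).
    """
    ranked = sorted(pitchers, key=lambda p: p["diff"], reverse=True)
    base, extra = divmod(len(ranked), n_groups)
    sizes = [base] * n_groups
    sizes[n_groups // 2] += extra  # middle slice absorbs the leftovers
    groups, start = [], 0
    for size in sizes:
        groups.append(ranked[start:start + size])
        start += size
    return groups
```

Feeding it 87 records that have already been through `add_avg_and_diff` gives back five lists sized 17, 17, 19, 17 and 17, and averaging any column within each list is then one `mean` call per slice.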
One way to look at this is to say that each DIFF moved in the right direction. The worst 2012 pitchers got better, the best 2012 pitchers by this discrepancy got worse, and the pitchers in between moved roughly in proportion to what would be expected. But, really, did any of the components do a good job of predicting?
Not really. And most interesting is that no matter the quality of the performance in 2012 in ERA, in 2013 the five slices were pretty darn similar, which I think indicates more of the Plexiglas effect than any predictive powers for the component ERAs.
One thing I noticed is that tERA is always worse than the others. Take that out and the averaged components do a better job.
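One way to check a claim like that is to compare how far the averaged components land from the following year’s ERA, with and without tERA in the mix. The sketch below uses mean absolute error, which is my choice of yardstick here and only one of several reasonable ones. The field names and the two sample rows are invented purely so the code runs; they are not real pitcher data.

```python
from statistics import mean


def component_average(pitcher, components):
    """Average the named component stats for one pitcher."""
    return mean(pitcher[c] for c in components)


def mean_abs_error(pitchers, components, target="era_next"):
    """Average absolute gap between the component average and next-year ERA."""
    return mean(
        abs(component_average(p, components) - p[target]) for p in pitchers
    )


# Made-up placeholder rows, not real pitcher data.
sample = [
    {"fip": 3.50, "xfip": 3.60, "tera": 4.40, "siera": 3.55, "era_next": 3.60},
    {"fip": 4.10, "xfip": 4.00, "tera": 4.90, "siera": 4.05, "era_next": 4.00},
]

with_tera = mean_abs_error(sample, ("fip", "xfip", "tera", "siera"))
without_tera = mean_abs_error(sample, ("fip", "xfip", "siera"))
print(f"with tERA: {with_tera:.3f}  without tERA: {without_tera:.3f}")
```

Run against the real 87 pitchers, a smaller number for the no-tERA average would back up the impression above. A larger one would mean I was seeing things.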
The question here is whether the component stats are actually doing a better job of valuing a player’s talents or simply regressing to the mean in advance. Since these pitchers were sorted by the difference between their Average Component for 2012 and their actual 2012 stats, the way that the poorest and best performers by that measure moved toward the middle suggests a way to predict them, regardless of the reason why.