Be Wary of xFIP


A couple of weeks ago, I wrote a story for Beyond the Box Score entitled “Be Wary of WAR: A Cautionary Tale”. That article had nothing to do with conflicting countries and young men giving their lives for a cause; it instead had to do with the popular all-encompassing baseball statistic, Wins Above Replacement (WAR). Before you read all 2300 words that I wrote about WAR, I’ll just state my main point: I love WAR, the statistic, but it is in no way perfect.

When evaluating pitchers, I treat the statistic xFIP in a similar fashion. I think that xFIP is one of the best statistics, if not the best, for evaluating pitching performance and predicting future ERAs. For those who have never heard of xFIP or need a refresher, here’s the description of the statistic from the Sabermetric library:

"Expected Fielding Independent Pitching (xFIP) is a regressed version of FIP, developed by Dave Studeman from The Hardball Times. It’s calculated in the same way as FIP, except it replaces a pitcher’s home run total with an estimate of how many home runs they should have allowed. This estimate is calculated by taking the league-average home run to fly ball rate (~9-10% depending on the year) and multiplying it by a pitcher’s fly ball rate."
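To make the description above concrete, here is a minimal Python sketch of FIP and xFIP using the commonly published form of the formulas. The additive constant (3.10), the 10% league HR/FB rate, and the stat line are illustrative assumptions, not figures from this article; the real constant changes every season. Note that the sketch multiplies the league rate by a count of fly balls allowed, which is how the formula is usually written out.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from home runs, walks, hit batters, and strikeouts.

    `constant` is an assumed league scaling constant; it varies by season.
    `ip` is innings pitched as a true decimal (e.g. 100 1/3 innings -> 100.333).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, bb, hbp, k, ip, lg_hr_per_fb=0.10, constant=3.10):
    """xFIP: same as FIP, but actual home runs are replaced by expected home runs.

    Expected home runs = fly balls allowed * league-average HR/FB rate.
    `lg_hr_per_fb` is an assumed value in the ~9-10% range.
    """
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher: 100 IP, 15 HR, 30 BB, 3 HBP, 90 K, 100 fly balls allowed.
actual = fip(hr=15, bb=30, hbp=3, k=90, ip=100)
expected = xfip(fb=100, bb=30, hbp=3, k=90, ip=100)
print(f"FIP: {actual:.2f}, xFIP: {expected:.2f}")
```

This made-up pitcher allowed 15 home runs where a league-average HR/FB rate would predict 10, so his xFIP comes out lower than his FIP. Everything else in the two formulas is identical.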

I think xFIP does a great job of removing nearly all of the luck involved in a pitcher’s performance from the equation. But while xFIP removes a ton of luck, it can also remove aspects of a pitcher’s performance that have nothing to do with luck: real weaknesses that hurt his ERA while leaving his xFIP unharmed. Thus, I’ll analyze three pitchers who have high ERAs despite low xFIPs, to show that xFIP, or any other advanced pitching metric, should not be the beginning and end of any discussion of pitching performance.

Jeff Samardzija (3.64 xFIP, 5.05 ERA):

The Cubs’ starter has a gap of nearly a run and a half between his xFIP and ERA. There are four possible reasons for this gap: Samardzija has been unlucky on balls in play, bad on balls in play, unlucky with home runs, or bad with home runs. Jeff has a below-average home run rate, so that eliminates the last two options. Before I get into the BABIP against Samardzija, I would just like to point out that there is a 14-point (0.14 run) gap between Samardzija’s SIERA (3.78) and his xFIP, which usually means his xFIP is not very trustworthy.

Many people believe, or only know enough to assume, that BABIP is an all-luck statistic. So when they see that Samardzija has allowed a BABIP (.327) that is 35 points above league average, they file the right-hander’s ERA under the overused “luck” label. I, on the other hand, think his BABIP has been right around his true talent level, because Samardzija has been hit extremely hard. His high line drive rate and low infield fly ball percentage yield a higher-than-average BABIP, and in turn an ERA higher than his xFIP.
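For readers less familiar with BABIP, here is a short sketch of the standard formula for a pitcher: hits that stayed in the park divided by at-bats that ended with a ball in play. The stat line is hypothetical, chosen only so the result lands near the .327 figure discussed above; it is not Samardzija’s actual line.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play allowed.

    Home runs and strikeouts are removed because the ball never reaches the
    fielders; sacrifice flies are added back because they are balls in play
    that do not count as at-bats.
    """
    return (h - hr) / (ab - k - hr + sf)


# Hypothetical line: 110 hits, 10 home runs, 400 at-bats, 90 strikeouts, 6 sac flies.
print(f"BABIP: {babip(h=110, hr=10, ab=400, k=90, sf=6):.3f}")
```

Nothing in the formula separates a bloop single from a rocket into the gap, which is why batted-ball data such as line drive rate and IFFB% are needed to judge whether a high BABIP is luck or hard contact.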

Jake Arrieta (3.72 xFIP, 5.81 ERA):

The Orioles’ starter has a gap of over two runs between his xFIP and ERA. His home run rate is slightly above average, but that high rate only accounts for the roughly 0.3-run difference between his xFIP and FIP (4.01). The remaining difference comes from a high BABIP and a low Left On-Base Percentage (LOB%).

Arrieta has given up a .321 BABIP, but like Samardzija he has given up a ton of line drives. Almost a quarter of the balls put in play against Arrieta (24.3%) are line drives, which result in a hit almost 50% of the time. That line drive rate is terrible (7th highest in baseball). Strand rate (LOB%), like BABIP, is a statistic that does have a high luck factor. However, in the case of Arrieta, who has the lowest LOB% (60.4%) among starters, there seems to be something else going on. A LOB% that low suggests to me that Arrieta struggles to get hitters out when pitching from the stretch; thus, his LOB% is more about skill than it is about luck.
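The arithmetic behind that split can be sketched in a few lines. The three inputs are Arrieta’s ERA, FIP, and xFIP as quoted in this article; the function name and the labels on the pieces are my own illustrative choices.

```python
def decompose_gap(era, fip, xfip):
    """Split the ERA-minus-xFIP gap into a home run piece and everything else.

    FIP and xFIP differ only in how home runs are counted, so FIP - xFIP is
    the part of the gap tied to home run rate. ERA - FIP is the remainder
    (balls in play, sequencing, stranding runners).
    """
    return {
        "total_gap": round(era - xfip, 2),
        "home_runs": round(fip - xfip, 2),
        "everything_else": round(era - fip, 2),
    }


# Arrieta, using the numbers quoted in this article.
for label, runs in decompose_gap(era=5.81, fip=4.01, xfip=3.72).items():
    print(f"{label}: {runs:.2f}")
```

Of the 2.09-run gap, only 0.29 runs trace back to home runs; the other 1.80 runs are what BABIP and LOB% have to explain.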
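As a reference point for the strand rate discussion, here is a sketch of the LOB% formula as it is commonly published: baserunners who did not score, divided by baserunners, with an adjustment for home runs. The stat line is hypothetical and was picked only to produce a strand rate in the low 60s like the ones discussed in this article.

```python
def lob_pct(h, bb, hbp, r, hr):
    """Left On-Base Percentage (strand rate).

    Numerator: baserunners allowed minus runs scored.
    Denominator: baserunners allowed minus 1.4 * HR, the standard adjustment
    for hitters who score themselves on a home run.
    """
    baserunners = h + bb + hbp
    return (baserunners - r) / (baserunners - 1.4 * hr)


# Hypothetical line: 100 hits, 40 walks, 4 hit batters, 65 runs, 12 home runs.
print(f"LOB%: {lob_pct(h=100, bb=40, hbp=4, r=65, hr=12):.1%}")
```

The formula only counts who scored; it cannot tell whether the runs came from bad sequencing luck or from worse pitching with men on base, which is exactly the distinction at issue here.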

Tim Lincecum (3.70 xFIP, 5.60 ERA):

The two-time Cy Young Award winner’s struggles are talked about almost as much as RA Dickey’s incredible first half. Lincecum has given up a ton of runs, but his xFIP suggests he has not been nearly as bad as his 5.60 ERA indicates. The problem for the Giants’ starter is that his struggles have not been the result of bad luck; instead, they stem from the same issues that have hurt Arrieta and Samardzija in the first half.

Lincecum has a high BABIP (.319), but he also has the 2nd highest LD-rate (24.8%) and 2nd lowest IFFB% (2.8%) in baseball, which suggests that his BABIP may be due for some regression… upwards, not down. Many experts who have attempted to analyze what exactly is wrong with Timmy have pointed to him leaving the ball up and getting hit hard when pitching out of the stretch. His LOB% (62.2%) is the third lowest in baseball, and suggests that the experts are right in concluding that Tim has been bad out of the stretch.

Unfortunately, I think Giants fans who were hoping that Lincecum was much better than his ERA, as his xFIP suggests, are going to continue to be disappointed in their ace’s (can we still call him that?) performances unless he can figure out exactly what’s going on when pitching with runners on base.

Advanced DIPS (Defense Independent Pitching Statistics) are great. I think in almost all cases they work better and are more important than the traditional ERA statistic. However, that does not mean that ERA should ever be ignored. DIPS metrics should be used for their predictive value and as a starting or comparison point for evaluating any pitcher; they should not be used by themselves as the only evidence for a conclusion about a particular pitcher.

All statistics courtesy of Fangraphs

You can follow Glenn on Twitter @Baseballs_Econ or check out his latest at Beyond the Box Score