Baseball is unlike any other sport in the way it has embraced and been changed by statistics. While many of the “old” stats are still the most familiar, sabermetrics are becoming an everyday part of mainstream baseball conversations. When you don’t understand them or don’t know where to find them, you can feel a little left out.
GeekSided has your back! Here’s our guide to finding the most advanced baseball analytics on the web. We’re going to structure this like a baseball diamond; you’ll take your first steps by hitting a single and going to first base. You’ll take a step up from there in your sabermetric acumen by hitting a double and strolling to second. Next, smack a triple and really get your nerd on at third base.
We’ll cap it off with a couple bonus sites, taking you back to home plate, for the round tripper in true home run style. We’ll check out the on-deck and in-the-hole hitters, too.
First Base: FanGraphs.com
FanGraphs is every sabermetrician’s gateway drug. There’s no place more comprehensive, accessible, or committed to getting things right than FG. They know how to mix conventional with advanced stats to give you all the information you need and then some. Let’s take a look at a standard player page:
If I want to learn about a player, his page on FanGraphs is the first place I stop. This gives you, at a glance, the essentials of both standard and modern advanced stats mixed together in a way that makes sense. I’ll give you a rundown of what you’re looking at.
On the left side, to the right of the season/team, you have the counting stats: plate appearances, homers, RBI, the basics. Even the most seasoned stat-head wants to know these, even if they summarily cast them off as useless thereafter. Going right, you get the core batters’ peripheral statistics: walk and strikeout percentages (out of total plate appearances), isolated power (slugging minus batting average), and batting average on balls in play. These give you a good idea of a guy’s plate discipline, power, and luck.
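If it helps to see the arithmetic, here is a minimal sketch of how those four peripherals fall out of the counting stats. The function name and the stat line are made up for illustration, and the BABIP formula is the commonly cited one, (H - HR) / (AB - K - HR + SF), not something spelled out on the player page itself.

```python
def batter_peripherals(pa, ab, h, hr, bb, so, sf, total_bases):
    """Return BB%, K%, ISO and BABIP from raw counting stats."""
    avg = h / ab            # batting average
    slg = total_bases / ab  # slugging percentage
    return {
        "bb_pct": bb / pa,  # walks out of total plate appearances
        "k_pct": so / pa,   # strikeouts out of total plate appearances
        "iso": slg - avg,   # isolated power: slugging minus batting average
        # Hits that stayed in the park, over at-bats that put a ball in play
        # (commonly cited definition; sacrifice flies count as balls in play).
        "babip": (h - hr) / (ab - so - hr + sf),
    }

# Hypothetical stat line, invented for illustration only.
line = batter_peripherals(pa=600, ab=540, h=150, hr=30, bb=50, so=120,
                          sf=5, total_bases=280)
for name, value in line.items():
    print(f"{name}: {value:.3f}")
```

Note that the walk and strikeout rates use plate appearances as the denominator, while ISO and BABIP are built on at-bats.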
Next are the rate stats, which include your conventional “slash line”: average/on base/slugging. After that, you’ll see weighted on base average (wOBA) and weighted runs created plus (wRC+). These stats are intimately related. wOBA is meant to summarize a player’s offense in a single number that you’d interpret much like on base percentage. wRC+ is derived from wOBA, but puts it on a scale where 100 is an average player and each point above or below 100 represents one percent more or less run production than the average offensive player (Abreu has been 41% more productive than the average major leaguer, then). Once you get into the groove, wRC+ will be the single most important stat you’ll look for to interpret offense.
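Reading wRC+ really is that simple. As a tiny sketch (the helper names are mine, not FanGraphs’), the stat translates to plain English like this:

```python
def wrc_plus_vs_average(wrc_plus):
    """Percent more (positive) or less (negative) run production than average."""
    return wrc_plus - 100

def describe(wrc_plus):
    """Turn a wRC+ value into a plain-English comparison to league average."""
    diff = wrc_plus_vs_average(wrc_plus)
    if diff == 0:
        return "league average"
    direction = "more" if diff > 0 else "less"
    return f"{abs(diff)}% {direction} productive than average"

print(describe(141))  # 41% more productive than average
print(describe(85))   # 15% less productive than average
```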
Next, you see BsR, Off, Def, and WAR. WAR is wins above replacement, a statistic that is really catching on. FanGraphs’ calculation of WAR is held in higher regard than those of other sites, like Baseball-Reference. To the left of WAR are its three components, expressed in runs above average. I want to emphasize that the baserunning, offensive, and defensive components are runs above average, while the final WAR calculation is above “replacement,” which is around 18 runs worse than average. A win equates to about 10 runs.
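To make the average-versus-replacement distinction concrete, here is a back-of-the-envelope sketch using only the two round numbers above. It is a simplification, not FanGraphs’ actual WAR calculation. The function name is hypothetical, and I am assuming the 18-run gap applies to a full season of playing time.

```python
RUNS_PER_WIN = 10.0          # round figure from the text
REPLACEMENT_GAP_RUNS = 18.0  # replacement is ~18 runs worse than average
                             # (assumed here to mean over a full season)

def rough_war(total_runs_above_average):
    """Convert a player's total runs above AVERAGE into wins above REPLACEMENT."""
    runs_above_replacement = total_runs_above_average + REPLACEMENT_GAP_RUNS
    return runs_above_replacement / RUNS_PER_WIN

print(rough_war(0))    # an exactly average regular is still worth about 1.8 wins
print(rough_war(22))   # 22 runs above average works out to about 4.0 wins
print(rough_war(-18))  # a replacement-level player comes out at 0.0
```

The takeaway is that an average player is not a zero-WAR player; he sits well above replacement level.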
One last thing that you probably found distracting and confusing is the set of four green-shaded rows labeled ZiPS (R) and (U) and Steamer (R) and (U). These are projections. ZiPS is a well-known projection system produced by noted sabermetrician Dan Szymborski. Steamer is another projection system that FanGraphs carries. The (R) refers to the projection for the rest of the season, while the (U) is the “updated” line: what the player has already done this year combined with what the system forecasts for the rest of the year.
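For counting stats, the relationship between the two rows is just addition. Here is a minimal sketch of the idea; the numbers and names are invented, and I am not claiming this is how ZiPS or Steamer build their lines internally.

```python
def updated_line(to_date, rest_of_season):
    """(U) row for counting stats: season to date plus the (R) projection."""
    return {stat: to_date[stat] + rest_of_season[stat] for stat in to_date}

# Hypothetical numbers, invented for illustration only.
to_date = {"ab": 300, "h": 90, "hr": 20}
rest_of_season = {"ab": 250, "h": 70, "hr": 14}

updated = updated_line(to_date, rest_of_season)
print(updated)  # {'ab': 550, 'h': 160, 'hr': 34}

# Rate stats have to be rebuilt from the combined totals, not averaged.
updated_avg = updated["h"] / updated["ab"]
print(f"{updated_avg:.3f}")
```

Rate stats like batting average should be recomputed from the summed components rather than by averaging the two rates, since the two stretches rarely cover the same number of at-bats.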
Let’s take a quick look at a pitcher page, using Masahiro Tanaka:
This is pretty similar to the batters, just with stats that are more applicable to pitchers. First, counting stats: win-loss record, saves, appearances, innings. Then we have the various peripherals of interest: strikeouts, walks, and homers per nine innings, batting average on balls in play, percentage of baserunners allowed left on base, and home runs as percentage of fly balls allowed.
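Here is a rough sketch of how those pitcher peripherals are computed. The function name and the stat line are made up, and the left-on-base formula is the one commonly attributed to FanGraphs, (H + BB + HBP - R) / (H + BB + HBP - 1.4 × HR), so treat it as my assumption rather than something shown on the page.

```python
def pitcher_peripherals(ip, so, bb, hr, h, hbp, r, fly_balls):
    """Per-nine rates, HR/FB and LOB% from a pitcher's counting stats.

    `ip` is true innings (6 and 1/3 innings is 6.333..., not the
    box-score shorthand 6.1).
    """
    def per_nine(count):
        return 9 * count / ip

    baserunners = h + bb + hbp
    return {
        "k_per_9": per_nine(so),
        "bb_per_9": per_nine(bb),
        "hr_per_9": per_nine(hr),
        "hr_per_fb": hr / fly_balls,  # home runs as a share of fly balls
        # Commonly cited LOB% formula (an assumption, see lead-in).
        "lob_pct": (baserunners - r) / (baserunners - 1.4 * hr),
    }

# Hypothetical stat line, invented for illustration only.
stats = pitcher_peripherals(ip=180, so=200, bb=40, hr=20, h=150,
                            hbp=5, r=65, fly_balls=200)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```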
Next are the pitcher rate stats: the familiar earned run average, plus fielding independent pitching (FIP) and expected fielding independent pitching (xFIP), which is FIP recalculated as if the pitcher had allowed home runs on about 10% of his fly balls. Finally, WAR. You don’t see any run values here for pitchers because calculating WAR is more complicated. Basically, the FIP and innings pitched are converted into a run value, which is then compared to replacement level.
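If you want to see what goes into FIP and xFIP, here is a sketch using the standard published FIP formula. The constant changes every season to put FIP on the ERA scale, so the 3.10 below is only a placeholder assumption, as are the function names and the stat line. The 10% home-run-per-fly-ball rate is the round figure used in this guide; the real league rate moves from year to year.

```python
FIP_CONSTANT = 3.10    # placeholder; the real constant varies by season
LEAGUE_HR_PER_FB = 0.10  # round figure from the text

def fip(hr, bb, hbp, so, ip, constant=FIP_CONSTANT):
    """Fielding independent pitching: only homers, walks, HBP and strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant

def xfip(fly_balls, bb, hbp, so, ip, constant=FIP_CONSTANT,
         league_hr_per_fb=LEAGUE_HR_PER_FB):
    """FIP with actual homers swapped for the homers 'expected' from fly balls."""
    expected_hr = fly_balls * league_hr_per_fb
    return fip(expected_hr, bb, hbp, so, ip, constant)

# Hypothetical stat line, invented for illustration only.
pitcher_fip = fip(hr=20, bb=40, hbp=5, so=200, ip=180)
pitcher_xfip = xfip(fly_balls=200, bb=40, hbp=5, so=200, ip=180)
print(f"FIP:  {pitcher_fip:.2f}")
print(f"xFIP: {pitcher_xfip:.2f}")
```

In this made-up line the pitcher gave up exactly 20 homers on 200 fly balls, a 10% rate, so FIP and xFIP come out the same. A pitcher with a higher HR/FB rate would have an xFIP lower than his FIP, and vice versa.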
The projections retain their meanings.
Finally, FanGraphs is the best place to learn about sabermetrics. There is a very detailed glossary covering all the advanced statistics, including those they don’t display prominently on player pages. It’s easy enough for a beginner to understand and is a useful resource even for those who are well-versed. They also have a Stack Exchange-esque Q&A section that caters to multiple crowds.
Beyond those things, FanGraphs does original reporting and analysis that is consistently of good quality and often finds novel uses for sabermetric and other data.
Second Base: Brooks Baseball
Brooks Baseball is to measurements as FanGraphs is to statistics. What do I mean by that? Brooks is home to all the PITCHf/x data your heart could ever desire. PITCHf/x is the ultra-high-tech pitch tracking system that records every pitch of every ballgame in MLB. Every morning, Brooks updates its expansive database with the previous day’s pitch data.
Brooks is more than a website for information about pitchers; pitch tracking is also incredibly useful for learning about hitters. How much contact does a guy make? Wrong question. How much does he make against curveballs off the outside edge against same-handed pitchers? Now you’re getting the hang of just how interesting and precise the analysis can be at Brooks.
With that in mind, Brooks is much more of an analytics place than it is a site for sabermetrics. Even if you’re not sold on sabermetrics or are still learning, there is much to be learned and explored at Brooks. Let’s take a look at some pitcher data.
Here we can see something interesting about Masahiro Tanaka – he’s throwing more pitches out of the zone than in it! He is especially adept at working the lower portion of the plate.
This one looks different, doesn’t it? We can see that Tanaka stays outside the strike zone for good reason: in those two zones where he throws the most pitches, batters are only hitting .136 and .080 on the balls they make contact with!
I can’t possibly show you all the ways to use Brooks, so I encourage you to explore it if you haven’t been there before. We’ll take one look at a batter’s Brooks data before moving on to third base.
There is one clear message with this graph: Jose Abreu, quit swinging at pitches that are low and outside! He misses on 85% that miss both directions and 58% that are outside and miss low. When you see that Abreu is a high strikeout player and then see this, you’ll know that he is actually fine when it comes to making contact on hittable pitches, but he has displayed poor plate discipline.
Third Base: Baseball Think Factory
This is for when you’ve really gotten enthusiastic about advanced statistics and analytics. Baseball Think Factory is all about original content, often in study format, from some of the brightest minds in sabermetrics. Started by the founders of Baseball-Reference, the website has produced a great deal of meaningful research penned by various authors who either were accomplished already or have since attained jobs throughout MLB.
As opposed to the brute force massive data collection that led to the popularity of Baseball-Reference, the founders wanted BTF to be a place to discuss baseball with real substance, which of course included a lot of sabermetrics. Since its founding, a great deal of very important research and discussion has taken place there. The field has been furthered and many baseball careers have blossomed in its columns.
Among other things, the Ultimate Zone Rating was argued for and established on the pages of BTF. The statistic, conceived by Mitchel Lichtman, is the most state-of-the-art defensive metric available today and makes up the defensive portion of FanGraphs’ WAR. ZiPS, the projection system devised by Dan Szymborski, was also first outlined on BTF.
If you love baseball, learning about statistics, and pushing the bleeding edge of knowledge about the game, Baseball Think Factory remains a must-visit. Compared to another site I considered for this spot, I consider BTF less overwhelming, more accessible to non-experts, and freer.
Home Run: Home Run Tracker
ESPN’s Home Run Tracker may not be for the most serious analytics, but I must confess that I check it fairly regularly. The site’s goal is pretty simple: tell you about all of the home runs hit around baseball.
Their offerings, beyond the standard distance measurement, include detailed info about the batted ball’s speed and trajectory. Additionally, the Home Run Tracker will give you an adjusted, “true” distance that factors in the wind and elements, so you get a number that can be compared with home runs hit in other places and on other dates. It will also tell you how many ballparks a given homer would have left under neutral conditions.
On Deck: Baseball Prospectus
I feel really bad leaving BP out. In terms of web sites and organizations having the biggest impact on the game and on sabermetrics, BP probably takes the cake. I would recommend checking out the wealth of information here and seeing for yourself if you prefer BP to the content on FanGraphs or BTF.
A few things contributed to my leaving BP out of our trip around the bases. There is so much content, especially written content, that it can be overwhelming. While BP is best known for sabermetrics, they have extended their tentacles significantly. Their player info is also presented in a way that I think tends to be confusing, at least compared to FanGraphs (obviously, sabermetrics can always be confusing).
Also, BP feels much more commercial than any other site here. That’s okay, but to get the most out of BP you have to buy their (not outlandishly expensive) subscription. They also will give you the hard sell on their various books as you browse the site. I have a hard time recommending that someone purchase a subscription to BP when they are not yet aware of BP and could probably benefit just as much or more from all of the free content on the web.
In the Hole: Baseball-Reference
For raw baseball data, B-R is basically unmatched. Their historical data reaches further back than anybody else’s, and the various resources for learning about all parts of the game are impressive. Their data on player contracts and team payrolls, for instance, is better than what you’ll find at the other sites here.
While B-R has embraced the sabermetric movement, they still privilege some lesser statistics like OPS+ and have not yet come around to including defense-independent metrics in pitcher WAR or UZR in position player WAR. If you don’t know what all that means, that’s fine; there’s much to learn. It’s this writer’s opinion that you get an inferior sabermetric experience at B-R, though in most other ways, B-R is the best.