Joe And Ted's Excellent Accomplishments: Streaks and the Evolving Sport of Baseball

Summarizing my latest sports piece, and first peer-reviewed publication on baseball. Get it for free here.

Even after nearly a century and a half of major league play, baseball still has no shortage of great questions and puzzles, one of them being the phenomenon of the streak. Just how remarkable was Joe DiMaggio's 56-game hitting streak, and Ted Williams's less celebrated 84 consecutive games reaching base? If we could rerun the past 139 seasons of baseball, how likely would we be to see a streak of that length (or longer) again? And how can we trust that the answer we get back is in any way a reliable one, without the use of a time machine?

Any attempt to answer these questions starts with the supposition that some outcomes would be observed every time, and that the mathematics should be set up in a way that favors their appearance. One possibility is that even if DiMaggio's hitting streak was somehow magical, other, lesser streaks would still be expected. After all, while it might not be all that likely that Ty Cobb, Paul Molitor and Jimmy Rollins would run up hitting streaks of 40, 39 and 38 games respectively as their career bests, it isn't unreasonable to suppose that in these hypothetical do-overs, someone would have done as well in each player's place, and that streaks of roughly these lengths would end up as numbers 6, 7 and 8 on the all-time list.

To test these outcomes, I started with a publicly available database on yearly player outcomes, and assumed that each player would have the same number of games and plate appearances in each year as in the real record. I then produced several plate appearances for each player in each hypothetical game by, essentially, the Strat-O-Matic or "Wheel of Fortune" method - spin a (computer-based) wheel marked with "hit", "walk" and "out", where each wedge on the wheel has as much space as the probability of each event - and noted whether a hit (or a walk) was recorded in the game. (This, of course, assumes that there is no "hot hand" effect that would cause players on a streak to keep performing above their expected ability.) I repeated this many times to get a large number of simulated histories to compare to the real McCoy.
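The wheel-spinning idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the code behind the paper: the function names, the fixed number of plate appearances per game, and the probabilities in the example run are all assumptions made up for the sketch.

```python
import random


def simulate_game(p_hit, p_walk, plate_appearances, rng):
    """Spin the wheel once per plate appearance.

    Returns (got_hit, reached_base) for the game as a whole.
    """
    got_hit = False
    reached_base = False
    for _ in range(plate_appearances):
        spin = rng.random()  # uniform on [0, 1)
        if spin < p_hit:
            got_hit = True
            reached_base = True
        elif spin < p_hit + p_walk:
            reached_base = True
        # otherwise the wheel landed on "out"
    return got_hit, reached_base


def longest_streak(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def simulate_history(p_hit, p_walk, games, pa_per_game, rng):
    """One simulated run of games; returns (hit streak, on-base streak)."""
    results = [simulate_game(p_hit, p_walk, pa_per_game, rng) for _ in range(games)]
    hit_streak = longest_streak(hit for hit, _ in results)
    on_base_streak = longest_streak(on_base for _, on_base in results)
    return hit_streak, on_base_streak


if __name__ == "__main__":
    rng = random.Random(1941)
    # Hypothetical inputs, not taken from the database described above.
    histories = [simulate_history(0.30, 0.10, 150, 4, rng) for _ in range(1000)]
    print("longest hit streak seen:", max(h for h, _ in histories))
    print("longest on-base streak seen:", max(o for _, o in histories))
```

Because every plate appearance is an independent spin with fixed probabilities, the sketch bakes in the same no-"hot hand" assumption as the text.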

For hitting streaks, it turns out that this method does a great job of creating those lesser streaks for 1950 and onwards, but overpredicts how long these high-ranked streaks would be from 1900 until 1950, and is even more wildly high for the corresponding streaks in the 19th century. An easy fix for this is to allow the hit probability to vary from day to day in these early eras - on some days the batter has a higher average than normal, on some days a lower average - and by choosing the right variabilities, those lesser streaks in simulation for each era line up with their "real" counterparts. This produces a highly educated guess about how rarely we would see a streak of 56 games or more: in a little less than 5% of the simulated histories since 1901, a streak at least as long as DiMaggio's would be observed. In my eyes, that's certainly a remarkable record, whatever factors led to it.
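One way to picture the day-to-day variation is to nudge the hit probability before each simulated game. The sketch below jitters it on the log-odds scale with a normal random draw; that particular choice, the `sigma` parameter, the function names, and the example numbers are all assumptions for illustration, since the text above does not say which form of variability was used.

```python
import math
import random


def daily_hit_probability(base_p, sigma, rng):
    """Jitter a per-plate-appearance hit probability on the log-odds scale.

    Assumes 0 < base_p < 1. sigma = 0 returns base_p unchanged (up to rounding).
    """
    log_odds = math.log(base_p / (1.0 - base_p))
    log_odds += rng.gauss(0.0, sigma)  # today's good or bad luck of the draw
    return 1.0 / (1.0 + math.exp(-log_odds))


def longest_hit_streak(base_p, sigma, games, pa_per_game, rng):
    """Longest run of games with at least one hit, with a fresh probability each day."""
    best = current = 0
    for _ in range(games):
        p_today = daily_hit_probability(base_p, sigma, rng)
        got_hit = any(rng.random() < p_today for _ in range(pa_per_game))
        current = current + 1 if got_hit else 0
        best = max(best, current)
    return best


def average_longest_streak(base_p, sigma, games, pa_per_game, n_histories, seed=0):
    """Average the longest streak over many simulated histories."""
    rng = random.Random(seed)
    total = sum(
        longest_hit_streak(base_p, sigma, games, pa_per_game, rng)
        for _ in range(n_histories)
    )
    return total / n_histories


if __name__ == "__main__":
    # Hypothetical settings: compare no variability with two larger amounts.
    for sigma in (0.0, 0.5, 1.0):
        print(sigma, average_longest_streak(0.30, sigma, 150, 4, 500))
```

Tuning `sigma` for each era until the simulated lesser streaks match the historical ones is the calibration step described above; that fitting is not shown here.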

But even this easy tweak opens up a lot of questions. If this extra fluctuation is enough to make the model look like the real world, what real-world factors would produce it? One possibility is that games pre-1940 were called due to darkness more often; another is that while on the road, players would encounter many different types of ballparks; most likely of all, the quality of opposing pitching was far more variable than today, with a few good (or lucky) pitchers far better at stopping streaks than others.

If this explanation is true, then we have a new issue to contend with: in the aggregate, the past 60 years of streaks needed no extra variability in day-to-day hitting to produce a streak list that was comparable to reality. Does that mean that the modern use of relief pitching has created a game where the pitching a player faces is virtually indistinguishable from day to day? Or, more extremely, could it mean that on the whole, any two given major league pitchers are effectively indistinguishable in stopping hits from occurring? That's one philosophy of baseball research, first suggested by Robert "Voros" McCracken a little over a decade ago and still a hot topic for debate, but it also has a taste of Stephen Jay Gould, who pointed out that the variability between hitters has been decreasing over time. While Gould suggested that this would mean the .400 hitter was a thing of the past, the implication here is that this lower variability would encourage longer hitting streaks: without pitchers who are dramatically different from their counterparts, a player on a streak would be less likely to be shut down by an opposing ace today than 100 years ago.

This method isn't nearly as successful for figuring out on-base streaks; in fact, the same machinery that was just used for hits alone grossly overestimates how long history's on-base streaks would last, no matter how much reasonable difference in ability we estimate between pitchers. This certainly implies that opposing pitchers still vary widely in their ability to prevent bases on balls (the flip side of McCracken's argument), but it also leaves us no shortage of alternative explanations to consider as the season begins, and just as much continuing mystery about this era of the game that we are privileged to witness.
