Does Batting Order Matter?

The 2001 Major League Baseball season starts today in Puerto Rico, with the Toronto Blue Jays taking on the Texas Rangers. Toronto's rookie manager, Buck Martinez, has to choose a batting order for the opening game. Much has been made in the Toronto media about the "battle for the 2 slot". The popular consensus seems to be that Martinez has one of three choices for his lineup: one, Homer Bush bats in the second slot (as in the opening day lineup from 2000); two, Alex Gonzalez bats number two; or three, Jose Cruz, Jr. bats in the second slot. Actually, Martinez has far more than three possible lineups. With a 25-man roster he has 741 354 768 000 possible batting orders! Of course, we can assume he is not going to start any pitcher in the field, and this cuts his choices down to a mere 259 459 200 or 79 833 600, depending on whether he has 12 or 13 pitchers on the roster respectively. If we look only at the starters from last year's opening day lineup, who are all projected to be the starters in this year's lineup, Martinez has a much more manageable 362 880 possible lineups. The Jays could play 2240 seasons with these nine starters and never once repeat a batting lineup! So the questions of the day are:

  1. Does it matter what lineup Martinez bats?
  2. If so, how much does it matter?
  3. If it matters, what is the best batting order?
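
The lineup counts quoted in the introduction are ordered selections of 9 batters, so they can be checked with a short Python sketch. It assumes, as the text does, a 25-man roster from which either 12 or 13 pitchers are excluded; the constant names are illustrative only.

```python
from math import factorial, perm

ROSTER_SIZE = 25       # full roster
LINEUP_SLOTS = 9       # batting order positions
GAMES_PER_SEASON = 162

# Any 9 of the 25 players, in order.
print(perm(ROSTER_SIZE, LINEUP_SLOTS))            # 741354768000

# Excluding pitchers: 12 pitchers leave 13 position players, 13 leave 12.
for pitchers in (12, 13):
    print(pitchers, perm(ROSTER_SIZE - pitchers, LINEUP_SLOTS))

# A fixed set of nine starters: 9! orders, and how many seasons that covers.
orders = factorial(LINEUP_SLOTS)
print(orders, orders // GAMES_PER_SEASON)         # 362880 2240
```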

Motivation, Approach, and Background

For Blue Jays fans such as myself, the year 2000 was the year of the homer. The Jays hit a lot of home runs (244). Despite the power, the Jays had more difficulty with run scoring in general, and were below the AL median in runs scored (8th with 861 runs scored). The other reason it was the year of the homer: Homer Bush. Two words that still make me cringe as a Jays fan. Homer Bush put up a terrible OPS of .524 (to put that in perspective, 6 of the 9 Jays starters had a SLG of over .500). Bush was 20.1 runs below replacement level while playing just 47% of the season [1]. So where in your batting order would you want to put, arguably, the least valuable position player in the majors last year? The Jays went with the second slot on opening day, and for most of the time Bush was in the lineup last year. As a fan following the Jays, I feel that this just has to have cost us something, and I want to see if my gut feeling is right or wrong.

I follow the newsgroups rec.sports.baseball and alt.sports.baseball.tor-bluejays and read various sabermetric analyses online, so I knew some research had been done in this space. My feeling was that the status quo consensus (as much as one can ever say such a beast exists) on batting order strategy was that:

  1. Batting order does matter, but the effect is not very large.
  2. The optimum batting order is likely decreasing order of OBA (or at least closer to that than the traditional batting order).
  3. The traditional batting order, while not theoretically optimal, is thought to be pretty close.
But I had not read any of the analysis done to generate these results. I had just heard numerous people refer to the conclusions. Therefore, I decided to build a simulation to test various batting orders and see what I could conclude on my own without being (further) biased by previous work. Only once the simulation was complete did I consult previous work.

My simulation calculates the mean runs per game the Jays score with a given batting order. Each player's year 2000 numbers were used to estimate his probabilities of the various outcomes of a plate appearance. Each lineup tested then got to bat for a series of nine inning games, with the runs scored in each game tracked and summary statistics for each lineup produced. From this we can try to answer the three questions I asked in the introduction. I'll go into much further detail on the simulation, and its limitations, later on, but first, let's review the previous sabermetric analysis.

The first analysis I could find on lineup construction was from 1954! Branch Rickey published an article analyzing baseball in Life magazine [2]. Rickey was quite ahead of his time, noting that "[a]s a statistic, RBIs were not only misleading but dishonest". Nearly half a century later many managers and general managers still worship the misleading and dishonest RBI. About lineup construction Rickey noted that a likely key to success was "a closer grouping in the batting order of the club's high OBA hitters" than most teams had, as this led to better team performance in "clutch" situations, because you had the team's best batsmen up more often in these key situations.

A more modern and more focused approach to lineup construction was taken by Mark Pankin in 1991 [3]. Pankin used a Markov model to analyze 1800 possible lineups for each team in the majors in 1986, and then tried to deduce the key characteristics of each lineup slot by running a regression on the characteristics of the players who produced the best results in that slot. Among the conclusions Pankin draws is that OBA, not speed, should determine your leadoff hitter. Further, in Pankin's calculation the typical manager's lineup was about 0.05 runs per game worse than Pankin's lineups constructed around the regressed qualities (although it varied from 0 to 0.1 runs per game depending on the team). One quite surprising thing was that even though Pankin's ideal lineups and the traditional lineups were quite similar in performance, they were very different in composition.

A later study in 1997, posted by rec.sports.baseball regular Roger Moore, again tackled the question of lineup construction, this time using a simulation and the 1996 LA Dodgers [4]. Moore pitted the Dodgers against themselves. One team had the conventional Dodger lineup, and their opponents had one of three different types of lineups: descending OBA, ascending OBA, and randomly ordered. The conventional lineup fared better than the ascending OBA lineup (a win percentage of 52.006 for the conventional lineup), but slightly worse than the descending OBA lineup (a win percentage of 49.895); the randomly ordered lineup fell in the middle, also worse than the conventional one. The 1986 Dodgers scored 638 runs on the season [5]. So, using the pythagorean rule (with 1.83 as exponent) [6], we can calculate that in Moore's simulation the ascending OBA lineup probably scored about 0.17 runs per game fewer than the actual Dodgers lineup, the random order about 0.07 runs per game fewer, and the descending OBA lineup approximately 0.01 runs per game more.
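
The conversion from Moore's win percentages to runs per game can be sketched by inverting the pythagorean rule. This is only an illustration of the arithmetic: it assumes the conventional lineup scores the 638 runs quoted above over 162 games, and the function names are hypothetical.

```python
EXPONENT = 1.83        # pythagorean exponent used in the text
DODGERS_RUNS = 638     # season runs quoted in the text
GAMES = 162


def opponent_run_ratio(win_pct, exponent=EXPONENT):
    """Invert W% = RS^x / (RS^x + RA^x) to get RA / RS."""
    return ((1 - win_pct) / win_pct) ** (1 / exponent)


def rpg_gap(conventional_win_pct):
    """Runs per game the alternative lineup scores relative to the
    conventional lineup (negative means it scores fewer)."""
    conventional_rpg = DODGERS_RUNS / GAMES
    return conventional_rpg * (opponent_run_ratio(conventional_win_pct) - 1)


print(f"ascending OBA:  {rpg_gap(0.52006):+.2f} RPG")
print(f"descending OBA: {rpg_gap(0.49895):+.2f} RPG")
```

Under these assumptions the two gaps round to -0.17 and +0.01 runs per game, matching the figures in the paragraph above. The random-order figure cannot be reproduced here because its win percentage is not quoted.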

Results

Let's first look at the opening day 2000 lineup, and what kind of production one should expect from: 1. Stewart; 2. Bush; 3. Mondesi; 4. Delgado; 5. Fullmer; 6. Batista; 7. Fletcher; 8. Cruz Jr.; 9. Gonzalez. In my simulation this ordering produces an average of 5.37 RPG (869 runs per season). Not bad, but what if we move Cruz Jr. to the second spot, Gonzalez to the 8 spot, and Bush to the number nine spot? This produces an average of 5.42 RPG (878 runs per season). A little better. Now what if Gonzalez gets the second slot (which, from what I last heard, is Martinez's most likely choice), with Cruz Jr. eighth and Bush ninth? The Jays now produce 5.43 RPG (879 runs per season). While this seems slightly better than Cruz Jr. in the second slot, the difference between Cruz Jr. second and Gonzalez second is smaller than the precision of my simulated run size [7]. It is clear, though, that Bush in the two slot, based on last year's numbers, is not the wisest of choices [8]. Still, the difference between these orders was very small: about 0.85 expected wins (using the pythagorean method again) between Bush in the second spot and Gonzalez. So it looks like Martinez is leaning the right way on which of these batters should fill the second slot.

Well, what about the sabermetric lineup orderings? I tried Jays lineups sorted increasing and decreasing by AVG, OBA, SLG, and OPS. Listed here from worst to best, they show a much larger difference:

Number  Description        Runs Per Game  Runs Per Season  Extra Wins (compared to opening day 2000)
1       Increasing by AVG  5.22           846              -2.0
2       Increasing by SLG  5.24           848              -1.8
3       Increasing by OBA  5.24           849              -1.7
4       Increasing by OPS  5.26           852              -1.5
5       Decreasing by AVG  5.45           883               1.2
6       Decreasing by SLG  5.45           884               1.3
7       Decreasing by OBA  5.51           893               2.0
8       Decreasing by OPS  5.52           894               2.1

This means the difference between the best ordering of the eight and the worst was 0.3 RPG, 48 runs per season, or 4.1 wins a season. What does this mean for other major league lineups? The Jays are composed like many teams in baseball, in that their best hitter overall is also their best power hitter and their best OBA guy (Delgado). The Jays as a team are also better sluggers than most major league teams, but worse at getting on base, so it is unclear whether what is best for the Jays would be best for some other team. Also, the extra wins compared to the opening day lineup assume that the team allowed 908 runs, which is what the Jays allowed in 2000.
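
The "Extra Wins" figures can be reproduced, at least approximately, with the pythagorean rule run in the forward direction. The sketch below assumes the 1.83 exponent mentioned earlier, 908 runs allowed, and the simulated 869 runs of the opening day 2000 lineup as the baseline; the function names are illustrative.

```python
RUNS_ALLOWED = 908       # runs the 2000 Jays allowed, per the text
OPENING_DAY_RUNS = 869   # simulated season total for the opening day 2000 lineup
EXPONENT = 1.83
GAMES = 162


def expected_wins(runs_scored, runs_allowed=RUNS_ALLOWED):
    """Pythagorean expected wins over a 162 game season."""
    rs, ra = runs_scored ** EXPONENT, runs_allowed ** EXPONENT
    return GAMES * rs / (rs + ra)


def extra_wins(runs_scored):
    """Expected wins gained (or lost) relative to the opening day 2000 lineup."""
    return expected_wins(runs_scored) - expected_wins(OPENING_DAY_RUNS)


for label, runs in [("Increasing by AVG", 846), ("Decreasing by OPS", 894),
                    ("Gonzalez batting second", 879)]:
    print(f"{label}: {extra_wins(runs):+.2f} wins")
```

With these inputs the three lines come out near -2.0, +2.1 and +0.85 wins, in line with the table and with the Bush versus Gonzalez comparison.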

I also wanted to try some random lineups to see what I could determine from them, and what range of values one gets with random lineups. I tried 250 randomly generated lineups. I would have liked to do more, but each lineup takes 10 to 15 minutes to calculate, so I had limited time. 250 lineups means I was only sampling about 0.07% of the possible lineups, so there certainly are more to try. The average of the 250 random lineups produced 5.35 RPG (867 runs per season), which is very near the result for the opening day lineup from 2000 (0.2 wins worse - a smaller difference than the precision of the simulation). The worst random lineup (Bush, Cruz, Batista, Gonzalez, Fullmer, Mondesi, Stewart, Fletcher, Delgado) produced 5.24 RPG (849 runs per season), pretty much identical to the increasing by SLG and increasing by OBA strategies. The best random lineup (Delgado, Stewart, Mondesi, Cruz, Fullmer, Fletcher, Batista, Gonzalez, Bush) produced 5.47 RPG (886 runs per season), better than the decreasing by AVG and SLG lineups, but not as good as the decreasing by OBA and OPS lineups. Here is a table summarizing how the team scored when a player batted at each position:
 
Runs Team Scores Per Game When Player Hits In Position

Name         1        2        3        4        5        6        7        8        9        Best Position
Delgado      5.387    5.385    5.354    5.369    5.360    5.349    5.333    5.306    5.321    1
Cruz         5.334    5.349    5.336    5.378    5.349    5.339    5.349    5.354    5.373    4
Bush         5.328    5.323    5.346    5.325    5.349    5.366    5.369    5.376    5.392    9
Fletcher     5.357    5.356    5.356    5.346    5.353    5.364    5.341    5.352    5.346    6
Fullmer      5.352    5.349    5.370    5.352    5.348    5.352    5.347    5.359    5.343    3
Gonzalez     5.357    5.345    5.352    5.340    5.331    5.355    5.358    5.375    5.355    8
Stewart      5.358    5.358    5.353    5.353    5.358    5.350    5.341    5.341    5.349    2
Batista      5.347    5.336    5.345    5.368    5.352    5.358    5.357    5.353    5.352    4
Mondesi      5.347    5.352    5.356    5.341    5.369    5.327    5.377    5.353    5.349    7
Best Person  Delgado  Delgado  Fullmer  Cruz     Mondesi  Bush     Mondesi  Bush     Bush
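
The "Best Position" column and "Best Person" row are just row and column maxima of the table, which the following Python sketch recomputes from the values above. The dictionary and function names are illustrative. Ties go to the earliest slot here, so Stewart, whose 1, 2 and 5 slots all show 5.358, comes out as slot 1 rather than the 2 listed in the table.

```python
# Team runs per game when each player bats in slots 1-9 (values from the table above).
SLOT_RPG = {
    "Delgado":  [5.387, 5.385, 5.354, 5.369, 5.360, 5.349, 5.333, 5.306, 5.321],
    "Cruz":     [5.334, 5.349, 5.336, 5.378, 5.349, 5.339, 5.349, 5.354, 5.373],
    "Bush":     [5.328, 5.323, 5.346, 5.325, 5.349, 5.366, 5.369, 5.376, 5.392],
    "Fletcher": [5.357, 5.356, 5.356, 5.346, 5.353, 5.364, 5.341, 5.352, 5.346],
    "Fullmer":  [5.352, 5.349, 5.370, 5.352, 5.348, 5.352, 5.347, 5.359, 5.343],
    "Gonzalez": [5.357, 5.345, 5.352, 5.340, 5.331, 5.355, 5.358, 5.375, 5.355],
    "Stewart":  [5.358, 5.358, 5.353, 5.353, 5.358, 5.350, 5.341, 5.341, 5.349],
    "Batista":  [5.347, 5.336, 5.345, 5.368, 5.352, 5.358, 5.357, 5.353, 5.352],
    "Mondesi":  [5.347, 5.352, 5.356, 5.341, 5.369, 5.327, 5.377, 5.353, 5.349],
}


def best_slot(player):
    """1-based slot with the highest team RPG for this player (first slot wins ties)."""
    values = SLOT_RPG[player]
    return values.index(max(values)) + 1


def best_player(slot):
    """Player with the highest team RPG when batting in the given 1-based slot."""
    return max(SLOT_RPG, key=lambda name: SLOT_RPG[name][slot - 1])


for name in SLOT_RPG:
    print(f"{name:<9} best slot: {best_slot(name)}")
print([best_player(slot) for slot in range(1, 10)])
```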

So looking at the table we can determine a number of things, some of which we had already concluded. First, I would need to run many more lineups to get good coverage of all possible lineups and eliminate the luck. Since the lineups were truly random, each person only got around 30 chances at each slot, and if a disproportionate number of those chances came with Bush near the front of the lineup and Delgado near the back, then a slot may look worse for that player than it really is just by random dumb luck. There are interesting points that are probably caused by this, such as Mondesi being great in the 5 and 7 slots but horrible in the 6 slot, or Delgado being worse in the 8 slot than in the 9 slot. The highest number in the table (the most important placement) was Bush in the 9 slot. One of the lowest numbers, and the worst for Bush, was Bush in the 2 slot (although this may be an artifact and 1 may be worse for Bush). This just underscores how important it is to have Bush bat at the bottom of the order. The next most important person to place is Delgado. Delgado in either the 1 or 2 spot is excellent. Delgado in the 8 or 9 slot is as bad as you can do in setting up the Jays lineup (and surprisingly 8 was worse than 9; it may be an artifact, or may have to do with what situations you are likely to see the first time through the order). Putting all of this together and using the best position column, one can very quickly deduce what may very well be the Jays strongest lineup:

  1. Delgado
  2. Stewart
  3. Fullmer
  4. Cruz (this was the only slot that needed to be figured out. Both Batista and Cruz have 4th as their best position, but Cruz > Batista at 4th and Batista > Cruz at 5th - the open spot)
  5. Batista
  6. Fletcher
  7. Mondesi
  8. Gonzalez
  9. Bush

Testing this lineup, we do indeed see a relatively strong lineup, scoring 5.48 RPG (888 runs per season). This lineup was about as good as the best of the random lineups, but not as good as a lineup of decreasing OBA or decreasing OPS.

Caveats, Clarifications, and Concerns

There are two main questions that come to mind reading about a simulation like this: How precise is the simulation? How accurate is the simulation?

The first is the far simpler concern, and basically asks which of the results presented, if any, are statistically sound and repeatable. Or am I just making much ado about nothing? In running the simulation for each lineup I made sure that I ran enough games that the error on the RPG would be small enough for significant differences to be detected over the expected range of values (5.2 to 5.6, based on some small samples I had run). I knew that the standard deviation of runs in a given game was about 3.3-3.4 runs, and from that I could calculate what magnitude of trials I'd need. I chose 32400 games for each lineup because it was an even multiple of 162 (200 seasons' worth) and because it was large enough to give a small enough standard error on the mean RPG values (between 0.018 and 0.019, which maps to about 3 runs a season) while small enough to run in under 15 minutes per lineup. This means that while some of the differences between the OBA and OPS lineups may not quite have been statistically significant, the difference between Bush in the second slot and Gonzalez in the second slot was. As for the table above assigning each player his best batting slot, I'm fairly confident that Delgado belongs in one of the first two slots and Bush towards the end, but I'm sure the information is not significant to the number of digits listed in the table. To get that kind of significance I'd need a lot more runs of the data. That may partially explain why "the Jays strongest lineup" above is not quite as strong as the decreasing OPS lineup. The other reason may be that the best slot for a hitter depends on who bats around him: while batting Bush ninth and Delgado first might each be the best choice individually, maybe Delgado's best slot, given Bush is batting ninth, is really the two slot. But calculating conditional probabilities like that leaves us back with 9! choices.
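
The precision figures above follow from the usual standard error formula, sd / sqrt(n). The sketch below reproduces them and adds a rough z-score for the Bush versus Gonzalez comparison. Treating the two simulated means as independent with the same per-game standard deviation is my assumption for illustration, and the function names are hypothetical.

```python
import math

GAMES_PER_LINEUP = 32400   # 200 seasons of 162 games
GAMES_PER_SEASON = 162


def standard_error(sd_per_game, games=GAMES_PER_LINEUP):
    """Standard error of the mean RPG after simulating `games` games."""
    return sd_per_game / math.sqrt(games)


def z_score(rpg_a, rpg_b, sd_per_game):
    """Difference of two mean RPG values in standard errors.
    ASSUMPTION: independent runs with the same per-game standard deviation."""
    return abs(rpg_a - rpg_b) / (math.sqrt(2) * standard_error(sd_per_game))


for sd in (3.3, 3.4):
    se = standard_error(sd)
    print(f"sd={sd}: SE={se:.4f} RPG, about {se * GAMES_PER_SEASON:.1f} runs per season")

# Gonzalez second (5.43 RPG) versus Bush second (5.37 RPG), from the Results section.
print(f"z = {z_score(5.43, 5.37, 3.4):.2f}")
```

Under these assumptions the standard error lands between 0.018 and 0.019 RPG, and the Bush versus Gonzalez gap is a little over two standard errors.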

So the initial data is statistically significant, but is the simulation accurate? In making a simulation there are always simplifying assumptions that need to be made, and when true values are not known, it is difficult to ensure that all of the simplifying assumptions maintain a reasonable relation between the simulation and reality.

There are two different types of possible inaccuracies: those which are unlikely to be incorporated into any simulation, and those that just weren't incorporated into this simulation. Amongst the first type are such things as:

I think I will now spend more time explaining exactly what my simulation does, as it gives me the easiest way to discuss any of its inaccuracies. As I've already mentioned, it assumes that the Jays' 2000 numbers reflect their true ability. I hope that in Delgado's case this is true, and in Bush's case far from true (he was injured, and hopefully can be average, or at least above replacement level, in 2001). Then, for each plate appearance, I generated a number between 1 and the batter's total number of year 2000 PA. Depending on the number, the batter got a BB, single, 2B, 3B, HR, or OUT, based on his own personal distribution of these events. If there was a hit and base runners were on, the base runners advanced based on probabilities from a previous major league season (which include the possibility of an out on the basepaths - trying to score from second on a single, say) [9]. Runs were counted up in each inning until three outs, and innings in each game until nine innings were up, and the number of runs was recorded for each of the 32400 games. Thus my simulation has the following flaws:
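
The procedure described above can be sketched in Python roughly as follows. This is a minimal illustration, not the actual simulation code: the batter's outcome counts are made-up numbers, all names are hypothetical, and runner advancement is simplified so that every runner moves up as many bases as the batter, with no outs on the basepaths, instead of using league-derived advancement probabilities.

```python
import random

# Hypothetical outcome counts for one batter's season (made-up, NOT real 2000 stats).
EXAMPLE_BATTER = {"BB": 60, "1B": 100, "2B": 30, "3B": 3, "HR": 25, "OUT": 400}
HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}


def draw_outcome(counts, rng):
    """Draw a number between 1 and total PA and map it onto the batter's own counts."""
    n = rng.randint(1, sum(counts.values()))
    for outcome, count in counts.items():
        n -= count
        if n <= 0:
            return outcome


def advance(bases, outcome):
    """Return (new_bases, runs). ASSUMPTION: a walk only forces runners, and on a hit
    every runner advances exactly as many bases as the batter."""
    first, second, third = bases
    if outcome == "BB":
        runs = 1 if (first and second and third) else 0
        return (True, first or second, third or (first and second)), runs
    k = HIT_BASES[outcome]
    starts = [0] + [i + 1 for i, occupied in enumerate(bases) if occupied]  # 0 = batter
    new_bases, runs = [False, False, False], 0
    for start in starts:
        if start + k >= 4:
            runs += 1                      # runner (or batter) crosses home
        else:
            new_bases[start + k - 1] = True
    return tuple(new_bases), runs


def simulate_game(lineup, rng):
    """Bat through nine innings and return the runs scored."""
    runs, batter = 0, 0
    for _ in range(9):
        outs, bases = 0, (False, False, False)
        while outs < 3:
            outcome = draw_outcome(lineup[batter % len(lineup)], rng)
            batter += 1                    # the order carries over between innings
            if outcome == "OUT":
                outs += 1
            else:
                bases, scored = advance(bases, outcome)
                runs += scored
    return runs


def mean_rpg(lineup, games=32400, seed=0):
    """Mean runs per game for a lineup over the given number of simulated games."""
    rng = random.Random(seed)
    return sum(simulate_game(lineup, rng) for _ in range(games)) / games
```

For example, `mean_rpg([EXAMPLE_BATTER] * 9, games=1000)` estimates the runs per game of a lineup made of nine copies of the hypothetical batter. Comparing batting orders would mean passing nine different count dictionaries in different orders.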

Conclusions

I still feel that even with all of the above drawbacks my simulation has value. Last year's opening day lineup was roughly what was used where possible, and the Jays scored 861 runs, well within spitting distance of the 869 my simulation would predict (well, actually it is not clear what my simulation would predict, as you'd need to always play nine inning games and always play your starters to get this prediction). It seems clear to me that batting Bush second last year was not the best move, although I was a little surprised at how little it mattered. Still, an extra 2 wins last year would have made it much closer. I think that managers ought to consider experimenting with moving their higher OBA and OPS guys to the top of the order. It appears that the effect of getting more at bats is more important than the perfectly constructed "little ball" first inning. This may partially be because the 2000 Blue Jays played in an offensive explosion and most of the Jays liked to swing for the fences. Still, Roger Moore's study found similar results with the 1996 L.A. Dodgers. So that brings us back to the first three questions:

  1. Does it matter what lineup Martinez bats?
    Answer: Yes.
  2. If so, how much does it matter?
    Answer: A good lineup might be as many as 4 games better than a bad one.
  3. If it matters, what is the best batting order?
    Answer: An open question. If anyone has additional insight I'd love to hear from you. It appears that decreasing order based on OBA or OPS would be best. In the Jays' case Gonzalez may be the best of the three choices for the two slot, but it isn't clear he's any better than Cruz Jr. It is clear that Bush at the top of the order last year was a mistake. If this year he is fully recovered from his injuries, it may not be as much of a disaster if he bats second, but Delgado might be the best choice of all for the two slot, if he doesn't lead off!

Published April 1, 2001 just before the start of the MLB 2001 season by Michael Bodell


References