Thursday, August 9, 2012

Study: Synergistic and Cascading Effects In a Baseball Lineup.

This is a study I did using a baseball simulator I wrote, off and on, over the last year as part of my high school senior project. Feedback on any part of it would be appreciated.
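
For context, a bare-bones sketch of the general approach (plate appearance by plate appearance, with each hitter reduced to a handful of per-PA event probabilities) is below. It is much cruder than the simulator used for the numbers in this post: runners advance station-to-station plus the value of the hit, walks are treated like singles, and there are no steals, errors, sacrifices, or double plays. The event probabilities shown are placeholders, not fitted to any of the lineups tested later.

    import random

    # Toy sketch of a plate-appearance-by-plate-appearance lineup simulation.
    # Each hitter is a dict of per-PA event probabilities (made-up placeholders);
    # runners advance station-to-station plus the value of the hit. No steals,
    # errors, sacrifices, or double plays, so this is much cruder than a real sim.

    def simulate_game(lineup, innings=9):
        runs, batter = 0, 0
        for _ in range(innings):
            outs, bases = 0, [False, False, False]   # runner on 1B, 2B, 3B?
            while outs < 3:
                p = lineup[batter % len(lineup)]
                batter += 1
                r = random.random()
                if r < p["bb"] + p["1b"]:
                    advance = 1        # walks treated like singles (a simplification)
                elif r < p["bb"] + p["1b"] + p["2b"]:
                    advance = 2
                elif r < p["bb"] + p["1b"] + p["2b"] + p["hr"]:
                    advance = 4
                else:
                    outs += 1
                    continue
                for _ in range(advance):               # push every runner one base
                    if bases[2]:
                        runs += 1
                    bases = [False] + bases[:2]
                if advance == 4:
                    runs += 1                          # the batter scores on a homer
                else:
                    bases[advance - 1] = True          # batter stops at 1B or 2B
        return runs

    # Placeholder league-average-ish hitter; real inputs would be fit to a slash line.
    avg_hitter = {"bb": 0.08, "1b": 0.155, "2b": 0.05, "hr": 0.03}
    games = 10_000
    print(sum(simulate_game([avg_hitter] * 9) for _ in range(games)) / games)
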



Synergistic/cascading effects are a fancy way of describing the difference between a whole and the sum of its parts. I first had the idea to try to test this for baseball after reading a basketball article about “chemistry” on teams, which used simulations to find how effectively groups of five players with various skills would perform together. They found that certain skills had what they called positive or negative synergy with each other: having more players with a certain skill gets you disproportionately more or less benefit, respectively. For example, generating turnovers on defense has positive synergy with itself: if you have multiple players who can steal the ball, you will get disproportionately more steals. Meanwhile, avoiding turnovers on offense has negative synergy with itself, as only one offensive player can hang on to the ball at once. I was curious to see whether anything similar exists in baseball.



Ideally I would like to be able to test the synergy of both offensive components (mainly the effects of on-base percentage and slugging percentage on each other) and defensive components (starting pitching, relief pitching, fielding), but since my simulator is currently capable of testing only the offensive components, that is what I will test. For these tests, I created seven teams of players of league-average overall quality but varying profiles, as shown in the table below:

 AVE/OBP/SLG   wOBA 
 .255/.350/.350   .3174
 .255/.340/.367   .3173
 .255/.330/.384   .3175
 .255/.320/.400   .3174
 .255/.310/.416   .3176
 .255/.300/.431   .3175
 .255/.290/.446   .3176

All the players have wOBAs between .3173 and .3176, but OBP ranges from .350 down to .290 while SLG ranges from .350 up to .446. For each of these lineups, I ran five million games and checked how many runs each team scored. The results are below:

 Slash line    Runs per game    PA per game    % runners score    Runs per PA  
 .255/.350/.350    4.416    39.897    27.66%    0.1106  
 .255/.340/.367    4.397    39.412    28.09%    0.1116  
 .255/.330/.384    4.383    38.939    28.58%    0.1126  
 .255/.320/.400    4.371    38.476    29.14%    0.1136  
 .255/.310/.416    4.374    38.041    29.81%    0.1150  
 .255/.300/.431    4.370    37.616    30.50%    0.1162  
 .255/.290/.446    4.365    37.198    31.28%    0.1174  

That would seem to suggest a couple of things. The first and most obvious is that the high-OBP, low-SLG team scores the most runs, followed by the second-best OBP/second-lowest SLG team, and so on down the list. We can also see that the high-OBP teams have significantly more batters come up to bat per game, with the top OBP team getting 2.7 more PA per game than the top SLG team. However, these teams pay a price for their relative lack of power: the top OBP team scores 3.6 percentage points fewer of its baserunners and generates fewer runs on a per-PA basis. This would seem to confirm that wOBA is at least close to correct in its weighting of getting on base versus hitting for power, since it accounts for the additional advantage of the extra PA that reaching base creates for someone else (given the way linear weights are calculated, this makes sense). However, when that extra plate appearance also goes to someone who is disproportionately likely to reach base, and so on for each new player who reaches, the benefit of reaching base and handing someone else another PA is magnified.
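
For reference, wOBA is just a weighted average of a hitter's positive outcomes, with each event credited roughly its run value. With coefficients close to the published 2011 weights (the exact constants shift a little from season to season), it can be computed like this; the stat line in the example is made up:

    # Approximate 2011-era wOBA: each on-base event weighted by roughly its run
    # value, divided by AB + BB - IBB + SF + HBP. The coefficients drift
    # slightly from season to season.
    def woba(ubb, hbp, singles, doubles, triples, hr, ab, bb, ibb, sf):
        numerator = (0.69 * ubb + 0.73 * hbp + 0.89 * singles +
                     1.27 * doubles + 1.62 * triples + 2.10 * hr)
        return numerator / (ab + bb - ibb + sf + hbp)

    # Hypothetical 600-PA season:
    print(round(woba(ubb=55, hbp=5, singles=100, doubles=30, triples=3, hr=20,
                     ab=535, bb=60, ibb=5, sf=5), 4))   # -> about .359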

These are only sample teams, however. No real team is made up entirely of identical players, so this demonstration has very limited practical use. What if we took an average lineup (league-average #1 hitter, league-average #2 hitter, and so on) and changed its profile in the same way? That would give us a more accurate look and a better baseline for later tests to work off of. Those runs showed very similar results:

 Overall slash line    Runs per game    PA per game    % runners score    Runs per PA  
 .258/.353/.358    4.546    40.052    28.14%    0.1135  
 .258/.343/.376    4.527    39.566    28.57%    0.1144  
 .258/.333/.392    4.512    39.089    29.07%    0.1154  
 .258/.323/.408    4.499    38.627    29.62%    0.1165  
 .258/.313/.424    4.495    38.182    30.27%    0.1177  
 .258/.303/.438    4.484    37.749    30.96%    0.1188  
 .258/.293/.454    4.483    37.335    31.76%    0.1201  

The standard deviation on the average runs per game over five million sims (i.e., the standard error of that average) is about 0.0015 runs. With the exception of the final two lineups (.303 and .293 OBP), each pair of adjacent lineups is separated by at least one full standard deviation, and only the .323 and .313 lineups come close to being within two standard deviations of each other.
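
That figure is just the per-game spread of runs (roughly three runs) divided by the square root of the number of games, and it can be estimated directly from the simulator's output. A quick sketch, with placeholder data standing in for the real simulated games:

    import numpy as np

    # Placeholder standing in for five million simulated games' run totals;
    # a real check would use the simulator's output instead.
    rng = np.random.default_rng(0)
    runs_per_game = rng.poisson(lam=4.5, size=5_000_000)

    mean_rpg = runs_per_game.mean()
    # Uncertainty on the average = per-game standard deviation / sqrt(number of games)
    std_err = runs_per_game.std(ddof=1) / np.sqrt(runs_per_game.size)
    print(f"{mean_rpg:.3f} +/- {std_err:.4f} runs per game")
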
How about the effects of adding better players of different profiles to a lineup? I created three player “profiles”, each with a .370 wOBA (16% above 2011 league average). They are below:

 Profile    AVE (vs average)    OBP (vs average)    SLG (vs average)    wOBA 
 +OBP    .255 (+.000)    .400 (+.080)    .426 (+.026)    .3699
 normal    .255 (+.000)    .370 (+.050)    .476 (+.076)    .3699
 +SLG    .255 (+.000)    .340 (+.020)    .521 (+.121)    .3697

I converted these slash lines into OBP and SLG above average (relative to .255/.320/.400) and used those deltas to modify the average #2 and #3 hitters in an average lineup (a quick sketch of this bookkeeping follows the table below). Those players are below:

 Slot and profile    AVE    OBP    SLG    wOBA 
 #2 +OBP    .268    .411    .440    .3797
 #2 normal    .268    .381    .490    .3799
 #2 +SLG    .268    .351    .535    .3799
 #3 +OBP    .272    .430    .475    .4003
 #3 normal    .272    .400    .525    .4006
 #3 +SLG    .272    .370    .570    .4008
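
Working backwards from the table, the unmodified #2 hitter appears to be roughly .268/.331/.414 and the unmodified #3 hitter roughly .272/.350/.449; the modification is just adding each profile's OBP and SLG relative to the .255/.320/.400 league line onto the slot's own line. A quick sketch of that bookkeeping:

    # Sketch of how a star "profile" is applied to a lineup slot: take the
    # profile's line relative to the league line (.255/.320/.400) and add it to
    # the slot's own average line. The base slot line is inferred from the
    # tables above, not taken from the simulator directly.
    LEAGUE = {"avg": 0.255, "obp": 0.320, "slg": 0.400}

    def apply_profile(slot_line, profile_line):
        return {
            # AVE delta is zero here, since every profile hits the league .255
            "avg": round(slot_line["avg"] + (profile_line["avg"] - LEAGUE["avg"]), 3),
            "obp": round(slot_line["obp"] + (profile_line["obp"] - LEAGUE["obp"]), 3),
            "slg": round(slot_line["slg"] + (profile_line["slg"] - LEAGUE["slg"]), 3),
        }

    avg_no2 = {"avg": 0.268, "obp": 0.331, "slg": 0.414}   # inferred average #2 hitter
    plus_obp = {"avg": 0.255, "obp": 0.400, "slg": 0.426}  # the +OBP profile above

    print(apply_profile(avg_no2, plus_obp))   # -> .268/.411/.440, the "#2 +OBP" row above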

A simulation was run for each possible combination of these batters in the #2 and #3 slots; whenever the two players differed, the higher-OBP player batted second. The results are below:

 Lineup    Slash line    R/G    PA/G    RS%    R/PA 
 #2 +OBP, #3 +OBP    .258/.341/.413    4.933    39.627    30.37%    0.1245
 #2 +OBP, #3 normal    .258/.338/.419    4.929    39.458    30.59%    0.1249
 #2 +OBP, #3 +SLG    .258/.335/.424    4.918    39.278    30.80%    0.1252
 #2 normal, #3 normal    .258/.334/.424    4.901    39.280    30.65%    0.1248
 #2 normal, #3 +SLG    .258/.331/.430    4.899    39.116    30.89%    0.1252
 #2 +SLG, #3 +SLG    .258/.327/.435    4.875    38.945    30.98%    0.1252

As we can see, the lineup with two high-OBP players scored the most runs, followed by the lineup with a high-OBP player and a balanced player. The next lineup has the high-OBP and high-SLG players. Its slash line is identical to that of the lineup with two balanced players except for one point of OBP (although it still wound up with 0.002 fewer PA per game thanks to a couple of extra double plays and outs on the bases), but it scored an extra 0.017 runs per game, a difference of a bit over 11 standard deviations, by scoring an extra 0.15% of its baserunners. This looks like evidence that there is some synergy between on-base percentage and slugging percentage (assuming the extra baserunners are reaching in front of the better power hitters). After that, the lineups continue down in order of OBP.

This seems to be strong evidence of a cascading effect from higher on-base percentages, and of a smaller but real synergy between high OBP and high SLG. How much benefit can a team theoretically get from this knowledge? To test this, I took three of the average teams from earlier in the article: one with a high OBP (.258/.343/.376), one with a high SLG (.258/.303/.438), and one balanced team (.258/.323/.408). Using these lineups plus the three star players from the previous tests, I ran sims of every possible combination with the star player hitting second and got these results:

 Team    Star    Slash line    R/G    PA/G    RS%    R/PA 
 Good OBP team    OBP star    .258/.350/.382    4.754    39.971    29.13%    0.1189
 Good OBP team    Balanced star    .258/.346/.388    4.737    39.794    29.24%    0.1190
 Good OBP team    SLG star    .258/.343/.394    4.722    39.618    29.35%    0.1192
 Balanced team    OBP star    .258/.332/.411    4.717    39.120    30.02%    0.1206
 Balanced team    Balanced star    .258/.329/.416    4.704    38.951    30.17%    0.1208
 Balanced team    SLG star    .258/.325/.422    4.688    38.784    30.30%    0.1209
 Good SLG team    OBP star    .258/.315/.438    4.693    38.320    31.13%    0.1225
 Good SLG team    Balanced star    .258/.311/.443    4.681    38.163    31.03%    0.1227
 Good SLG team    SLG star    .258/.308/.449    4.674    38.014    31.52%    0.1230

If we assumed that there was no cascading effect and simply estimated runs added based on wOBA, we would expect these players to add 0.190, 0.194, and 0.199 runs per game to the low-, medium-, and high-OBP teams respectively (more team OBP gives these players more PA, hence the slightly larger benefit; a rough sketch of this estimate follows the table below). The actual benefit (in runs added per game) is below:



 Team    SLG star    Balanced star    OBP star 
 Good SLG team    +0.190    +0.197    +0.209
 Balanced team    +0.189    +0.205    +0.218
 Good OBP team    +0.195    +0.210    +0.227
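
As promised, a rough sketch of where the no-synergy expectation comes from: it is the standard wOBA-to-runs conversion, i.e. the star's wOBA edge over the hitter he replaces, divided by the wOBA scale (roughly 1.2 in this era), times the plate appearances the slot gets per game. The specific numbers below (the replaced hitter's wOBA, the scale, the PA per game) are approximations for illustration, not the simulator's exact inputs.

    # No-synergy estimate of runs added per game: the star's wOBA edge over the
    # hitter he replaces, divided by the wOBA scale (~1.2 around 2011), times
    # the slot's plate appearances per game. All inputs are rough, illustrative.
    WOBA_SCALE = 1.2

    def runs_added_per_game(star_woba, replaced_woba, slot_pa_per_game):
        return (star_woba - replaced_woba) / WOBA_SCALE * slot_pa_per_game

    # A ~.380 wOBA star replacing a roughly average #2 hitter (~.327 wOBA) who
    # comes up about 4.35 times per game:
    print(round(runs_added_per_game(0.380, 0.327, 4.35), 3))   # ~0.19 runs per game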

If we subtract the runs we would expect based on wOBA alone, we get this:



 Team    SLG star    Balanced star    OBP star 
 Good SLG team    0.000    +0.007    +0.019
 Balanced team    -0.005    +0.011    +0.024
 Good OBP team    -0.004    +0.011    +0.028

So, how can this be leveraged into extra wins for a team? The effects I found are not particularly significant except in relatively extreme cases (disproportionately OBP- or SLG-heavy teams and players). That said, there are situations in which it could make a difference, such as the hypothetical example below.
You are the GM of a baseball team, and you are shopping for a star player. You have three options (the .380 wOBA #2 hitters listed above). Ignoring defense, position, etc., and assuming each player will play 150 games, we would estimate that each would add about 31.2 runs to a typical team. If you are a disproportionately high-OBP team (the 2008 Atlanta Braves, for example), a high-SLG star (like Josh Hamilton) would add 29.25 runs, a balanced star (like David Wright) would add 31.5 runs, and a high-OBP star (like Nick Johnson) would add 34.05 runs. Choosing OBP over SLG would add an extra 4.8 runs over 150 games. For a balanced team, that choice would add an extra 4.35 runs, while a disproportionately high-SLG team (like the 2010 Toronto Blue Jays) would get an extra 2.85 runs. The high-OBP team thus gets roughly 2 extra runs, worth about $1 million, out of the same choice compared to the high-slugging team, and a marginal 0.45-run benefit compared to the balanced team.
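
For what it's worth, the run-to-dollar conversion above uses the usual rough rules of thumb of about ten runs per marginal win and about $4.5 million per marginal win on the free-agent market (both era-dependent approximations). The arithmetic:

    # Converting the per-game edges above to a 150-game season, then to dollars,
    # using ~10 runs per win and ~$4.5M per marginal win (rough 2011-ish rules
    # of thumb; both numbers are approximate).
    RUNS_PER_WIN, DOLLARS_PER_WIN = 10.0, 4.5e6

    for team, edge in [("high-OBP team", 0.227 - 0.195),
                       ("balanced team", 0.218 - 0.189),
                       ("high-SLG team", 0.209 - 0.190)]:
        print(f"{team}: {edge * 150:.2f} extra runs from taking the OBP star over the SLG star")

    # The ~2-run gap between the high-OBP and high-SLG teams' gains, in dollars:
    extra_runs = (0.227 - 0.195) * 150 - (0.209 - 0.190) * 150
    print(f"~${extra_runs / RUNS_PER_WIN * DOLLARS_PER_WIN / 1e6:.2f}M")   # about $1M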
