DWAR: introducing a method to actually calculate wins above replacement - Yale CampusPress

dWAR: introducing a method to actually calculate wins above replacement

Daniel J. Eck

March 3, 2019

1 Introduction

Wins above replacement (WAR) is meant to be a one-number summary of the total contribution
made by a player for his team in any particular season. As stated by Steve Slowinski of Fangraphs,
WAR offers an estimate to answer the question, “If this player got injured and their team had to
replace them with a freely available player of lower quality from their bench, how much value would
the team be losing,” where this value is expressed in number of wins [Slowinski, 2010]. That being
said, nobody actually calculates WAR in a manner that properly answers the above question as
posed. This is through no fault of the metric or of those who calculate it. One problem is that it
is impossible to simultaneously quantify the value of a player when the player is available and the
value of a replacement to that player when the player is unavailable. The player in question is either
available to play or unavailable to play, never both. Instead of confronting the problems raised in
this factual-counterfactual world, analysts have constructed a hypothetical replacement player
against which every player is implicitly compared, using the machinery of a proprietary black box
[Baumer et al., 2015]. Three widely used versions of WAR that are calculated in this manner are
Baseball Reference’s bWAR [Reference, 2010], Fangraphs’s fWAR [Slowinski, 2010], and Baseball
Prospectus’s BWARP [Prospectus, 2019].

In this note, we propose a direct estimator of wins above replacement that confronts the difficulty of
the factual-counterfactual real world. Note that there are numerous examples of seasons in which a
player is available and unavailable for a substantial amount of time. When this is so, we can directly
compare how the team performs when the player is available to how the team performs when the
player is unavailable. This framework allows for a direct estimation of the wins that a player adds
above a replacement player. This direct estimator is relatively simple to compute, readily available,
easy to understand, and its interpretation is flexible to the narrative of a season. We will refer to
this direct calculation of WAR as dWAR, short for direct WAR, that is, the direct calculation of
wins above replacement. The dWAR estimator has the potential to yield a
much more natural and appropriate estimate of WAR than those which involve the calculation of
a hypothetical replacement player via black box methodology. The validity of dWAR depends on
the team and the competition faced by the team being similar during both player states. Very
nuanced interpretations of what dWAR measures emerge when team makeup is confounded with
the availability of the player in question.

We primarily focus on the 2014 Yadier Molina season to show the discrepancies between conven-
tional calculations of WAR and dWAR. Our version of WAR gives much more value to Yadier
Molina’s 2014 season than conventional versions. This result is far from surprising. Many note
that conventional versions of WAR do not properly account for leadership, game management,
pitch framing, and catcher defense, which are all aspects of baseball that Molina excels at [Fagan,
2015, Posnanski, 2015, Schwarz, 2015, Fleming, 2017, Womack, 2017]. That being said, a tangible
numeric value of the additional Cardinals wins attributable to Molina as a result of these intangible
traits has not existed until now. We caution against generalizing our findings beyond the
2014 season with certainty, but we hope that the point is taken and can be used to strengthen
Molina’s case for the Hall of Fame: conventional versions of WAR have likely
underestimated the number of Cardinals wins attributable to Yadier Molina by a substantial
amount.

Additional analyses are provided for Miguel Cabrera’s 2015 season with the Detroit Tigers and Mike
Trout’s 2017 season with the Los Angeles Angels. These specific players are chosen because of the
2012 most valuable player (MVP) race between them, which is symbolic of the fight between those who
favor new sophisticated analytics to value a player’s production and those who favor traditional
analytics to value a player’s production. As noted in Baumer et al. [2015], sabermetricians from the
new school advocated strongly for Trout while those that preferred traditional statistics advocated
strongly for Cabrera. To the adherents of sabermetrics, the decision for who should win the 2012
MVP award was clear – point estimates showed Trout leading Cabrera by 3.2 fWAR and 3.6
bWAR. The openWAR metric in Baumer et al. [2015] provided far more sophistication to this
debate. According to openWAR, the estimated difference between Trout and Cabrera is only 1.05
WAR in Trout’s favor. Moreover, there is substantial overlap of the interval estimates of Trout’s and
Cabrera’s openWAR. We do not provide a dWAR estimate of these players’ WAR in 2012 because
these players did not miss a significant portion of the 2012 season, which rules out a comparison to
a suitable replacement player under the dWAR framework. However, 2015 Cabrera and 2017
Trout were archetypically similar players to their 2012 selves. Our dWAR estimates of
WAR for 2015 Miguel Cabrera and 2017 Mike Trout give the opposite impression to the one that
conventional WAR and openWAR give of these players’ respective value to their teams. In 2015, the Detroit
Tigers were far worse when Miguel Cabrera did not play or was injured. However, the 2017 Los
Angeles Angels were not terribly hindered by the absence of Mike Trout’s production. These
findings are striking (especially for Trout) and they come with natural caveats arising from the
context of those seasons. These caveats are explored.

2 Data Analyses

2.1 Yadier Molina in 2014

Yadier Molina played in 110 regular season games for the St. Louis Cardinals in the 2014 baseball
season out of a possible 162. The St. Louis Cardinals won a total of 90 games and lost a total of 72
games. When Yadier Molina played, the Cardinals won 62 games and lost 48 games. When Yadier
Molina did not play, the Cardinals won 28 games and lost 24 games [ESPN, 2019i]. The games in
which he did not play are split among games in which he was healthy but did not enter the game
and games in which he was injured and unavailable to play in any capacity. The former category
represents a strategic decision that involves a healthy baseball player; the latter category represents
an unplanned incident in which team strategic decisions have to change due to the unexpected loss
of a player. When Yadier Molina did not play but was available, the Cardinals won 7 games and
lost 5 games. When Yadier Molina was injured, the Cardinals won 21 games and lost 19 games.
Yadier Molina’s WAR in 2014 was 2.5 as estimated by BWARP, 2.9 as estimated by fWAR, and
3.1 as estimated by bWAR.
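
As a quick consistency check on the counts above, the following minimal Python sketch adds the three state-wise records and confirms that they partition a 162-game season. The `Record` class and the variable names are illustrative assumptions, not part of the original analysis; the win and loss counts are those quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Team win-loss record over some subset of games (hypothetical helper)."""
    wins: int
    losses: int

    @property
    def games(self) -> int:
        return self.wins + self.losses

    def __add__(self, other: "Record") -> "Record":
        return Record(self.wins + other.wins, self.losses + other.losses)


# 2014 Cardinals records by Molina's state, as quoted in the text.
played = Record(62, 48)    # Molina played
rested = Record(7, 5)      # healthy and available, but did not enter the game
injured = Record(21, 19)   # unavailable to play

did_not_play = rested + injured   # the "off" state
available = played + rested       # the "avail" state
season = played + did_not_play    # the full regular season

print("did not play:", did_not_play, "games:", did_not_play.games)
print("available:   ", available, "games:", available.games)
print("season:      ", season, "games:", season.games)
```

The derived `available` record, 69 wins in 122 games, is the one that enters the availability-based comparison.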

We now motivate three versions of a player’s WAR that directly answer the question as posed in
the Introduction. We compare how the team did in games in which the player played to those in
which the player did not play, denoted as $\widehat{\mathrm{dWAR}}_{\text{on-off}}$. This estimate of WAR is calculated as

\[
\widehat{\mathrm{dWAR}}_{\text{on-off}} = (\hat{p}_{\text{on}} - \hat{p}_{\text{off}}) \times G
\]

where $\hat{p}_{\text{on}}$ is the proportion of team games won when the player played, $\hat{p}_{\text{off}}$ is the proportion of
team games won when the player did not play, and $G$ is the total number of games in which the player played. We then
compare how the team did in games in which the player was available to play to those in which the
player was unavailable to play, denoted $\widehat{\mathrm{dWAR}}_{\text{avail-unavail}}$. This estimate of WAR is calculated as

\[
\widehat{\mathrm{dWAR}}_{\text{avail-unavail}} = (\hat{p}_{\text{avail}} - \hat{p}_{\text{unavail}}) \times G
\]

where $\hat{p}_{\text{avail}}$ is the proportion of team games won when the player played or was available to play
and $\hat{p}_{\text{unavail}}$ is the proportion of team games won when the player was unavailable to play. We
finally compare how the team did in games in which the player played to those in which the player
was unavailable to play, denoted $\widehat{\mathrm{dWAR}}_{\text{on-unavail}}$. This estimate of WAR is calculated as

\[
\widehat{\mathrm{dWAR}}_{\text{on-unavail}} = (\hat{p}_{\text{on}} - \hat{p}_{\text{unavail}}) \times G.
\]

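These estimates of dWAR for Yadier Molina’s 2014 season are

\[
\begin{aligned}
\widehat{\mathrm{dWAR}}_{\text{on-off}} &= (62/110 - 28/52) \times 110 = 2.77, \\
\widehat{\mathrm{dWAR}}_{\text{avail-unavail}} &= (69/122 - 21/40) \times 110 = 4.46, \\
\widehat{\mathrm{dWAR}}_{\text{on-unavail}} &= (62/110 - 21/40) \times 110 = 4.25.
\end{aligned}
\]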
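
The arithmetic above can be reproduced with a short Python sketch. The helper name `dwar` and the (wins, games) tuple layout are assumptions made for illustration; the inputs are the records quoted in the text, and the multiplier is the 110 games that Molina played.

```python
def dwar(wins_a: int, games_a: int, wins_b: int, games_b: int, scale: int) -> float:
    """Return (p_a - p_b) * scale, where p_x = wins_x / games_x."""
    if games_a <= 0 or games_b <= 0:
        raise ValueError("both player states need at least one game")
    return (wins_a / games_a - wins_b / games_b) * scale


# Molina 2014: (wins, games) for each player state, taken from the text.
on, off = (62, 110), (28, 52)
avail, unavail = (69, 122), (21, 40)
G = 110  # games Molina played, the multiplier used in the text

estimates = {
    "on-off": dwar(*on, *off, G),
    "avail-unavail": dwar(*avail, *unavail, G),
    "on-unavail": dwar(*on, *unavail, G),
}

for name, value in estimates.items():
    print(f"dWAR {name}: {value:.2f}")
```

Rounded to two decimals, the printed values agree with the three estimates reported for Molina. The explicit check on the game counts is there because a state with zero games makes the corresponding proportion undefined.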

The versions of dWAR that compare team success when Yadier Molina is available, and his playing
time is subject to management’s decision, to team success when Yadier Molina is unavailable are
drastically different from conventional calculations of this metric. The discrepancy between these
approaches is anywhere between 1.15 and 1.96 wins depending on which estimates of WAR are
being compared. The difference in interpretation is substantial. Slowinski
[2010] provided a rule-of-thumb guideline for interpreting WAR. According to these guidelines, a
WAR between 2.5 and 3.1 corresponds to a player that is anywhere from a solid starting player
to a good player, while a WAR between 4.25 and 4.46 corresponds to an all-star level player that
performed near the top of the league.
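
The 1.15 to 1.96 range quoted above follows from pairing each availability-based dWAR estimate with each conventional estimate. A minimal Python sketch of that comparison, using the rounded values reported in the text (the dictionary names are illustrative assumptions):

```python
# Rounded values as reported in the text for Molina's 2014 season.
conventional = {"BWARP": 2.5, "fWAR": 2.9, "bWAR": 3.1}
direct = {"avail-unavail": 4.46, "on-unavail": 4.25}

# Every pairwise gap between a direct estimate and a conventional estimate.
gaps = {
    (d_name, c_name): d_value - c_value
    for d_name, d_value in direct.items()
    for c_name, c_value in conventional.items()
}

smallest = min(gaps, key=gaps.get)
largest = max(gaps, key=gaps.get)

print(f"smallest gap: {gaps[smallest]:.2f} ({smallest[0]} vs {smallest[1]})")
print(f"largest gap:  {gaps[largest]:.2f} ({largest[0]} vs {largest[1]})")
```

The smallest gap comes from comparing the on-unavail estimate with bWAR, and the largest from comparing the avail-unavail estimate with BWARP.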

The following are the notable injuries during the 2014 Cardinals season: Yadier Molina injured
from July 10th until August 27th [ESPN, 2019i]; Jaime García made his first start on May 18
[St. Louis Cardinals Department of Communications, 2014]; Jason Motte made his return on May 21 [Langosch, 2014]; Jaime García
announced on July 5 that he would have season-ending surgery [Wikipedia, 2019]. The following
are notable transactions made by the Cardinals in 2014: On July 11, the Cardinals claimed George
Kottaras off waivers from the Cleveland Indians as a possible replacement for Yadier Molina; on
July 26, the Cardinals signed A.J. Pierzynski as a free agent as a possible replacement for Yadier
Molina; on July 29, the Cardinals released George Kottaras; on July 31, the Cardinals traded Allen
Craig and Joe Kelly for Corey Littrell, John Lackey and cash [Reference, 2019b].

We do not find that these injuries and transactions account for the massive discrepancies between
our dWAR estimator and the conventional approaches. The Cardinals faced similar competition
when Yadier Molina was unavailable and when Yadier Molina was available [Reference, 2019c].
We conclude that $\widehat{\mathrm{dWAR}}_{\text{on-off}}$, $\widehat{\mathrm{dWAR}}_{\text{avail-unavail}}$, and $\widehat{\mathrm{dWAR}}_{\text{on-unavail}}$ are preferable to conventional
versions of WAR when assessing the value of Yadier Molina to the Cardinals in the 2014 season.

2.2 Miguel Cabrera in 2015

We perform the same analysis for Miguel Cabrera’s 2015 season with the Detroit Tigers. Miguel
Cabrera played in 119 games in 2015 and we estimate p̂on = 57/119, p̂off = 17/42, p̂avail = 59/126,
and p̂unavail = 15/35 [ESPN, 2019a]. Conventional estimates of WAR for Miguel Cabrera’s 2015
season are,
\[ \mathrm{bWAR} = 5.2, \quad \mathrm{fWAR} = 4.6, \quad \mathrm{BWARP} = 5.3. \]
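Our estimates of dWAR for Miguel Cabrera’s 2015 season are

\[
\begin{aligned}
\widehat{\mathrm{dWAR}}_{\text{on-off}} &= (57/119 - 17/42) \times 119 = 8.83, \\
\widehat{\mathrm{dWAR}}_{\text{avail-unavail}} &= (59/126 - 15/35) \times 119 = 4.72, \\
\widehat{\mathrm{dWAR}}_{\text{on-unavail}} &= (57/119 - 15/35) \times 119 = 6.00.
\end{aligned}
\]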

We take a look at significant injuries and transactions made by the Tigers in 2015. The following
are significant injuries: Miguel Cabrera injured from July 4th until August 12th [ESPN, 2019a];
Justin Verlander injured until June 13th [ESPN, 2019h]; Victor Martinez injured from May 14th
until June 17th [ESPN, 2019c]; Jose Iglesias injured from September 4th until October 4th [ESPN,
2019b]; Anibal Sanchez injured from August 18th until October 4th [ESPN, 2019e]. The following
are significant transactions: On July 6, the Tigers claimed Marc Krauss off waivers from the Tampa
Bay Rays as a possible Cabrera replacement; on July 30, the Tigers traded David Price for Daniel
Norris, Matt Boyd and Jairo Labourt; on July 30, the Tigers traded Joakim Soria for shortstop
JaCoby Jones; on July 31, the Tigers traded Yoenis Cespedes for Michael Fulmer and Luis Cessa;
on August 20, the Tigers acquired Randy Wolf from the Toronto Blue Jays for cash considerations
[Reference, 2019d].

We find that these injuries and transactions result in some key differences in the makeup of
the 2015 Tigers during the stretches when Cabrera is either available or unavailable. The strongest
team makeup, excluding the availability of Cabrera, was from June 17th until the trades that began
July 30th. Outside of this stretch, dWAR would maintain validity if the unavailability of Victor
Martinez and Justin Verlander is equal to the balance of the departures of David Price, Joakim
Soria, and Yoenis Cespedes and arrivals of Daniel Norris, Matt Boyd, Jairo Labourt, JaCoby
Jones, Michael Fulmer, and Luis Cessa. Perhaps this is so, in which case dWAR retains its validity.
Perhaps this is not so, and a more thorough analysis is required.

Another striking feature of this analysis is the differences between $\widehat{\mathrm{dWAR}}_{\text{on-off}}$, $\widehat{\mathrm{dWAR}}_{\text{avail-unavail}}$,
and $\widehat{\mathrm{dWAR}}_{\text{on-unavail}}$. These differences are partially explained by the 2015 Tigers winning 2 games
and losing 5 when Cabrera was available to play but did not enter the game. Perhaps management
picked poor spots to rest Miguel Cabrera or perhaps this abysmal performance is explained by
the inherent volatility of the 2015 Tigers season [Mowery, 2015] and the small number of games
in which Cabrera was available to play but did not enter the game. We think that it is likely
that $\widehat{\mathrm{dWAR}}_{\text{avail-unavail}}$ underestimates 2015 Miguel Cabrera’s value to the 2015 Tigers and that
$\widehat{\mathrm{dWAR}}_{\text{on-off}}$ overestimates 2015 Miguel Cabrera’s value to the 2015 Tigers. In any event, Miguel
Cabrera was a phenomenal hitter in 2015 and all estimates of WAR pick up on this. That being said,
it appears that conventional estimates of WAR underestimated the number of wins that Cabrera’s
hitting brought to the Detroit Tigers in 2015.
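
The 2-5 record in games where Cabrera was available but did not enter is implied by the quoted proportions, since those games are exactly the available games minus the games played. The sketch below recovers that record and recomputes the three estimates with exact rational arithmetic; the function name `dwar_exact` and the tuple layout are illustrative assumptions.

```python
from fractions import Fraction

# 2015 Tigers (wins, games) by Cabrera's state, as quoted in the text.
on, off = (57, 119), (17, 42)
avail, unavail = (59, 126), (15, 35)
G = 119  # games Cabrera played

# Games in which Cabrera was available but did not enter: avail minus on.
rest_wins = avail[0] - on[0]
rest_games = avail[1] - on[1]
rest_losses = rest_games - rest_wins


def dwar_exact(a, b, scale):
    """Exact (p_a - p_b) * scale for (wins, games) pairs a and b."""
    return (Fraction(*a) - Fraction(*b)) * scale


on_off = dwar_exact(on, off, G)
avail_unavail = dwar_exact(avail, unavail, G)
on_unavail = dwar_exact(on, unavail, G)

print(f"rested record: {rest_wins}-{rest_losses}")
for name, value in [("on-off", on_off),
                    ("avail-unavail", avail_unavail),
                    ("on-unavail", on_unavail)]:
    print(f"dWAR {name}: {value} = {float(value):.2f}")
```

Because the rested games went poorly, folding them into the "off" state lowers the off-state winning proportion (raising the on-off estimate), while folding them into the "avail" state lowers the avail-state proportion (reducing the avail-unavail estimate). The on-unavail estimate excludes those seven games and sits between the other two.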

2.3 Mike Trout in 2017

We perform the same analysis for Mike Trout’s 2017 season with the Los Angeles Angels. Mike
Trout played in 114 games in 2017 and we estimate p̂on = 57/114, p̂off = 23/48, p̂avail = 59/118,
and p̂unavail = 21/44 [ESPN, 2019g]. Conventional estimates of WAR for Mike Trout’s 2017 season
are,
\[ \mathrm{bWAR} = 6.7, \quad \mathrm{fWAR} = 6.9, \quad \mathrm{BWARP} = 6.2. \]
Our estimates of WAR for Mike Trout’s 2017 season are,

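\[
\begin{aligned}
\widehat{\mathrm{dWAR}}_{\text{on-off}} &= (57/114 - 23/48) \times 114 = 2.38, \\
\widehat{\mathrm{dWAR}}_{\text{avail-unavail}} &= (59/118 - 21/44) \times 114 = 2.59, \\
\widehat{\mathrm{dWAR}}_{\text{on-unavail}} &= (57/114 - 21/44) \times 114 = 2.59.
\end{aligned}
\]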

The dWAR estimates that compare team success when Mike Trout is available, and his playing
time is subject to management’s decision, to team success when Mike Trout is unavailable are
drastically different than conventional calculations of this metric. The differences between the two
approaches are massive and are not in Mike Trout’s favor.
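
To make the size of this gap concrete, the sketch below recomputes the three estimates from the quoted records and compares the largest of them with the smallest conventional value. The helper `win_pct` and the dictionary names are illustrative assumptions. The sketch also shows why two of the three estimates coincide: the records imply the Angels split the four games in which Trout was available but did not play, so the winning proportion is one half in both the "on" and "avail" states.

```python
# 2017 Angels (wins, games) by Trout's state, as quoted in the text.
on, off = (57, 114), (23, 48)
avail, unavail = (59, 118), (21, 44)
G = 114  # games Trout played


def win_pct(record):
    """Winning proportion for a (wins, games) pair."""
    wins, games = record
    return wins / games


# Games in which Trout was available but did not enter: avail minus on.
rest_wins, rest_games = avail[0] - on[0], avail[1] - on[1]

direct = {
    "on-off": (win_pct(on) - win_pct(off)) * G,
    "avail-unavail": (win_pct(avail) - win_pct(unavail)) * G,
    "on-unavail": (win_pct(on) - win_pct(unavail)) * G,
}
conventional = {"bWAR": 6.7, "fWAR": 6.9, "BWARP": 6.2}

# Most favorable comparison for dWAR: lowest conventional value
# minus highest direct estimate.
smallest_gap = min(conventional.values()) - max(direct.values())

print(f"rested record: {rest_wins}-{rest_games - rest_wins}")
for name, value in direct.items():
    print(f"dWAR {name}: {value:.3f}")
print(f"smallest conventional-minus-direct gap: {smallest_gap:.2f}")
```

Even under the most favorable pairing, the conventional estimate exceeds the direct estimate by more than three and a half wins.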

As before, we take a look at significant injuries and transactions made by the Angels in 2017.
The following are significant injuries: Mike Trout injured from May 6th until May 10th and from
May 28th until July 9th [ESPN, 2019g]; many key Angels players played very limited time in
2017, including Garrett Richards and Huston Street [ESPN, 2019d,f]. The following are significant
transactions: On June 1, the Angels signed Michael Bourn as a free agent as a possible Mike Trout
replacement; on July 2, the Angels released Michael Bourn; on August 31, the Angels traded Elvin
Rodriguez and Grayson Long for Justin Upton [Reference, 2019a].

We do not find that these injuries and transactions account for the massive discrepancies between
dWAR and the conventional approaches. However, Reiter [2017] notes that the Angels’ relative
success in Trout’s absence is mind-boggling, especially given that the Angels faced relatively tough
competition during that stretch. He attributes the team success in Trout’s absence to the rest of
the players coming together and clicking the moment Trout got hurt. We find this explanation to
be interesting and plausible. We also find it plausible that Mike Trout is not the type of leader who
can instill motivation in other players when looked upon to do so. We do not have enough
information to properly distinguish causes for this very strange “Mike Trout effect.”

3 Discussion

We hope that dWAR can help strengthen Yadier Molina’s deserving Hall of Fame case. It is entirely
within the realm of possibility that conventional calculations of WAR will have underestimated
Yadier Molina’s career WAR by 10-30 wins when he retires.

The 2017 Mike Trout season was truly mind-boggling and it illustrates the collective shortcomings
of all metrics, including dWAR. The dWAR estimator of Mike Trout’s value to the 2017 Angels
is valid in the sense that the 2017 Angels performed almost as well without Mike Trout as they
performed with him. That is what happened and it is not debatable. However, using this as a
final determination of Mike Trout’s value is very suspect. Perhaps the “Mike Trout effect” is real
and perhaps it is a result of lackadaisical, uninspired teammates, poor leadership on Mike
Trout’s part, poor team management, or all of the above, or possibly some other explanation. Even
if this effect exists, it has only been shown to exist with respect to the 2017 Angels season and it
may not exist in any other year or on any other team. Conventional indirect estimates of WAR
provide a one-number summary of how statistically dominant Mike Trout was and currently is. To
what extent this raw, context-neutral statistical account of Mike Trout’s greatness can translate to
actual wins above replacement is an honest conversation worth having by fans, historians, scouts,
sabermetricians, the baseball media, and team executives alike.

As it currently stands, dWAR can only validly estimate WAR for a player during a season in
which the player spent a substantial time in both the available and unavailable states. Since
the calculation of dWAR is constrained to one season in isolation, it cannot be used to properly
compare players across eras as noted in Eck [2018]. Valid era-invariant extensions to dWAR for
all player seasons can be made with a suitable era-invariant model of the probabilities of team
success that are central to the calculation of dWAR. This is not a trivial task and we speculate that
strong structural assumptions in a partially Bayesian latent variable modeling framework would be
required to achieve believable era-invariance.

We would like to emphasize that traditional considerations of value including leadership ability,
winning attitude, clutch performances, great defense, and other intangibles are important supplements
to statistical accounts of value. Our metric dWAR can contribute to this emphasis by
measuring the extent to which an aggregation of these previously unmeasurable accounts of value
adds to the number of wins a particular player produces for his team over that of a suitable
and available replacement player. If intangible value actually helps a team win when a player is on
the field, then dWAR can find it when conventional calculations of WAR such as bWAR, fWAR,
and BWARP will not. These conventional calculations of WAR are not actually in the business of
directly calculating wins above replacement; dWAR is.

References
Benjamin S. Baumer, Shane T. Jensen, and Gregory J. Matthews. Openwar: An open source
  system for evaluating overall player performance in major league baseball. Journal of Quantitative
  Analysis in Sports, 11 (2):1–27, 2015.

Daniel J. Eck. Challenging nostalgia and performance metrics in baseball. ArXiv Preprint, 2018.
 URL https://arxiv.org/abs/1810.08029.

ESPN. Miguel cabrera game-by-game stats, 2019a. URL http://www.espn.com/mlb/player/
  gamelog/_/id/5544/year/2015.

ESPN. Jose iglesias game-by-game stats, 2019b. URL http://www.espn.com/mlb/player/
  gamelog/_/id/30382/year/2015.

ESPN. Victor martinez game-by-game stats, 2019c. URL http://www.espn.com/mlb/player/
  gamelog/_/id/5007/year/2015.

ESPN. Garrett richards game-by-game stats, 2019d. URL http://www.espn.com/mlb/player/
  gamelog/_/id/30892/year/2017.

ESPN. Anibal sanchez game-by-game stats, 2019e. URL http://www.espn.com/mlb/player/
  gamelog/_/id/6472/year/2015.

ESPN. Huston street game-by-game stats, 2019f. URL http://www.espn.com/mlb/player/
  gamelog/_/id/6175/huston-street.

ESPN. Mike trout game-by-game stats, 2019g. URL http://www.espn.com/mlb/player/
  gamelog/_/id/30836/year/2017.

ESPN. Justin verlander game-by-game stats, 2019h. URL http://www.espn.com/mlb/player/
  gamelog/_/id/6341/year/2015.

ESPN. Yadier molina game-by-game stats, 2019i. URL http://www.espn.com/mlb/player/
  gamelog/_/id/5986/year/2014.

Ryan Fagan. Which active mlb players are hall of fame bound? the case for 15, 2015.
  URL http://www.sportingnews.com/us/mlb/list/
  baseball-hall-of-fame-2015-voting-players-pujols-cabrera-kershaw-trout-posey-beltre-beltran/
  dzae73sytib612elvquapw62q.

John Fleming. How does yadier molina compare to 2017s hall of fame ballot catchers?, 2017.
  URL https://www.vivaelbirdos.
  com/st-louis-cardinals-sabermetrics-analysis/2017/1/20/14275092/
  cardinals-yadier-molina-2017-hall-of-fame-catchers-ivan-rodriguez-pudge-jorge-posada-jason-v

Jenifer Langosch. Motte impresses in return to the mound, 2014. URL https://www.mlb.com/
  news/cardinals-jason-motte-impresses-in-return-to-the-mound/c-76353834.

Matthew B. Mowery. Numbers added up for an abysmal season for tigers in 2015, 2015.
 URL https://www.theoaklandpress.com/
 sports/numbers-added-up-for-an-abysmal-season-for-tigers-in/article_
 15663b9d-d334-538d-901a-8654dec6f70f.html.

St. Louis Cardinals Department of Communications. Game notes, 2014. URL http://cardinals.
  mlb.com/documents/8/9/4/75903894/51814_Layout_1_5dhyviqx.pdf.

Joe Posnanski. Kids in the hall, 2015. URL https://sportsworld.nbcsports.com/
  active-baseball-players-hall-of-fame-chances/.

Baseball Prospectus. View details for warp, 2019. URL https://legacy.baseballprospectus.
  com/glossary/index.php?search=WARP.

Baseball Reference. Baseball-reference.com war explained, 2010. URL https://www.
  baseball-reference.com/about/war_explained.shtml.

Baseball Reference. 2017 los angeles angels trades and transactions, 2019a. URL https://www.
  baseball-reference.com/teams/LAA/2017-transactions.shtml.

Baseball Reference. 2014 st. louis cardinals trades and transactions, 2019b. URL https://www.
  baseball-reference.com/teams/STL/2014-transactions.shtml.

Baseball Reference. 2014 st. louis cardinals schedule, 2019c. URL https://www.
  baseball-reference.com/teams/STL/2014-schedule-scores.shtml.

Baseball Reference. 2015 detroit tigers trades and transactions, 2019d. URL https://www.
  baseball-reference.com/teams/DET/2015-transactions.shtml.

Ben Reiter. Mike trout is injured, but the angels have only improved during his absence, 2017. URL
  https://www.si.com/mlb/2017/06/29/los-angeles-angels-winning-mike-trout-injury.

Joe Schwarz. An update on the pitch framing of yadier molina, 2015. URL https:
  //www.vivaelbirdos.com/st-louis-cardinals-sabermetrics-analysis/2015/6/25/
  8842869/an-update-on-the-pitch-framing-of-yadier-molina.

Steve Slowinski. What is war? fangraphs, 2010. URL https://library.fangraphs.com/misc/
  war/.

Wikipedia. 2014 st. louis cardinals season, 2019. URL https://en.wikipedia.org/wiki/2014_
 St._Louis_Cardinals_season.

Graham Womack. Yadier molina is a clear hall of famer to cardinals pitchers, 2017.
  URL http://www.sportingnews.com/us/mlb/news/
  yadier-molina-hall-of-fame-cardinals-catcher-stats-pitch-framing/
  xkxgjg96e1ck1eqaiqqlyw2a3.

© 2019 Daniel J. Eck

This note is the intellectual property of Daniel J. Eck and terms of its redistribution are subject to
the Creative Commons license:

          Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

For details on this license, see https://creativecommons.org/licenses/by-nc-sa/4.0/
