DWAR: introducing a method to actually calculate wins above replacement - Yale CampusPress
dWAR: introducing a method to actually calculate wins above replacement

Daniel J. Eck

March 3, 2019

1 Introduction

Wins above replacement (WAR) is meant to be a one-number summary of the total contribution made by a player for his team in any particular season. As stated by Steve Slowinski of Fangraphs, WAR offers an estimate to answer the question, “If this player got injured and their team had to replace them with a freely available player of lower quality from their bench, how much value would the team be losing,” where this value is expressed in number of wins [Slowinski, 2010]. That being said, nobody actually calculates WAR in a manner that properly answers the above question as posed. This is not the fault of the metric or of those who calculate it. One problem is that it is impossible to simultaneously quantify the value of a player when the player is available and the value of a replacement to that player when the player is unavailable. The player in question is either available to play or unavailable to play, never both. Instead of confronting the problems raised in this factual-counterfactual world, people have attempted to construct a hypothetical replacement player, against which every player is implicitly compared, using the machinery of a proprietary black box [Baumer et al., 2015]. Three widely used versions of WAR that are calculated in this manner are Baseball Reference’s bWAR [Reference, 2010], Fangraphs’s fWAR [Slowinski, 2010], and Baseball Prospectus’s BWARP [Prospectus, 2019].

In this note, we propose a direct estimator of wins above replacement that confronts the difficulty of the factual-counterfactual real world. Note that there are numerous examples of seasons in which a player is available for a substantial amount of time and unavailable for a substantial amount of time. When this is so, we can directly compare how the team performs when the player is available to how the team performs when the player is unavailable.
This framework allows for a direct estimation of the wins that a player adds above a replacement player. This direct estimator is relatively simple to compute, readily available, and easy to understand, and its interpretation is flexible to the narrative of a season. We refer to this direct calculation of WAR as dWAR, short for direct WAR, that is, the direct calculation of wins above replacement. The dWAR estimator has the potential to yield a much more natural and appropriate estimate of WAR than those which involve the calculation of a hypothetical replacement player via black box methodology. The validity of dWAR depends on the team and the competition faced by the team being similar during both player states. Very nuanced interpretations of what dWAR measures emerge when team makeup is confounded with the availability of the player in question.
We primarily focus on the 2014 Yadier Molina season to show the discrepancies between conventional calculations of WAR and dWAR. Our version of WAR gives much more value to Yadier Molina’s 2014 season than conventional versions. This result is far from surprising. Many note that conventional versions of WAR do not properly account for leadership, game management, pitch framing, and catcher defense, which are all aspects of baseball that Molina excels at [Fagan, 2015, Posnanski, 2015, Schwarz, 2015, Fleming, 2017, Womack, 2017]. That being said, a tangible numeric value of the additional Cardinals wins attributable to Molina as a result of these intangible traits has not existed until now. We caution against generalizing our findings beyond the 2014 season with certainty, but we hope that the point is taken and can be used to strengthen Molina’s case for the Hall of Fame. The point is that conventional versions of WAR have likely underestimated the number of Cardinals wins attributable to Yadier Molina by a substantial amount.

Additional analyses are provided for Miguel Cabrera’s 2015 season with the Detroit Tigers and Mike Trout’s 2017 season with the Los Angeles Angels. These specific players are chosen because of the 2012 most valuable player (MVP) race between them, which is symbolic of the fight between those who favor new sophisticated analytics to value a player’s production and those who favor traditional analytics to value a player’s production. As noted in Baumer et al. [2015], sabermetricians from the new school advocated strongly for Trout while those who preferred traditional statistics advocated strongly for Cabrera. To the adherents of sabermetrics, the decision for who should win the 2012 MVP award was clear: point estimates showed Trout leading Cabrera by 3.2 fWAR and 3.6 bWAR. The openWAR metric in Baumer et al. [2015] provided far more sophistication to this debate.
According to openWAR, the estimated difference between Trout and Cabrera is only 1.05 WAR in Trout’s favor. Moreover, there is substantial overlap of the interval estimates of Trout’s and Cabrera’s openWAR. We do not provide a dWAR estimate of these players’ WAR in 2012 because these players did not miss a significant portion of the 2012 season, which voids comparisons to a suitable replacement player under the dWAR framework. However, Cabrera in 2015 and Trout in 2017 were archetypically similar players to their 2012 selves. Our dWAR estimates of WAR for 2015 Miguel Cabrera and 2017 Mike Trout give the opposite impression from the one that conventional WAR and openWAR give of these players’ respective values to their teams. In 2015, the Detroit Tigers were far worse when Miguel Cabrera did not play or was injured. However, the 2017 Los Angeles Angels were not terribly hindered by the absence of Mike Trout’s production. These findings are striking (especially for Trout) and they come with natural caveats arising from the context of those seasons. These caveats are explored.

2 Data Analyses

2.1 Yadier Molina in 2014

Yadier Molina played in 110 regular season games for the St. Louis Cardinals in the 2014 baseball season out of a possible 162. The St. Louis Cardinals won a total of 90 games and lost a total of 72 games. When Yadier Molina played, the Cardinals won 62 games and lost 48 games. When Yadier Molina did not play, the Cardinals won 28 games and lost 24 games [ESPN, 2019i]. The games in which he did not play are split among games in which he was healthy but did not enter the game and games in which he was injured and unavailable to play in any capacity. The former category
represents a strategic decision that involves a healthy baseball player; the latter category represents an unplanned incident in which team strategic decisions have to change due to the unexpected loss of a player. When Yadier Molina did not play but was available, the Cardinals won 7 games and lost 5 games. When Yadier Molina was injured, the Cardinals won 21 games and lost 19 games. Yadier Molina’s WAR in 2014 was 2.5 as estimated by BWARP, 2.9 as estimated by fWAR, and 3.1 as estimated by bWAR.

We now motivate three versions of a player’s WAR that directly answer the question as posed in the Introduction. We first compare how the team did in games in which the player played to those in which the player did not play, denoted as $\widehat{\mathrm{dWAR}}_{\text{on-off}}$. This estimate of WAR is calculated as

\[ \widehat{\mathrm{dWAR}}_{\text{on-off}} = (\hat{p}_{\text{on}} - \hat{p}_{\text{off}}) \times G, \]

where $\hat{p}_{\text{on}}$ is the proportion of team games won when the player played, $\hat{p}_{\text{off}}$ is the proportion of team games won when the player did not play, and $G$ is the total number of games in which the player played. We then compare how the team did in games in which the player was available to play to those in which the player was unavailable to play, denoted $\widehat{\mathrm{dWAR}}_{\text{avail-unavail}}$. This estimate of WAR is calculated as

\[ \widehat{\mathrm{dWAR}}_{\text{avail-unavail}} = (\hat{p}_{\text{avail}} - \hat{p}_{\text{unavail}}) \times G, \]

where $\hat{p}_{\text{avail}}$ is the proportion of team games won when the player played or was available to play and $\hat{p}_{\text{unavail}}$ is the proportion of team games won when the player was unavailable to play. We finally compare how the team did in games in which the player played to those in which the player was unavailable to play, denoted $\widehat{\mathrm{dWAR}}_{\text{on-unavail}}$. This estimate of WAR is calculated as

\[ \widehat{\mathrm{dWAR}}_{\text{on-unavail}} = (\hat{p}_{\text{on}} - \hat{p}_{\text{unavail}}) \times G. \]

These estimates of dWAR for Yadier Molina’s 2014 season are

\begin{align*}
\widehat{\mathrm{dWAR}}_{\text{on-off}} &= (62/110 - 28/52) \times 110 = 2.77, \\
\widehat{\mathrm{dWAR}}_{\text{avail-unavail}} &= (69/122 - 21/40) \times 110 = 4.46, \\
\widehat{\mathrm{dWAR}}_{\text{on-unavail}} &= (62/110 - 21/40) \times 110 = 4.25.
\end{align*}
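The three estimates for Molina's 2014 season can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic, not code from the note: the function name `dwar` and the variable names are illustrative assumptions, and `G` is taken to be the 110 games in which Molina played, as in the calculations above.

```python
from fractions import Fraction


def dwar(p_with, p_without, games):
    """Difference in team win proportions, scaled by games (hypothetical helper)."""
    return float((p_with - p_without) * games)


# 2014 Cardinals win proportions by Molina's status, as quoted in the text.
p_on = Fraction(62, 110)      # games in which Molina played
p_off = Fraction(28, 52)      # games in which Molina did not play
p_avail = Fraction(69, 122)   # games in which Molina played or was available
p_unavail = Fraction(21, 40)  # games in which Molina was injured
G = 110                       # games in which Molina played

molina = {
    "on-off": dwar(p_on, p_off, G),
    "avail-unavail": dwar(p_avail, p_unavail, G),
    "on-unavail": dwar(p_on, p_unavail, G),
}

for name, value in molina.items():
    print(f"dWAR {name}: {value:.2f}")
```

Exact fractions are used for the proportions so that rounding happens only once, when the final value is formatted.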
The versions of dWAR that compare team success when Yadier Molina is available, and his playing time is subject to management’s decision, to team success when Yadier Molina is unavailable are drastically different from conventional calculations of this metric. The discrepancy between these approaches is anywhere between 1.15 and 1.96 wins depending on which estimates of WAR are being compared. This discrepancy has a large effect on interpretation. Slowinski [2010] provided a rule-of-thumb guideline for interpreting WAR. According to these guidelines, a WAR between 2.5 and 3.1 corresponds to a player that is anywhere from a solid starting player to a good player, while a WAR between 4.25 and 4.46 corresponds to an all-star level player that performed near the top of the league.

The following are the notable injuries during the 2014 Cardinals season: Yadier Molina was injured from July 10th until August 27th [ESPN, 2019i]; Jaime García made his first start on May 18 [of Communications, 2014]; Jason Motte made his return on May 21 [Langosch, 2014]; Jaime García announced on July 5 that he would have season-ending surgery [Wikipedia, 2019]. The following
are notable transactions made by the Cardinals in 2014: on July 11, the Cardinals claimed George Kottaras off waivers from the Cleveland Indians as a possible replacement for Yadier Molina; on July 26, the Cardinals signed A.J. Pierzynski as a free agent as a possible replacement for Yadier Molina; on July 29, the Cardinals released George Kottaras; on July 31, the Cardinals traded Allen Craig and Joe Kelly for Corey Littrell, John Lackey and cash [Reference, 2019b]. We do not find that these injuries and transactions account for the massive discrepancies between our dWAR estimator and the conventional approaches. The Cardinals faced similar competition when Yadier Molina was unavailable and when Yadier Molina was available [Reference, 2019c]. We conclude that $\widehat{\mathrm{dWAR}}_{\text{on-off}}$, $\widehat{\mathrm{dWAR}}_{\text{avail-unavail}}$, and $\widehat{\mathrm{dWAR}}_{\text{on-unavail}}$ are preferable to conventional versions of WAR when assessing the value of Yadier Molina to the Cardinals in the 2014 season.

2.2 Miguel Cabrera in 2015

We perform the same analysis for Miguel Cabrera’s 2015 season with the Detroit Tigers. Miguel Cabrera played in 119 games in 2015 and we estimate $\hat{p}_{\text{on}} = 57/119$, $\hat{p}_{\text{off}} = 17/42$, $\hat{p}_{\text{avail}} = 59/126$, and $\hat{p}_{\text{unavail}} = 15/35$ [ESPN, 2019a]. Conventional estimates of WAR for Miguel Cabrera’s 2015 season are

bWAR = 5.2, fWAR = 4.6, BWARP = 5.3.

Our estimates of dWAR for Miguel Cabrera’s 2015 season are

\begin{align*}
\widehat{\mathrm{dWAR}}_{\text{on-off}} &= (57/119 - 17/42) \times 119 = 8.83, \\
\widehat{\mathrm{dWAR}}_{\text{avail-unavail}} &= (59/126 - 15/35) \times 119 = 4.72, \\
\widehat{\mathrm{dWAR}}_{\text{on-unavail}} &= (57/119 - 15/35) \times 119 = 6.00.
\end{align*}

We take a look at significant injuries and transactions made by the Tigers in 2015.
The following are significant injuries: Miguel Cabrera was injured from July 4th until August 12th [ESPN, 2019a]; Justin Verlander was injured until June 13th [ESPN, 2019h]; Victor Martinez was injured from May 14th until June 17th [ESPN, 2019c]; Jose Iglesias was injured from September 4th until October 4th [ESPN, 2019b]; Anibal Sanchez was injured from August 18th until October 4th [ESPN, 2019e]. The following are significant transactions: on July 6, the Tigers claimed Marc Krauss off waivers from the Tampa Bay Rays as a possible Cabrera replacement; on July 30, the Tigers traded David Price for Daniel Norris, Matt Boyd and Jairo Labourt; on July 30, the Tigers traded Joakim Soria for shortstop JaCoby Jones; on July 31, the Tigers traded Yoenis Cespedes for Michael Fulmer and Luis Cessa; on August 20, the Tigers acquired Randy Wolf from the Toronto Blue Jays for cash considerations [Reference, 2019d].

We find that these injuries and transactions result in some key differences in the makeup of the 2015 Tigers between the stretches when Cabrera was available and the stretches when he was unavailable. The strongest team makeup, excluding the availability of Cabrera, was from June 17th until the trades that began July 30th. Outside of this stretch, dWAR would maintain validity if the effect of the unavailability of Victor Martinez and Justin Verlander is equal to the balance of the departures of David Price, Joakim Soria, and Yoenis Cespedes and the arrivals of Daniel Norris, Matt Boyd, Jairo Labourt, JaCoby
Jones, Michael Fulmer, and Luis Cessa. Perhaps this is so, in which case dWAR retains its validity. Perhaps this is not so, and a more thorough analysis is required.

Another striking feature of this analysis is the differences between $\widehat{\mathrm{dWAR}}_{\text{on-off}}$, $\widehat{\mathrm{dWAR}}_{\text{avail-unavail}}$, and $\widehat{\mathrm{dWAR}}_{\text{on-unavail}}$. These differences are partially explained by the 2015 Tigers winning 2 games and losing 5 when Cabrera was available to play but did not enter the game. Perhaps management picked poor spots to rest Miguel Cabrera, or perhaps this abysmal performance is explained by the inherent volatility of the 2015 Tigers season [Mowery, 2015] and the small number of games in which Cabrera was available to play but did not enter the game. We think that it is likely that $\widehat{\mathrm{dWAR}}_{\text{avail-unavail}}$ underestimates 2015 Miguel Cabrera’s value to the 2015 Tigers and that $\widehat{\mathrm{dWAR}}_{\text{on-off}}$ overestimates 2015 Miguel Cabrera’s value to the 2015 Tigers. In any event, Miguel Cabrera was a phenomenal hitter in 2015 and all estimates of WAR pick up on this. That being said, it appears that conventional estimates of WAR underestimated the number of wins that Cabrera’s hitting brought to the Detroit Tigers in 2015.

2.3 Mike Trout in 2017

We perform the same analysis for Mike Trout’s 2017 season with the Los Angeles Angels. Mike Trout played in 114 games in 2017 and we estimate $\hat{p}_{\text{on}} = 57/114$, $\hat{p}_{\text{off}} = 23/48$, $\hat{p}_{\text{avail}} = 59/118$, and $\hat{p}_{\text{unavail}} = 21/44$ [ESPN, 2019g]. Conventional estimates of WAR for Mike Trout’s 2017 season are

bWAR = 6.7, fWAR = 6.9, BWARP = 6.2.

Our estimates of dWAR for Mike Trout’s 2017 season are

\begin{align*}
\widehat{\mathrm{dWAR}}_{\text{on-off}} &= (57/114 - 23/48) \times 114 = 2.38, \\
\widehat{\mathrm{dWAR}}_{\text{avail-unavail}} &= (59/118 - 21/44) \times 114 = 2.59, \\
\widehat{\mathrm{dWAR}}_{\text{on-unavail}} &= (57/114 - 21/44) \times 114 = 2.59.
\end{align*}
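The Cabrera and Trout estimates can also be rebuilt from three win-loss records per player (played, rested while available, unavailable), which makes explicit how the pooled proportions arise. The sketch below is illustrative only: the records are derived here from the proportions quoted in the text (only Cabrera's 2-5 rested record is stated directly), and the names `records`, `win_prop`, and `dwar_estimates` are assumptions, not part of the note.

```python
from fractions import Fraction

# (wins, losses) by player status; "rested" = available but did not enter the game.
# Derived from the quoted proportions, e.g. Cabrera unavailable 15/35 -> (15, 20).
records = {
    "Cabrera 2015": {"played": (57, 62), "rested": (2, 5), "unavailable": (15, 20)},
    "Trout 2017": {"played": (57, 57), "rested": (2, 2), "unavailable": (21, 23)},
}


def win_prop(*states):
    """Pooled win proportion over one or more (wins, losses) records."""
    wins = sum(w for w, _ in states)
    games = sum(w + losses for w, losses in states)
    return Fraction(wins, games)


def dwar_estimates(rec):
    """Return the three dWAR variants for one player-season."""
    played, rested, unavail = rec["played"], rec["rested"], rec["unavailable"]
    g = sum(played)  # games in which the player played
    p_on = win_prop(played)
    p_off = win_prop(rested, unavail)   # did not play, for any reason
    p_avail = win_prop(played, rested)  # played or could have played
    p_unavail = win_prop(unavail)
    return {
        "on-off": float((p_on - p_off) * g),
        "avail-unavail": float((p_avail - p_unavail) * g),
        "on-unavail": float((p_on - p_unavail) * g),
    }


results = {name: dwar_estimates(rec) for name, rec in records.items()}
for name, est in results.items():
    print(name, {k: f"{v:.2f}" for k, v in est.items()})
```

Pooling the rested and unavailable records reproduces the quoted off-field proportions (17/42 for Cabrera, 23/48 for Trout), so Cabrera's 2-5 rested record lowers his off-field proportion and therefore enters the on-off estimate but not the on-unavail estimate.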
The dWAR estimates that compare team success when Mike Trout is available, and his playing time is subject to management’s decision, to team success when Mike Trout is unavailable are drastically different from conventional calculations of this metric. The differences between the two approaches are massive and are not in Mike Trout’s favor.

As before, we take a look at significant injuries and transactions made by the Angels in 2017. The following are significant injuries: Mike Trout was injured from May 6th until May 10th and from May 28th until July 9th [ESPN, 2019g]; many key Angels players played very limited time in 2017, including Garrett Richards and Huston Street [ESPN, 2019d,f]. The following are significant transactions: on June 1, the Angels signed Michael Bourn as a free agent as a possible Mike Trout replacement; on July 2, the Angels released Michael Bourn; on August 31, the Angels traded Elvin Rodriguez and Grayson Long for Justin Upton [Reference, 2019a]. We do not find that these injuries and transactions account for the massive discrepancies between dWAR and the conventional approaches. However, Reiter [2017] notes that the Angels’ relative success in Trout’s absence is mind-boggling, especially given that the Angels faced relatively tough
competition during that stretch. He attributes the team success in Trout’s absence to the rest of the players coming together and clicking the moment Trout got hurt. We find this explanation to be interesting and plausible. We also find it plausible that Mike Trout is not the type of leader who can instill motivation in other players when looked upon to do so. We do not have enough information to properly distinguish causes for this very strange “Mike Trout effect.”

3 Discussion

We hope that dWAR can help strengthen Yadier Molina’s deserving Hall of Fame case. It is entirely within the realm of possibility that conventional calculations of WAR will have underestimated Yadier Molina’s career WAR by 10 to 30 wins when he retires.

The 2017 Mike Trout season was truly mind-boggling and it illustrates the collective shortcomings of all metrics, including dWAR. The dWAR estimator of Mike Trout’s value to the 2017 Angels is valid in the sense that the 2017 Angels performed almost as well without Mike Trout as they performed with him. That is what happened and it is not debatable. However, using this as a final determination of Mike Trout’s value is very suspect. Perhaps the “Mike Trout effect” is real and perhaps it is a result of lackadaisical, uninspired teammates, poor leadership on Mike Trout’s part, poor team management, all of the above, or possibly some other explanation. Even if this effect exists, it has only been shown to exist with respect to the 2017 Angels season and it may not exist in any other year or on any other team. Conventional indirect estimates of WAR provide a one-number summary of how statistically dominant Mike Trout was and currently is. To what extent this raw, context-neutral statistical account of Mike Trout’s greatness can translate to actual wins above replacement is an honest conversation worth having by fans, historians, scouts, sabermetricians, the baseball media, and team executives alike.
As it currently stands, dWAR can only validly estimate WAR for a player during a season in which the player spent a substantial time in both the available and unavailable states. Since the calculation of dWAR is constrained to one season in isolation, it cannot be used to properly compare players across eras, as noted in Eck [2018]. Valid era-invariant extensions to dWAR for all player seasons can be made with a suitable era-invariant model of the probabilities of team success that are central to the calculation of dWAR. This is not a trivial task and we speculate that strong structural assumptions in a partially Bayesian latent variable modeling framework would be required to achieve believable era-invariance.

We would like to emphasize that traditional considerations of value including leadership ability, winning attitude, clutch performances, great defense, and other intangibles are important supplements to statistical accounts of value. Our metric dWAR can contribute to this emphasis by measuring the extent to which an aggregation of these previously unmeasurable accounts of value affects the additional number of wins a particular player produces for his team over that of a suitable and available replacement player. If intangible value actually helps a team win when a player is on the field, then dWAR can find it when conventional calculations of WAR such as bWAR, fWAR, and BWARP will not. These conventional calculations of WAR are not actually in the business of directly calculating wins above replacement; dWAR is.
References

Benjamin S. Baumer, Shane T. Jensen, and Gregory J. Matthews. Openwar: An open source system for evaluating overall player performance in major league baseball. Journal of Quantitative Analysis in Sports, 11(2):1–27, 2015.

Daniel J. Eck. Challenging nostalgia and performance metrics in baseball. ArXiv Preprint, 2018. URL https://arxiv.org/abs/1810.08029.

ESPN. Miguel cabrera game-by-game stats, 2019a. URL http://www.espn.com/mlb/player/gamelog/_/id/5544/year/2015.

ESPN. Jose iglesias game-by-game stats, 2019b. URL http://www.espn.com/mlb/player/gamelog/_/id/30382/year/2015.

ESPN. Victor martinez game-by-game stats, 2019c. URL http://www.espn.com/mlb/player/gamelog/_/id/5007/year/2015.

ESPN. Garrett richards game-by-game stats, 2019d. URL http://www.espn.com/mlb/player/gamelog/_/id/30892/year/2017.

ESPN. Anibal sanchez game-by-game stats, 2019e. URL http://www.espn.com/mlb/player/gamelog/_/id/6472/year/2015.

ESPN. Huston street game-by-game stats, 2019f. URL http://www.espn.com/mlb/player/gamelog/_/id/6175/huston-street.

ESPN. Mike trout game-by-game stats, 2019g. URL http://www.espn.com/mlb/player/gamelog/_/id/30836/year/2017.

ESPN. Justin verlander game-by-game stats, 2019h. URL http://www.espn.com/mlb/player/gamelog/_/id/6341/year/2015.

ESPN. Yadier molina game-by-game stats, 2019i. URL http://www.espn.com/mlb/player/gamelog/_/id/5986/year/2014.

Ryan Fagan. Which active mlb players are hall of fame bound? the case for 15, 2015. URL http://www.sportingnews.com/us/mlb/list/baseball-hall-of-fame-2015-voting-players-pujols-cabrera-kershaw-trout-posey-beltre-beltran/dzae73sytib612elvquapw62q.

John Fleming. How does yadier molina compare to 2017s hall of fame ballot catchers?, 2017. URL https://www.vivaelbirdos.com/st-louis-cardinals-sabermetrics-analysis/2017/1/20/14275092/cardinals-yadier-molina-2017-hall-of-fame-catchers-ivan-rodriguez-pudge-jorge-posada-jason-v

Jenifer Langosch. Motte impresses in return to the mound, 2014. URL https://www.mlb.com/news/cardinals-jason-motte-impresses-in-return-to-the-mound/c-76353834.

Matthew B. Mowery. Numbers added up for an abysmal season for tigers in 2015, 2015. URL https://www.theoaklandpress.com/sports/numbers-added-up-for-an-abysmal-season-for-tigers-in/article_15663b9d-d334-538d-901a-8654dec6f70f.html.

St. Louis Cardinals Department of Communications. Game notes, 2014. URL http://cardinals.mlb.com/documents/8/9/4/75903894/51814_Layout_1_5dhyviqx.pdf.

Joe Posnanski. Kids in the hall, 2015. URL https://sportsworld.nbcsports.com/active-baseball-players-hall-of-fame-chances/.

Baseball Prospectus. View details for warp, 2019. URL https://legacy.baseballprospectus.com/glossary/index.php?search=WARP.

Baseball Reference. Baseball-reference.com war explained, 2010. URL https://www.baseball-reference.com/about/war_explained.shtml.

Baseball Reference. 2017 los angeles angels trades and transactions, 2019a. URL https://www.baseball-reference.com/teams/LAA/2017-transactions.shtml.

Baseball Reference. 2014 st. louis cardinals trades and transactions, 2019b. URL https://www.baseball-reference.com/teams/STL/2014-transactions.shtml.

Baseball Reference. 2014 st. louis cardinals schedule, 2019c. URL https://www.baseball-reference.com/teams/STL/2014-schedule-scores.shtml.

Baseball Reference. 2015 detroit tigers trades and transactions, 2019d. URL https://www.baseball-reference.com/teams/DET/2015-transactions.shtml.

Ben Reiter. Mike trout is injured, but the angels have only improved during his absence, 2017. URL https://www.si.com/mlb/2017/06/29/los-angeles-angels-winning-mike-trout-injury.

Joe Schwarz. An update on the pitch framing of yadier molina, 2015. URL https://www.vivaelbirdos.com/st-louis-cardinals-sabermetrics-analysis/2015/6/25/8842869/an-update-on-the-pitch-framing-of-yadier-molina.

Steve Slowinski. What is war? fangraphs, 2010. URL https://library.fangraphs.com/misc/war/.

Wikipedia. 2014 st. louis cardinals season, 2019. URL https://en.wikipedia.org/wiki/2014_St._Louis_Cardinals_season.

Graham Womack. Yadier molina is a clear hall of famer to cardinals pitchers, 2017. URL http://www.sportingnews.com/us/mlb/news/yadier-molina-hall-of-fame-cardinals-catcher-stats-pitch-framing/xkxgjg96e1ck1eqaiqqlyw2a3.

© 2019 Daniel J. Eck

This note is the intellectual property of Daniel J. Eck and terms of its redistribution are subject to the Creative Commons license: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). For details on this license, see https://creativecommons.org/licenses/by-nc-sa/4.0/