Comparing the Impact of Star Rookies Carmelo Anthony and Lebron James: An Example on Simulating Team Performances in the NBA League

Salwa Ammar
Ronald Wright
Department of Business Administration
Le Moyne College
Syracuse, New York
Ammar@lemoyne.edu
Wright@lemoyne.edu

Abstract
This paper describes a simulation exercise designed for introductory quantitative methods classes at both the MBA and undergraduate levels. The exercise tracks the performances of teams in the National Basketball Association (NBA) during the season of 2004. It is designed as a spreadsheet model and is developed in stages throughout the academic semester. The example is a significant illustration of the use of sports as a vehicle for teaching OR topics, specifically simulation. It also incorporates many spreadsheet modeling skills such as the use of Excel functions and Data Tables. The model provided a good mix of the ingredients for an effective simulation example including a variety of answers, challenges and surprises.

Editor's note: This is a pdf copy of an html document which resides at http://ite.pubs.informs.org/Vo5No1/AmmarWright/

INFORMS Transactions on Education 5:1(67-74) © INFORMS ISSN: 1532-0545

1. Introduction

For years, sports have been used in teaching Statistics (Lock, 1997 and Nettleton, 1998). More recently, sports have been used as a motivating context for introducing simulation (Ammar and Wright, 2001). The probabilistic nature of the competitive outcomes, the desire to predict these outcomes, and the variety among the relationships between the outcomes, provide a rich array of applications for simulation models. Excel spreadsheets give instructors the ability to move away from 'toy' examples and introduce, in a manageable format, real and meaningful illustrations (Evans, 2000). The most effective sports related examples capitalize on current events of general interest that stimulate curiosity and inquisitiveness beyond those who may be considered diehard fans.

This paper describes an exercise that predicts the impact of two rookies in the National Basketball Association (NBA). One of the rookies, LeBron James, was the number one draft in the NBA for 2003 and generated much media attention. The second rookie, Carmelo Anthony, led his college team for Syracuse University to its first national championship in basketball. The championship was by far the most important sporting result for the central New York region in recent years. Thus following the immediate progress of this local 'hero' guaranteed for us the broad student interest. Student interest and curiosity were essential in sustaining their enthusiasm as we explored various modeling tools and concepts for the simulation model. Although specific to these rookies and their draft teams, the exercise can easily be generalized and updated to predict performances of any teams in the NBA.

In this paper we demonstrate the details of the simulation model using Carmelo Anthony's team, the Denver Nuggets. In the previous year and prior to his draft the Nuggets won only 17 out of the season's 82 games. By the all-star game of 2004 and with the help
of Anthony, the Nuggets had already won 31 games and seemed to be well on their way to the playoffs. The example described in this paper joins the league at the point of the all-star game (after 55 games) and includes a Monte-Carlo simulation for the performance of the Nuggets in the remaining games of the season. In determining the playoff chances of the Nuggets, the simulation includes the performances of the team's nearest competitors. Similar assessments are also performed for the Cleveland Cavaliers, the team that drafted LeBron James.

The exercise is designed to help achieve several objectives. To run a successful simulation the first step tends to focus on defining the relevant probabilities. The example introduces a method for estimating probabilities that is intuitively acceptable and follows the rules of probability. In this paper the process of estimating the probabilities is simple (by design) in order to avoid the need for extensive discussion and coverage of advanced topics in probability theory. Another objective of this example is to introduce students to the simulation capabilities of Excel including the use of data tables to replicate observations. Also, this example introduces the concept of simulating events that are related. For example simulating the outcome of the Seattle at Denver game determines the outcome of the Denver home game as well as the outcome of the Seattle away game. Another important objective of the example is to demonstrate for students how simulation can be used to provide answers to questions beyond the specific simulated events. Simulating the outcomes of games for the season allows us to explore the chances of the team returning for the playoffs. The final part of this paper focuses on the process of assessing and updating the estimated probabilities as the season progresses and games are won and lost.

This example is used in our management science class to introduce concepts and basic skills in spreadsheet simulation. This is a core junior level class required of students majoring in business and accounting. Students entering this class are expected to have completed the introductory statistics requirement. The simulation is done entirely in Excel. Large numbers of trials are executed using data tables and other useful functions in Excel. The use of add-ins such as Crystal Ball and @Risk is introduced at the higher (or senior) level simulation class and is not needed for this particular exercise.

2. Denver Simulation Model

The model involves an attempt to simulate the remaining 27 games for the Denver Nuggets. Once the outcome of these games is simulated the next step is to assess the Nuggets' chances of making the playoffs. This process is designed in several stages. The first stage is that of estimating the winning probabilities (percentage) by team in the league. It is important to recognize that these probabilities vary for each team depending on whether the game is played at home or away. The second stage is to simulate the number of Denver wins for the remaining scheduled season based on the estimated probabilities. Finally in assessing Denver's chances of reaching the playoffs, the model examines Denver's nearest (slightly above or slightly below) competitors by simulating each of their performances and comparing Denver's performance.

2.1. Estimating Winning Probabilities

The probabilities could be estimated by using information on teams' performances in the previously played games. In the first 55 games of the season, Denver's winning percent was .582. (Note: percent is typically referred to on sport pages as a number between 0 and 1. For convenience we maintain this convention for the data in this paper). We could use .582 as the probability of winning each game. However, in the NBA there is a significant difference between winning percents for home games and for games on the road. Table 1 shows Denver's home and road winning percentages as well as the average of the entire league. It also shows the same percentages for two select teams (for purpose of illustration).

Table 1: Select Winning Percentages

Since Denver has won 72% of its home games we could begin by assigning a probability of .72 to Denver winning a future home game. However, the quality of the opponent would raise or lower that probability
for a particular game. For example it might be reasonable to assign a probability of .72 to Denver beating New York at home since New York's road winning percentage matches that of the league. That is, New York is an average road team. If the probability that Denver beats New York at home is .72, then the probability that New York wins that game would be 1 − .72 or .28. This is lower than New York's road average, indicating that Denver is a better than average home team. In fact the extent to which New York's probability of winning is reduced (.38 to .28) of course matches the extent to which Denver is a better than average home team (.72 − .62). A possible generalization follows.

Consider two teams playing a particular game, team A, the home team and team B, the road team.

Let $H_{per}$ = proportion Team A wins when playing at home.

Let $R_{per}$ = proportion Team B wins when playing on the road.

Let $LH_{per}$ = average proportion all league teams win when playing at home.

Let $LR_{per}$ = average proportion all league teams win when playing on the road.

We can then estimate the probability that the road team wins (team B over Team A) as:

$R_{per} - (H_{per} - LH_{per})$. (1)

In our example, the estimated probability that New York beats Denver on the road is:

.38 − (.72 − .62) = .28.

Similarly we can estimate the probability that the home team wins as:

$H_{per} - (R_{per} - LR_{per})$. (2)

In our example, the estimated probability that Denver beats New York at home is:

.72 − (.38 − .38) = .72.

Also, if Denver is playing Indiana at home the probability of Denver winning is:

.72 − (.67 − .38) = .43, a lower value since Indiana is a better than average road team.

It is important to check that the estimates for probabilities of complementary events add to one. If we add the probability that the road team wins (as calculated in (1)) to the probability that the home team wins (as calculated in (2)) we get:

$[R_{per} - (H_{per} - LH_{per})] + [H_{per} - (R_{per} - LR_{per})] = LH_{per} + LR_{per} = 1$.

The last equality holds because every game is won by either the home team or the road team, so the league-wide home and road proportions sum to one (.62 + .38 = 1 in our data). This relationship suggests that equations (1) and (2) provide possible estimates for winning probabilities. These estimates take into account the relative position of any particular team in the league.

2.2. Modeling the Number of Denver Wins

Figure 1 shows Denver's simulation sheet(1). It includes Denver's schedule and the appropriate home or road records of each opponent. Average league results are also included. Denver's schedule and the league standings at the time of this example were downloaded from a popular sports site. Denver's win percentages and the league win percentages were included. Using equations (1) and (2) (and IF statements for home or road), the probability of Denver winning each game is calculated.

(1) http://ite.pubs.informs.org/Vol5No1/AmmarWright/denver.xls

Figure 1: Denver Simulation

The outcome for each game is simulated by using RAND, the Excel random number function for a uniform distribution between 0 and 1. If the random number is less than the probability of Denver winning that game a "win" is recorded. Otherwise "lose" is entered in the simulation column. (Results in highlighted
cells are explained in the following section.) A COUNTIF statement is used to count the number of simulated wins. The total number of wins is the simulated number of wins plus the actual wins at the time of the example.

Each time a recalculation is done, new random numbers are generated and new simulation results recorded. The results of multiple simulations can be recorded using Excel's Data Table (Evans and Olson, 2002). For a 1000-run simulation we found the average number of wins to be 46 with a minimum of 39 and a maximum of 55. Table 2 shows the number of times a range of wins occurred in the 1000 runs.

Table 2: Range of Simulated Total Season Wins

At this point we have some sense of how many games the Denver Nuggets might win for the season. In more than 90 percent of the runs Denver wins at least 43 games. Is this enough to make the playoffs? In the previous year it took 44 wins to make the playoffs in the Western Conference while 38 would have been enough in the East. To better assess Denver's chances for the playoffs we may need to know how the number of Denver wins compares to that of the competing teams.

2.3. Modeling the Top Performer from Four Teams

At the time of this exercise, Denver was in eighth place in the West, the final playoff spot. Teams in spots 9, 10, and 11 (Seattle, Utah and Portland) could be regarded as threats to Denver's position. Our next step was to run similar simulations for these three teams and compare the number of wins (Simulation File(2); note: when running simulations only one spreadsheet should be open at a time). Running a simulation for a new team merely requires entering a new schedule (a lookup function will check records and recalculate probabilities). However we need to take into account instances in which these teams play one another. For example Denver plays Utah at Utah. Although the probabilities that Denver wins and that Utah wins add to one, we don't want to use two different random numbers to simulate the outcome of a single game. Here we choose to randomly generate outcomes for home games and use the results to determine those of the road games. For example, for the Denver at Utah game, the outcome on the Denver sheet is linked to the outcome on the Utah sheet. If a "lose" shows up for Utah, a "win" will be entered for Denver and vice versa. (See the lower highlighted cell in Figure 1.) The spreadsheet(3) contains simulation sheets for each of the four teams. Each sheet duplicates the basic structure of the Denver sheet described above.

(2) http://ite.pubs.informs.org/Vol5No1/AmmarWright/fourteams.xls
(3) http://ite.pubs.informs.org/Vol5No1/AmmarWright/fourteams.xls

When the season for all four teams is simulated, an IF statement is used to record a "yes" if Denver's win total equals the maximum of the four win totals and a "no" otherwise (ignoring ties). As shown in Figure 2, a Data Table is used to perform 1000 replications and a COUNTIF statement is used to count the percent of "yes" results.

Figure 2: First of Four

In one simulation of 1000 runs we observed that Denver's wins exceeded the number of wins of the other three teams 98% of the time. Our confidence that Denver will make the playoffs has increased.
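For readers who want to see the logic of Sections 2.1 to 2.3 outside a spreadsheet, the following is a minimal Python sketch, not the authors' Excel model. It assumes equations (1) and (2) as stated above, uses the .72, .62, .38 and .67 figures and Denver's 31 wins from the text, and treats every other record, every other win total, and the short game list as hypothetical placeholders. Each game is decided by a single random draw from the home team's point of view, so the road team's result is linked to it rather than drawn separately.

```python
import random

LEAGUE_HOME = 0.62  # league-average home winning proportion (from the text)
LEAGUE_ROAD = 0.38  # league-average road winning proportion (from the text)


def home_win_prob(h_per, r_per, league_road=LEAGUE_ROAD):
    """Equation (2): estimated probability that the home team wins."""
    return h_per - (r_per - league_road)


def road_win_prob(r_per, h_per, league_home=LEAGUE_HOME):
    """Equation (1): estimated probability that the road team wins."""
    return r_per - (h_per - league_home)


# Denver's home figure and the New York and Indiana road figures are from
# the text; every other number here is a hypothetical placeholder.
RECORDS = {
    "Denver": {"home": 0.72, "road": 0.44},
    "Seattle": {"home": 0.60, "road": 0.40},
    "Utah": {"home": 0.65, "road": 0.35},
    "Portland": {"home": 0.62, "road": 0.36},
    "New York": {"home": 0.50, "road": 0.38},
    "Indiana": {"home": 0.80, "road": 0.67},
}
TRACKED = ("Denver", "Seattle", "Utah", "Portland")
# Denver's 31 wins are from the text; the other totals are hypothetical.
CURRENT_WINS = {"Denver": 31, "Seattle": 27, "Utah": 27, "Portland": 26}
# Hypothetical remaining games, written as (home team, road team).
GAMES = [
    ("Denver", "New York"), ("Denver", "Indiana"), ("Utah", "Denver"),
    ("Denver", "Seattle"), ("Portland", "Utah"), ("Seattle", "Portland"),
]


def simulate_once(rng):
    """Simulate every remaining game once and return season win totals."""
    wins = dict(CURRENT_WINS)
    for home, road in GAMES:
        p_home = home_win_prob(RECORDS[home]["home"], RECORDS[road]["road"])
        # One draw per game: if the home team does not win, the road team does.
        winner = home if rng.random() < p_home else road
        if winner in wins:
            wins[winner] += 1
    return wins


def first_of_four_share(runs=1000, seed=0):
    """Share of runs in which Denver's total equals the maximum (ties count)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        wins = simulate_once(rng)
        if wins["Denver"] == max(wins.values()):
            hits += 1
    return hits / runs


if __name__ == "__main__":
    print("Share of runs with Denver on top:", first_of_four_share())
```

The loop in `first_of_four_share` plays the role of the Data Table, and the final comparison plays the role of the IF and COUNTIF statements. With a real schedule in `GAMES` and real records in `RECORDS`, the same structure would extend to all 27 remaining games.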

2.4. Modeling the Top Three Performers of Seven Teams

Can we be even more confident? Currently Denver is actually tied with Memphis and Houston for the 6th, 7th, and 8th places. Hence there is a chance that one of these teams might falter, improving Denver's odds. Our third simulation includes the 6th through 12th place teams and attempts to determine whether Denver would place in the top three of the last seven teams(4). We are assuming the current top five teams will be in the playoffs and only three spots are yet to be determined.

We can count the wins for each of the seven teams as we did before in the four teams simulation. In seventeams.xls(5) the sheets for the individual teams (other than Denver) are hidden to simplify the reader's interaction with the spreadsheet. The sheets are not protected and can be easily unhidden to show a structure identical to those in the four teams' sheets. Once the seven team performance is simulated the RANK function is used to determine the rank of Denver within these seven teams. A Data Table is then used to simulate 1000 replications with the rank of Denver as the table output. Figure 3 includes these seven team calculations.

(4) http://ite.pubs.informs.org/Vol5No1/AmmarWright/seventeams.xls
(5) http://ite.pubs.informs.org/Vol5No1/AmmarWright/seventeams.xls

Figure 3: Top Three of Seven

Also, Table 3 shows the summary of the simulated results. Since the top three teams will qualify for the playoffs (along with the assumed first five) Denver misses the playoffs only 1% of the time (actually 9 out of 1000). By all three models Denver's and Carmelo Anthony's chances of making the playoffs seem very high.

Table 3: Denver's Ranks as a Percent of 1000 Runs

3. The Other Guy

Syracuse and Denver fans must admit that there is a second super rookie this year by the name of LeBron James. What are his chances of leading the Cleveland Cavaliers to the playoffs? At the time of the exercise the Cavaliers were in 11th place in the East but talking confidently of rising to the 8th and final spot. The 8th through 11th spots were held by Boston, Miami, Philadelphia and Cleveland, in that order. A new spreadsheet was created for these four teams and a thousand endings to the seasons were simulated. It was assumed that only the top team would make the playoffs from these four teams (the seven higher teams are substantially above Boston, the current 8th and last qualifier). Table 4 contains the percent of trials (out of 1000) in which each of the four teams gained that last spot.

Table 4: Percent of Trials each Team Made the Playoffs

As modeled, Cleveland actually has very little chance of making the playoffs. Also the current 8th place team,
Boston, does not have the best shot. Miami is the favorite to gain the eighth position. Some of the students observe that this is consistent with Boston's poor play after all their recent trades and the resignation of their coach. Of course, our model knows nothing of that. What it does know, as summarized in Table 5, is that Miami has a better home record than the other teams and plays more home games down the stretch than the other three teams.

Table 5: Home Game Advantages

The students' observations however allow for an appropriate discussion about what our models do and do not take into account. If all teams play the remaining games at the level they played the first part of the season our results are likely to be a very fair representation. However, this analysis was performed just before the deadline for teams to make trades. As the students point out, the Boston team playing after the all-star break is a very different one from the one that played the first half of the season. This is also the case for other teams including Cleveland. Our model ignores all this and assumes the teams will play in a manner consistent with the way they played the first fifty some games. Obviously if the probabilities are based on historical data and the future is sharply different from the past then the results are less reliable. We could attempt to make the results better by reducing the probabilities that Boston will win games (at least based on their last 10 games) and we could increase the probabilities that Cleveland will win based on the improved team and perhaps the maturing of LeBron James. Unfortunately these changed probabilities might represent biased preferences.

What we can do is ascertain how much better Cleveland will have to play to have a reasonable chance at the playoffs. To do this we increase the probability of winning both home and road games until Cleveland makes the playoffs more than half the time. These incremental results are shown in Table 6. As the table shows, Cleveland's performance has to improve to well above the league average in order to have a reasonable chance. Of course a 4% chance is still a chance, so no fan should give up yet. And that too is a lesson from simulations. A lot of things can happen.

Table 6: Cleveland's Chances

4. Comparison of Model Prediction with Actual Outcomes

One of the many values of simulating sporting events is that we can compare our model predictions to actual outcomes within a relatively brief period of time. In our case Denver began losing some key games (including a game lost due to acknowledged referee error). Students started wondering about the validity of our model. This created the opportunity to discuss the nature of probability and the extent to which a simulation gives a range of possible outcomes any of which could occur (and others as well). Our simulation did include events in which Denver did not make it. We began periodically entering the actual results to date, recalculating probabilities, and re-simulating the rest of the season. The first recalculation dropped the percent of times Denver qualified to around 70%. As Denver started losing more games than predicted (on average), the recalculated, lower probabilities of winning produced lower likelihoods of making the playoffs. Figure 4 contains a chart showing those changing odds over the last part of the season.

Figure 4: Predicted probabilities that Denver makes the playoffs as the season progresses.
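The updating step described in Section 4 (entering results to date and recalculating the winning probabilities) can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the authors' spreadsheet: the win and game counts are hypothetical, chosen only so that the starting home proportion equals the .72 quoted earlier, while the .38 road figures for New York and for the league are taken from the text.

```python
def updated_proportion(wins, games, new_results):
    """Winning proportion after appending new results (True means a win)."""
    wins += sum(1 for result in new_results if result)
    games += len(new_results)
    return wins / games


def home_win_prob(h_per, r_per, league_road):
    """Equation (2) from Section 2.1."""
    return h_per - (r_per - league_road)


# Hypothetical counts: 18 wins in 25 home games gives the .72 in the text.
before = updated_proportion(18, 25, [])
# Hypothetical new home results: two losses followed by one win.
after = updated_proportion(18, 25, [False, False, True])

# Probability of beating an average road team (.38 road, .38 league road).
p_before = home_win_prob(before, 0.38, 0.38)
p_after = home_win_prob(after, 0.38, 0.38)
print(round(p_before, 3), round(p_after, 3))
```

Feeding the lowered proportion back into the per-game probabilities and rerunning the replications is what produces a falling playoff likelihood of the kind charted in Figure 4.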

Eventually the regular season ended. To our relief Denver did make the playoffs, Cleveland did not, and Miami did, all as predicted. As we were prepared to argue (if Denver didn't make it), no single outcome says a great deal about the validity of a simulation model. In this exercise however we were actually simulating the outcome of over 150 games (in the Western Conference alone) and then keeping track of the playoff outcome. One way to evaluate the model is to compare the number of games won by each team with the expected number of wins based on the estimated probabilities. Furthermore, rather than comparing point estimates we can look at a distribution of predicted number of wins. This distribution can be estimated by using the average probability of each team winning its remaining games as the probability for a binomial random variable with the number of games as the number of trials. Figure 5 contains a 95% confidence interval for the predicted number of wins for each team along with the actual number of wins (marker). In every case the actual number of wins falls within the estimated distribution. It is true that Denver's number of wins was below the expected value and Portland's and Utah's wins were above the expected average. However, more often than not, results are going to be either above or below the average. In reality Portland and Utah did compete with Denver to the very end of the season for that final spot.

Figure 5: Actual # of wins versus a predicted 95% confidence interval.

5. Conclusions

This exercise has proved to be a very useful experience in the classroom. It is important to note that it was designed and used only in one semester (Spring of 2004). The effectiveness or the impact on student learning has not been measured in any formal way. Nonetheless, the anecdotal evidence points to a very useful approach in introducing simulation and its various components. Students were very enthusiastic about the model and its results. The basketball example coupled with the local interest in the lead player (Anthony) contributed greatly to students' continued interest and desire to explore the model further. Beyond the basic simulation model we were able to maintain a flexible agenda for the exercise. The four teams and seven teams' analyses were a direct result of further probing initiated by the students. Also all subsequent analyses including tracking and updating Denver's chances, developing confidence intervals for the simulation results, and evaluating the Cleveland team, were instigated by students' inquiries. We were able on several occasions to demonstrate the limitations of the model as well as our ability to interpret the results. Finally, where students were inclined to explain the outcome using factors not included in the model we were able to clearly point that out.

Overall, this exercise allows the instructor to reinforce important aspects of simulation modeling and modeling in general. Specifically, issues related to probabilistic modeling, validity of simulation models, and what-if analyses can all be addressed with this example.

In most simulation models the assumed probability distributions are estimated based on historical data. Whether the future is in fact well represented by this historical data is always a concern. This exercise allows students to fully understand this in a familiar context. The exercise also provides an opportunity to evaluate the quality of these estimated probabilities after a significant amount of actual data becomes available. The model can be used to reinforce the fact that the probabilities used in defining simulation models can have considerable impact on the results and validity of the model. As the probabilities of winning changed over the season, the likelihood of Denver making the playoffs changed considerably.

There is also room for meaningful student discussion about the extent to which these models fail to describe the real world exactly but still give useful information. For example, we could try to improve the calculation of probabilities to include the impact of a team's schedule in determining their record for the first part of the season. Did some teams play a weaker early schedule? If we attempted to use the existing data to ascertain this would we be using smaller and smaller
samples to determine our probabilities and would we
face diminishing returns for our efforts? Would it make
any difference if we included all 29 teams in our
model?
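Returning to the validation check in Section 4, the binomial interval behind Figure 5 can be sketched in a few lines of Python. This is one possible reading of the description there, not the authors' calculation: it takes the average of a team's per-game win probabilities as the binomial probability and returns a central interval covering at least 95% of the distribution. The probabilities in the example are hypothetical, and the function name `binomial_interval` is ours.

```python
from math import comb


def binomial_interval(n, p, level=0.95):
    """Central interval [lo, hi] of Binomial(n, p) covering at least `level`."""
    tail = (1 - level) / 2
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

    # Smallest count whose lower cumulative probability exceeds the tail.
    lo, cum = 0, 0.0
    for k in range(n + 1):
        cum += pmf[k]
        if cum > tail:
            lo = k
            break

    # Largest count whose upper cumulative probability exceeds the tail.
    hi, cum = n, 0.0
    for k in range(n, -1, -1):
        cum += pmf[k]
        if cum > tail:
            hi = k
            break
    return lo, hi


# Hypothetical per-game win probabilities for one team's remaining games.
probs = [0.72, 0.43, 0.40, 0.65, 0.35, 0.58, 0.47, 0.61]
p_avg = sum(probs) / len(probs)
lo, hi = binomial_interval(len(probs), p_avg)
print("predicted remaining wins between", lo, "and", hi)
```

Adding a team's actual wins to date to both ends of the interval gives a predicted range for the season total, which can then be compared with the actual final count.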

What-if analysis is an important part of any modeling
exercise. We were able to demonstrate the usefulness
of varying the probabilities used in the model to see
if any reasonable variation in the probabilities really
gave Cleveland a good chance of making the playoffs.
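A what-if sweep of this kind can be sketched in Python as follows. The sketch is a simplified stand-in for the spreadsheet analysis: instead of ranking Cleveland against its three competitors, it asks how often the team reaches a fixed win target, here the 38 wins that Section 2.2 notes would have been enough in the East the previous year. The current win total and the per-game probabilities are hypothetical. Every bump is evaluated with the same seed, so the same random draws are reused and differences between rows reflect only the change in probabilities.

```python
import random


def chance_of_reaching(target, current_wins, probs, bump, runs=1000, seed=0):
    """Share of runs in which the season total reaches `target` wins."""
    rng = random.Random(seed)  # same seed for every bump: common random numbers
    hits = 0
    for _ in range(runs):
        simulated = sum(rng.random() < p + bump for p in probs)
        if current_wins + simulated >= target:
            hits += 1
    return hits / runs


# Hypothetical inputs: 27 remaining games alternating home (.45) and road (.30)
# win probabilities, and a hypothetical total of 22 wins so far.
probs = [0.45, 0.30] * 13 + [0.45]
bumps = (0.0, 0.05, 0.10, 0.15, 0.20)
results = [chance_of_reaching(38, 22, probs, bump) for bump in bumps]
for bump, share in zip(bumps, results):
    print(f"bump {bump:.2f}: reaches 38 wins in {share:.1%} of runs")
```

Reading down the output shows how large an across-the-board improvement is needed before the chance passes one half, which is the question Table 6 answers for Cleveland.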

With this exercise students can see the value of Monte-Carlo
simulation in a way that is fully transparent
and in a context they understand. At the same time,
and just as important, they get to experience using
data tables and other useful Excel functions. Some
students will get excited about the basketball results,
some about the power of simulations, and some about
what they can do with Excel. Hopefully we have im-
proved the chances that some students will get excited
about something.

References
Ammar, A. and Wright, R. (2001), "What Chance Does the USA Have of Going to the World Cup?: An Example of Spreadsheet Monte-Carlo Simulation using Visual Basic," Proceedings of Decision Sciences Institute National Meeting.
Evans, J. (2000), "Spreadsheets as a Tool for Teaching Simulations," INFORMS Transactions on Education, http://ite.pubs.informs.org/Vol1No1/Evans/index.php
Evans, J. and Olson, D. (2002), Introduction to Simulation and Risk Analysis, 2nd Edition, Prentice Hall, New Jersey.
Lock, R. (1997), "NFL Scores and Point Spreads," Journal of Statistics Education, Vol. 5.
Nettleton, D. (1998), "Investigating Home Court Advantage," Journal of Statistics Education, Vol. 6.
