The Gambler's Fallacy Prevails in Lottery Play - Brian Dillon

The Gambler’s Fallacy Prevails in Lottery Play∗†

                     Brian Dillon∗‡                   Travis J. Lybbert‡

                                          July 19, 2021

                                             Abstract
      We use large natural experiments in Haiti and Denmark to test recent theoretical pre-
      dictions about how agents react to random events. Using player-level administrative
      data from lotteries, we find that the average player avoids recent winners (the gam-
      bler’s fallacy). A small share of players exhibit the hot hand fallacy, and bet recent
      winners. We find no evidence of ‘streak switching,’ in which beliefs switch from the
      gambler’s fallacy to the hot hand fallacy as winning streaks grow. The consistency of
      findings across these different settings suggests that the cognitive underpinnings of the
      gambler’s fallacy are deeply rooted in human cognition.

      Keywords: Gambler’s fallacy; hot hand fallacy; lottery; law of small numbers; Haiti;
      Denmark.
      JEL codes: D91, D84, G41.

  ∗†
     We are grateful to Hilary Wething and Ben Glasner for excellent research assistance, and to Chris
Barrett, Dan Benjamin, Supreet Kaur, Alex Rees-Jones, and Bruce Wydick for comments on an earlier
draft. Any errors are our responsibility.
  ∗‡
     Cornell University. Email: bmd28@cornell.edu.
   ‡
     University of California, Davis. Email: tlybbert@ucdavis.edu.

1     Introduction
People struggle to form correct statistical intuitions. Two common mistakes, the gambler’s
fallacy (GF) and the hot hand fallacy (HHF), demonstrate the difficulty of understanding
stochastic processes. The GF, the belief that an event drawn from an independent, identically
distributed (i.i.d.) process is less likely to be drawn immediately after it wins (the expectation
of reversals), betrays a mistaken sense that small samples should look like large samples
(Kahneman et al., 1982; Rabin, 2002). The HHF is the belief in too much serial correlation,
namely that an event is more likely to be drawn if it was drawn in the recent past (the
expectation of persistence). A large literature has documented the prevalence of these biases
and examined their consequences for choice and belief formation (see Benjamin (2019)).
       The GF and the HHF may appear to be mutually exclusive, because they suggest
opposite reactions to recent events. However, theory allows for a possible relationship be-
tween these biases (Edwards, 1961; Camerer, 1989; Rabin, 2002; Rabin and Vayanos, 2010).
Consider an agent predisposed to the GF. If they observe an outcome drawn repeatedly from
a random process, they believe it is less likely to be drawn again. If the streak continues, at
some point the agent sees this as so implausible that they believe the data generating process
(DGP) must favor that outcome. They now expect persistence. Such “streak switching” is
one case in a broader model of belief formation developed by Rabin and Vayanos (2010).
Streak switching requires uncertainty about the DGP.
       Streak switching has largely been motivated by applications in finance (Rabin and
Vayanos, 2010; Suetens, Galbo-Jørgensen and Tyran, 2016). Yet its central features—an
inclination to expect reversals, combined with a tendency to over-interpret the signal when
reversals do not happen—align with anecdotal evidence from many settings. Strange weather
events can be dismissed as anomalies, but frequent aberrations tend to make people more
concerned about climate change (Deeg et al., 2019). Increased media coverage of police
killings of black men has coincided with a rise in the share of whites who believe that black
people in the US face “a lot of discrimination” (Tesler, 2020), possibly because attributing
violence to rogue officers is easier after news of a single event (a short streak) than after
repeated coverage of many such events (a long streak). Unjustified optimism about online
dating can lead individuals to dismiss initial negative experiences, but a string of bad
experiences may lead the same users to lower their expectations so far that they give up on dating
services altogether (Schetzer, 2017). These are complex scenarios, consistent with various
models of belief formation. Yet, they highlight the difficulty of forming accurate statistical
intuitions, and suggest that streak switching may be present in many domains.
       We test for foundational behaviors related to the GF, the HHF, and streak switching
in large administrative data sets from a mobile phone-based lottery in Haiti and an online
lottery in Denmark. Lotteries represent massive natural experiments, with known DGPs
and repeated choices, outside the realm of finance. Our empirical analysis proceeds in three
steps, each of which tests an assumption or prediction of the Rabin and Vayanos model.
First, we use randomization inference to estimate the share of players who choose “hot”
numbers—those that were winners in the previous round—more frequently than the random
rate. This addresses the fundamental assumption in Rabin and Vayanos that the agent
has a default GF bias. Second, we test whether the gambler’s fallacy predominates when
there are no streaks or short streaks. Third, we use a semi-parametric approach to test for
streak switching, i.e., to allow for heterogeneous reactions to winning streaks of different
lengths. Leveraging the large scale and individually identifiable nature of the data, all of our
regressions control for player, number, and round fixed effects.
       In Haiti, we find that 6.3% of players choose recent winners more often than suggested
by chance. In Denmark, the share of these hot-hand types is higher, at 15.7%. However,
in both settings the average player reduces betting on recent winners, indicating that the
dominant average tendency is to succumb to the gambler’s fallacy. In Haiti, betting on a
number falls by over a third after a recent win. The largest magnitude effect is for numbers
that won two rounds ago, not one round ago, because the immediate deterrent effect of
winning is attenuated by the bets of the hot-hand types. In Denmark, the average player also
avoids recent winners. The effect is statistically significant but much smaller in magnitude
than in Haiti, equivalent to a reduction of less than one percent of the mean bet. This could
be due to differences in the relative sizes of player choice sets in the two lotteries (see Section
3.4). In both countries, the deterrent effect of a win decays gradually with time, but persists
for several weeks.

We find no evidence in either country of a switch to HHF betting after longer streaks.
In one of the main specifications for Haiti we find the opposite, namely that the deterrent
effect of a streak increases monotonically in streak length. These findings are consistent with
the predictions of Rabin and Vayanos (2010) when the DGP is known to be i.i.d. Popular
narratives around the lottery suggest that many players believe lottery odds are mutable and
not i.i.d. (see Bhatia (2010) and Section 2), which would satisfy the conditions for possible
streak switching. While these narratives surely reflect the beliefs of some lottery players,
they are not borne out in the average response to wins. The average player seems to believe
that draws are i.i.d., but that small and large samples should look similar (hence the default
GF bias) (Rabin, 2002).
        These findings contribute to the literature in three ways. First, we advance a long line
of work on belief formation that has used aggregate lottery or casino data to understand how
players react en masse to winning streaks (Clotfelter and Cook, 1993; Terrell, 1994; Scoggins,
1995; Papachristou and Karamanis, 1998; Farrell et al., 2000; Roger and Broihanne, 2007).
With aggregate lottery data, it is impossible to distinguish within-player changes in betting
from changes in the composition of the player pool. Some recent work has used individually
identifiable lottery data to examine the role of number preferences and recent wins in shaping
choice (Suetens and Tyran, 2012; Lien and Yuan, 2015; Wang et al., 2016; Suetens, Galbo-
Jørgensen and Tyran, 2016). Only one of these studies, Suetens, Galbo-Jørgensen and Tyran
(2016), is concerned with streak switching. Suetens, Galbo-Jørgensen and Tyran (2016) find
suggestive evidence of streak switching in lottery data from Denmark; in Section 5.1 we
explain why we arrive at a different conclusion using the same Denmark data set.1
        Second, we present concrete evidence of two distinct player types in lottery play.
The typical player in both countries is susceptible to the GF, yet a small but non-negligible
share of players subscribe to the HHF. Future work on belief formation would benefit from
incorporating these two types and modeling how their relative shares in the population
influence aggregate outcomes in settings of choice under uncertainty.2
    1
      A more detailed comparison with Suetens, Galbo-Jørgensen and Tyran (2016), hereafter SGT, is in Appendix C. In Section 3 we explain why the Haiti data set,
which is novel to this paper, is somewhat more conducive than the Denmark data to testing the predictions
of Rabin and Vayanos (2010).
    2
      Throughout the analysis we ignore the possibility of streak selection bias identified by Miller and Sanjurjo
(2018). Their central insight is that when streaks are defined as consecutive successes from a binary process,
the conditional probability that a streak continues is smaller than the unconditional probability of success.
This bias can lead to incorrect rejection of the HHF in some settings. There are two reasons why streak
selection bias is not relevant for our analysis. The first and foremost is that we do not define streaks as
consecutive successes. Second, even if we ignore that fact, the lottery DGP represents many simultaneous
Bernoulli trials from a single process. Players effectively observe multiple sequences from the “same coin.”
Miller and Sanjurjo (2018) show in their Appendix A.2 that the bias is vanishingly small when the agent
observes multiple sequences.

        Finally, our third contribution lies in the parallel analyses of lottery choices across
two very different settings. While the Haiti and Denmark games are not identical (see
Section 3.4), they share many of the same broad features. By many economic metrics, these
two countries are as different as can be. The consistency of findings across these settings
suggests that the cognitive underpinnings of the gambler’s fallacy are deeply rooted in human
cognition.

2     Conceptual Background
The model of Rabin and Vayanos (2010) explains how the HHF “might arise as a consequence
of the GF” (p. 731). The agent in the model is dogmatically predisposed to the GF, observes a
sequence of draws from a DGP, and uses Bayesian inference to form beliefs about parameters.
When the agent is uncertain about the DGP or believes that draws are serially dependent,
they expect reversals after short streaks (the GF) but persistence after long streaks (the
HHF). In contrast, if the DGP is i.i.d. and the agent is told this, “they expect reversals after
streaks of any length” (p. 751).
        Lotteries would seem to violate the conditions required for streak switching. The
DGP is fixed, and draws are i.i.d. Yet, popular narratives reveal that many players believe
that lottery odds are mutable. For many Haitian players, superstitions about the lottery
are part of a broader religious worldview in which divine intervention and fate have direct
bearing on daily life (Bhatia, 2010). In the US, there is a cottage industry of books that
peddle secrets for beating the lottery, many of which promote combinations of both GF and
HHF reasoning. Because it is beliefs about the DGP, rather than the true DGP, that are
relevant for choice, it is plausible that even in the lottery setting we might find evidence of
streak switching.
        These considerations motivate our two research questions. First, in the absence of
streaks, do lottery players demonstrate a default attachment to the GF? This addresses the
maintained assumption about player types that underlies the Rabin and Vayanos (2010)
model. Second, do superstitions or other beliefs lead players to ignore the i.i.d. nature
of lottery draws, succumb to streak switching, and bet “hot” numbers after long streaks?
Evidence of streak switching in our lottery contexts would imply that the average player
does not believe the lottery is i.i.d.

3        Lottery Details and Data
In this section we describe the Haiti and Denmark data sets, analyze the rates of hot-
hand play, discuss the differences in the two lotteries, and explain how we define the main
dependent variables.

3.1        The Boloto Mobile Phone Lottery in Haiti

The lottery is part of the rhythm of daily life in Haiti. Millions of Haitians play frequently,
and some of the working poor routinely wager a large share of their daily income (Bernstein,
2015). Players often select numbers based on superstitions and dreams. Concordances known
as the Tchala, which are available at every lottery stall and online,3 translate elements of
dreams into numbers. While most games are administered by physical lotto stalls, digital
lotteries played on mobile phones have gained popularity in recent years, particularly among
younger Haitians in urban and peri-urban areas.
           We study a mobile phone lottery game called Boloto. Like all Haitian lottery games,
Boloto is played twice each day, corresponding to the midday and evening numbers drawn
in the New York Lottery (to ensure transparency). To participate, a player places a bet
consisting of three two-digit number pairs (00-99) in a specified order. The cost of each bet
is 25 Haitian gourdes (HTG), or about 0.60 USD in 2012. There is no limit to the number
of bets a single player can make in each round. To bet more money on a set of numbers, a
player simply places additional bets. The payout for Boloto is a function of which number
matches the draw.4
    3
        For an online version of the Tchala see http://lisa.ht/tchala/ (Accessed 21 November 2018).

[Figure 1. Panel A: Number of bets by round, Haiti (x-axis: Date; y-axis: Number of bets). Panel B: Histogram of numbers played, Haiti (x-axis: Number played; y-axis: Density; labeled bars: 10: 4.7%, 0: 2.7%, 11: 2.5%, 33: 2.4%, 13: 2.1%).]

                                         Figure 1: Betting patterns in the Haiti Boloto
Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti. Panel A shows the number of
bets placed over 730 rounds of play (twice per day, for a year) in Haiti. Each bet in Haiti consists of three numbers from the set
{0, 1, ..., 99}; Panel B shows the histogram of numbers played.

                      The private firm that conducts the Boloto provided us with data for the universe of
bets placed from February 1, 2012 to January 31, 2013.5 For each bet we observe the player
ID, the date of the game, an indicator for midday/evening round, the ordered set of three
two-digit number selections, the time and date of the bet, and the winning numbers. Player
IDs are unique codes linked to mobile phone accounts. Across the 730 rounds (2 per day, for
a year), a total of 4,505,519 bets were placed, representing over 13.5 million number choices.
We observe bets from 112,808 different players. The average player makes 39.9 bets, in 12.7
different rounds, in 2 different months, on 9.1 different days.
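The summary figures above can be cross-checked with simple arithmetic. The sketch below uses only the totals reported in the text; the variable names are illustrative.

```python
# Totals reported for the Boloto data (February 2012 to January 2013).
TOTAL_BETS = 4_505_519
N_PLAYERS = 112_808
NUMBERS_PER_BET = 3  # each bet is an ordered triple of two-digit numbers

# Every bet contributes three number choices.
number_choices = TOTAL_BETS * NUMBERS_PER_BET

# Average number of bets per player over the year.
bets_per_player = TOTAL_BETS / N_PLAYERS

print(number_choices)             # 13516557, i.e. over 13.5 million
print(round(bets_per_player, 1))  # 39.9
```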
                      Panel A of Figure 1 shows the number of bets placed in Boloto, by round.6 A weekly
cycle of activity is clearly visible. Increased activity during the months July–October might
be associated with summer visits from Haitians living abroad, or with positive income shocks
from the September-October harvest. Panel B of Figure 1 shows the histogram of numbers
played. The most popular choice, 10, represents 4.7% of all plays. The four next most
popular numbers are 00, 11, 33, and 13, all of which are played at more than twice the
random rate.
    4
      Winning in the first, second, or third position pays out 250 HTG (10x), 100 HTG (4x) or 50 HTG (2x),
respectively. Picking all three winners, out of order, pays 100,000 HTG (4,000x). Picking all three winners
in order wins the jackpot, which pays out 2,000,000 HTG (80,000x). Payouts are independent across players,
except if there are multiple jackpots, in which case the winners split the payout. Jackpot splitting is never
observed (there are only a few jackpots in the data, and none are shared). We assume that the possibility
of splitting a jackpot does not shape individual number choices.
    5
      There are 366 days in that range, because 2012 was a leap year, but only 365 days with betting (there
is no data for Christmas Eve).
    6
      The spike on October 17 coincides with Dessalines Day, a national holiday commemorating the
assassination of Haiti’s founder.

3.2     The System Lotto Online Lottery in Denmark

System Lotto is a weekly, online lottery game in Denmark.7 In this game, seven winning
numbers are drawn each week, without replacement, from the positive integers 1, . . . , 36.8
There are five categories of winnings corresponding to choosing different shares of the drawn
numbers. The jackpot, which is won by picking all seven numbers, pays out 11.25% of total
ticket revenues. Unclaimed jackpots are rolled over to the next round. Payouts in System
Lotto are pari-mutuel (divided among all winners in each category).
        To participate in System Lotto, players select between 8 and 31 numbers from the set
{1, 2, . . . , 36}. The online system then randomly selects 7 of those numbers to be the player’s
bet. Each bet costs 3 Danish kroner (DKK), or about 0.48 USD at the time. Players can
increase the wager on a number by purchasing more tickets and/or choosing fewer numbers
per bet. They can also select more than the maximum 31 numbers per round by purchasing
multiple tickets and varying their number choices across tickets.
        In administrative data from 28 weeks of System Lotto play in 2005, provided by
SGT, we observe at least one bet by 25,807 players. The data set includes a unique ID for
each player.9 The average player participates in 11.2 of the 28 observed weeks, makes 33.1
separate ticket purchases per round, and chooses 13.5 different numbers per round (ranging
from 8 to 36).
        Panel A of Figure 2 shows the number of bets per round in System Lotto. The two
spikes in play correspond to periods with consecutive jackpot rollovers and hence higher
potential winnings. Panel B of Figure 2 shows the histogram of numbers played. The most
popular choice, 7, represents 3.2% of all plays. Other popular numbers are 13, 1, and 19.
There is much less variation in the relative popularity of numbers in Denmark than in Haiti
(comparing Panel B of Figure 1 with Panel B of Figure 2), possibly because players in Denmark
must choose at least 8 of 36 possible options each round (about 22%), while players in Haiti can
choose as few as 1 of 100 possible options per round (1%).
   7
     All of the details we provide about the System Lotto game structure are drawn from SGT. See their
paper for more details about the game and about lotteries in Denmark.
   8
     Although numbers are drawn without replacement, the lottery is i.i.d. from the player’s perspective,
because no choices are made between the draws of a single round.
   9
     The data is available, for the purpose of replicating SGT, from the website of the Journal of the European
Economic Association.

[Figure 2. Panel A: Number of bets by round, Denmark (x-axis: Date; y-axis: Number of bets). Panel B: Histogram of numbers played, Denmark (x-axis: Number played; y-axis: Density; labeled bars: 7: 3.2%, 1: 3.1%, 13: 3.1%, 19: 3.0%).]

                                        Figure 2: Betting patterns in the Denmark System Lotto
Notes: Authors’ calculations from individually identifiable, administrative lottery data from Denmark. Panel A shows the
number of bets per round across 28 weekly rounds in Denmark. Each bet in Denmark consists of 8-31 numbers from the set
{1, ..., 36}; Panel B shows the histogram of numbers played.

3.3                     Analyzing rates of hot hand play

We use randomization inference to test the null hypothesis that each player chooses hot
numbers—those that were winners in the previous round—at a rate less than or equal to
that which would be consistent with randomly picking numbers. We interpret a rejection
of this hypothesis as an indication that the player is to some extent susceptible to the hot
hand fallacy, or has some other preference for hot numbers unrelated to beliefs about their
winning probability. See Appendix D for details on the implementation of these tests.
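As a rough illustration of this kind of test, the sketch below computes a one-sided randomization p-value for a single hypothetical player. It is not the procedure in Appendix D, which may differ. It assumes that under the null each pick is an independent uniform draw from 00-99 and that three numbers are hot in each round. The function name `hot_hand_p_value` and the toy data are invented for illustration.

```python
import random

def hot_hand_p_value(picks_by_round, hot_by_round, n_numbers=100, n_sims=2000, seed=0):
    """One-sided randomization p-value for H0: hot-number rate <= random rate.

    picks_by_round: list of lists, the numbers the player chose in each round.
    hot_by_round:   list of sets, the winning numbers from the preceding round.
    Assumption: under the null, each pick is an independent uniform draw.
    """
    rng = random.Random(seed)
    rounds = list(zip(picks_by_round, hot_by_round))
    # Observed number of picks that landed on a hot number.
    observed = sum(n in hot for picks, hot in rounds for n in picks)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Replace every pick with a uniform random number, keep the same hot sets.
        simulated = sum(
            rng.randrange(n_numbers) in hot
            for picks, hot in rounds
            for _pick in picks
        )
        if simulated >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return (at_least_as_extreme + 1) / (n_sims + 1)

# Toy data: four rounds, three hot numbers per round, three picks per round.
hot = [{5, 50, 55}, {0, 10, 33}, {7, 21, 90}, {12, 48, 63}]
chaser = [[5, 50, 1], [10, 0, 2], [7, 3, 90], [12, 4, 63]]   # 8 of 12 picks are hot
avoider = [[1, 2, 3], [4, 5, 6], [8, 9, 11], [13, 14, 15]]   # 0 of 12 picks are hot

p_chaser = hot_hand_p_value(chaser, hot)
p_avoider = hot_hand_p_value(avoider, hot)
```

In this toy example the player who repeatedly picks hot numbers gets a very small p-value, so the null is rejected for that player. The player who never picks a hot number gets a p-value of 1, because every simulated count is at least zero.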
                       Panel A of Figure 3 shows a player-level histogram of p-values from a test of the
null hypothesis that the player chooses hot numbers at a rate less than or equal to random.
While nearly two thirds (63.8%) of players in Haiti never make a hot hand bet (not shown),
6.3% choose hot numbers with a frequency that is significantly greater than the random rate
(95% confidence). The rate of hot hand play is roughly constant across the year, although
there are occasional spikes. In the round after the winning numbers were 5-50-55, nearly 40%
of bets included a hot number. The next five rounds with the highest shares of hot hand
betting occur after a win by 0 or 10, two of the most popular number choices.

[Figure 3. Two histograms of player-level p-values (y-axis: Density; x-axis: "p-values, null hypothesis is that hot hand play rate", label truncated). Both panels show p-values for players with hot hand play rate > random. Left panel: among this group, 25.5% significantly > random (95% confidence); among all players, 6.3% significantly > random (95% confidence). Right panel: among this group, 32.1% significantly > random (95% confidence); among all players, 15.7% significantly > random (95% confidence).]
choices, if their beliefs about other players’ choices affect their own selections. Second, the
online system in Denmark chooses seven of each player’s selected numbers, at random, to
be their actual bet. Although this intermediate selection is unaffected by recent wins, we do
not know what players believe about the subselection process. This step is like an individual
mini-lottery within each round, which may itself be subject to biased beliefs. Third, System
Lotto has less coverage than Boloto. The time period for the Haiti data is nearly twice
as long, and involves 730 rounds of play compared to 28 in Denmark. There are almost
five times as many unique players in Haiti as in Denmark (the population of Haiti was
roughly twice that of Denmark during the periods that generated the data). Finally, the
probability of a number winning is much higher in System Lotto than in Boloto. To facilitate
comparison with SGT we will treat the previous six rounds as the period of recent history
for assessing GF, HHF, and streak switching in Denmark (see below). With seven unique
numbers winning each week, in expectation there will be 26.16 (out of 36) numbers that are
winners over any six consecutive rounds in Denmark. With so many “streaking” numbers,
it is difficult for players to react similarly to all of them (and we see below that they in fact
do not).
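The 26.16 figure follows from a short calculation, sketched here under the assumption that each weekly draw is an independent, uniformly random selection of 7 of the 36 numbers. A given number misses any one draw with probability 29/36, so it misses six draws in a row with probability (29/36)^6. The expected count of numbers that win at least once is then 36 × (1 − (29/36)^6). The variable names are illustrative.

```python
from fractions import Fraction

N_NUMBERS = 36   # numbers in the System Lotto pool
N_DRAWN = 7      # winners drawn per weekly round
N_ROUNDS = 6     # look-back window used for Denmark

# Probability that one given number is not drawn in a single round.
p_miss_one_round = Fraction(N_NUMBERS - N_DRAWN, N_NUMBERS)

# Rounds are independent, so the probability of missing all six is the product.
p_miss_all = p_miss_one_round ** N_ROUNDS

# Linearity of expectation over the 36 numbers.
expected_winners = float(N_NUMBERS * (1 - p_miss_all))
print(round(expected_winners, 2))  # 26.16
```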
       Despite these differences, an important aspect of our study is that we are able to
implement similar tests in individually identifiable lottery data from two countries that
differ in many respects. In 2019, Denmark ranked 11th in terms of the Human Development
Index, while Haiti ranked 169th. Any commonalities that emerge between these two settings
would seem to be suggestive of deeply engrained patterns in human cognition and statistical
perception, rather than purely environmental factors.

3.5    Construction of Dependent Variables

Our analysis examines the relationship between a number’s winning history and the prob-
ability that it is bet. To construct an outcome variable for the Haitian Boloto, we first
represent each number selection as 100 separate choices: 1 decision to play a number, and 99
decisions not to play the others. This allows us to take advantage of both player and number
fixed effects. Let $d_{ijnrp}$ be a dummy variable equal to 1 if player i in bet j plays number
n in round r in position p, and 0 otherwise. The position p refers to the first, second, or
third number in the bet. The numbers n lie in the set {00, 01, 02, . . . , 99}. The round, r,
includes both the date and the time (midday or evening) of the game. The bet indicator, j,
captures the possibility that a player places multiple bets per round. We use $J_{ir}$ to denote
the number of bets placed by player i in round r. In our analysis the dependent variable is

$$Played_{inr} = \sum_{j=1}^{J_{ir}} \sum_{p=1}^{3} d_{ijnrp},$$

a count of the number of times in round r that i played n
in any position and any bet.10 This is approximately proportional to the amount wagered
on the number. At the player-number-round level, the full dataset contains a little over 143
million observations. The mean value of $Played_{inr}$ in the Haiti data is 0.094.
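A minimal sketch of this construction for one hypothetical player-round follows. The bets and variable names are made up for illustration and are not drawn from the Boloto records.

```python
from collections import Counter

# Hypothetical bets by one player in one round: each bet is an ordered
# triple of two-digit numbers (positions 1, 2, 3).
bets = [(10, 33, 7), (10, 10, 45), (7, 0, 10)]

# Count how often each number appears in any position of any bet.
played = Counter(n for bet in bets for n in bet)

# Expand to all 100 candidate numbers so unplayed numbers are explicit zeros,
# giving one observation per player-number-round.
played_full = [played.get(n, 0) for n in range(100)]

print(played_full[10])   # 4: once in bet 1, twice in bet 2, once in bet 3
print(sum(played_full))  # 9: three bets times three positions
```

The reported mean of 0.094 is consistent with the summary statistics in Section 3.1. An average of 39.9 bets over 12.7 rounds is about 3.14 bets per player-round, or about 9.4 number choices spread over 100 candidate numbers.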
         To construct an outcome variable for the Danish System Lotto, we perform a similar
transformation, so that each selection of m numbers is characterized as m separate decisions
to play a number and 36 − m decisions not to play the others. We then define $Played_{inr}$ as
the effective bet placed on number n by player i in round r. This value is calculated by
multiplying the total amount wagered by player i in round r by the share of number n
in i’s round r number selections. This is equivalent to the dependent variable “Money bet”
in SGT, and is roughly comparable to the definition of the dependent variable for Haiti. The
mean value of $Played_{inr}$ in the Denmark data is 2.76.
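
As a rough illustration of the effective bet, the sketch below assumes a single selection of m numbers in which every selected number carries an equal share 1/m of the wager. The equal-share rule, the function name `effective_bets`, and the example wager are assumptions made for illustration; the text above specifies only that the wager is multiplied by the number's share in the player's selections.

```python
def effective_bets(selected, total_wager, n_numbers=36):
    """Effective bet on each of the numbers 1..n_numbers for one player-round.

    Assumes (for illustration) that each of the m selected numbers has an
    equal share 1/m of the total amount wagered.
    """
    m = len(selected)
    share = 1.0 / m
    return {
        n: (total_wager * share if n in selected else 0.0)
        for n in range(1, n_numbers + 1)
    }

# Hypothetical selection of m = 8 numbers with a total wager of 40 DKK.
bets = effective_bets({2, 5, 9, 14, 21, 27, 30, 36}, total_wager=40.0)
print(bets[5], bets[6])
```

Under this assumption the effective bets across all 36 numbers sum to the total wager, so the variable distributes the stake over the chosen numbers rather than counting selections.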

4        Empirical Approach
We use a common set of empirical specifications to separately analyze the Haiti and Denmark
data. To provide a baseline characterization of how the amount bet on a number is related
to its recent success, we first estimate OLS regressions of the following form:

$$Played_{inr} = \sum_{l=1}^{R} \beta_l \, Winner_{n,r-l} + \eta X_{inr} + \varepsilon_{inr} \qquad (1)$$

where $Played_{inr}$ is as defined in the previous section; $Winner_{n,r-l}$ is a binary variable
indicating whether n was one of the drawn numbers in round r − l; $X_{inr}$ includes fixed effects for
players, numbers, and rounds; and $\varepsilon_{inr}$ is a statistical error term. With a sufficiently large
choice of R, a plot of the $\beta_l$ coefficients will non-parametrically trace out the time path of
effects of past wins on current betting. Evidence of $\beta_l < 0$ ($\beta_l > 0$) is consistent with a GF
effect (HHF effect) that persists for l rounds. In practice, we set R to be large enough that
at the longer lags there is no evidence of an effect on betting.11
    10
     Our findings are broadly similar if we define the dependent variable as
$PlayedDummy_{inr} = \max_{j=1}^{J_{ir}} \{\max_{p=1}^{3} \{d_{ijnrp}\}\}$, which measures the extensive margin choice to play a number at the player-
number-round level. See Appendix B.

        Specification (1) does not account for the probability that a number drawn in round
r − l will be drawn again prior to round r, which is increasing in l. The effect of winning, es-
pecially winning in the distant past, may be underestimated if a number wins multiple times.
To account for this, we also estimate OLS regressions based on the following specification:

$$Played_{inr} = \sum_{l=1}^{R} \beta_l \, MostRecentWin_{n,r-l} + \eta X_{inr} + \varepsilon_{inr} \qquad (2)$$


which is identical to (1), except the key independent variable $MostRecentWin_{n,r-l}$ takes a
value of 1 only if r − l is the most recent round in which n was drawn, and 0 otherwise. Once
again, a finding of $\beta_l < 0$ ($\beta_l > 0$) is consistent with the GF (HHF).
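
The difference between the regressors in specifications (1) and (2) can be seen in a short Python sketch. The function name `lag_indicators` and the draw history are hypothetical; the sketch assumes a list `draws` in which `draws[t]` is the set of numbers drawn in round t.

```python
def lag_indicators(draws, n, r, R):
    """Build the lag regressors for number n in round r.

    Returns two lists indexed by l - 1 for l = 1..R:
      winner[l-1]      = 1 if n was drawn in round r - l           (spec. 1)
      most_recent[l-1] = 1 only for the most recent such round     (spec. 2)
    Requires r >= R so that every lagged round exists in `draws`.
    """
    winner = [1 if n in draws[r - l] else 0 for l in range(1, R + 1)]
    most_recent = [0] * R
    for l, won in enumerate(winner, start=1):
        if won:
            most_recent[l - 1] = 1  # the smallest lag with a win
            break
    return winner, most_recent

# Hypothetical history for rounds 0..4; the current round is r = 5.
draws = [{5, 12, 40}, {7, 12, 99}, {1, 2, 3}, {12, 50, 60}, {8, 9, 10}]
winner, most_recent = lag_indicators(draws, n=12, r=5, R=4)
print(winner)       # wins at lags 2 and 4
print(most_recent)  # only the lag-2 win is flagged
```

In this example the lag-4 win enters specification (1) but not specification (2), because the number was drawn again at lag 2.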
  11
     Because the Haitian Boloto is played twice per day and the Danish System Lotto is played once per
week, R is much larger for Haiti than for Denmark (i.e., it covers more rounds), but it covers a shorter time
period.

        Estimation of specifications (1) and (2) provides the average effect of lagged wins on
current betting. To test predictions about players’ reactions to streaks, we need to allow for
more complex interactions between past events. Lengthy winning streaks are rare in both
games, but particularly so in Boloto where the probability that a specific number wins in any
given round is only 0.0297. Following SGT, we define a streak as the co-occurrence of a win in
the previous round with a history of winning in other recent rounds. Formally, let $Hotness_{nr}$
be the number of times that n was a winner during rounds r − 2 to r − S, for some integer
S ≥ 2; and let $Hot_{nrc}$ be a dummy variable equal to 1 if $Hotness_{nr} = c$, and 0 otherwise. For
each r, the winning streak of number n is given by $Winner_{n,r-1} \times \{Winner_{n,r-1} + Hotness_{nr}\}$
(e.g., the streak has length 3 if n was drawn in the previous round and was drawn twice in
lagged rounds 2 through S). To semi-parametrically estimate the average response to streaks of different
lengths, we estimate OLS regressions of the following form:

$$Played_{inr} = \beta \, Winner_{n,r-1} + \sum_{c=1}^{C} \{\delta_c Hot_{nrc} + \gamma_c (Winner_{n,r-1} \times Hot_{nrc})\} + \eta X_{inr} + \varepsilon_{inr} \qquad (3)$$

where all variables are as defined above, C is sufficiently large to include all observed streaks,
and $\varepsilon_{inr}$ is a statistical error term. For Denmark we follow SGT and set S = 6, which covers
the previous six rounds (weeks). For the Boloto in Haiti, which is played more often and hence
provides more flexibility in how we define recent events, we report results for S ∈ {6, 14, 60},
equivalent to defining streaks over the previous 3, 7, and 30 days.
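
The hotness and streak definitions can be sketched in Python as follows. The function names and the draw history are hypothetical, and `draws[t]` is again assumed to hold the set of numbers drawn in round t.

```python
def hotness(draws, n, r, S):
    """Number of times n was drawn in rounds r - 2 through r - S."""
    return sum(1 for l in range(2, S + 1) if n in draws[r - l])

def streak_length(draws, n, r, S):
    """Winner_{n,r-1} x (Winner_{n,r-1} + Hotness_nr); zero without a lag-1 win."""
    won_last = 1 if n in draws[r - 1] else 0
    return won_last * (won_last + hotness(draws, n, r, S))

# Hypothetical history for rounds 0..5; the current round is r = 6, with S = 6.
draws = [{12, 1, 2}, {3, 4, 5}, {7, 8, 9}, {12, 30, 31}, {7, 40, 41}, {12, 50, 51}]
print(hotness(draws, 12, 6, 6), streak_length(draws, 12, 6, 6))  # drawn at lags 1, 3, 6
print(hotness(draws, 7, 6, 6), streak_length(draws, 7, 6, 6))    # drawn at lags 2, 4 only
```

Number 12 has a streak of length 3 (a win in the previous round plus two earlier wins in the window), while number 7 is hot but has no streak because it did not win in the previous round.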
       There is an important tension in the choice of lag used to define streaks. The esti-
mated effect of past wins on betting will be attenuated if we use a recall period that extends
back beyond when wins are salient (which likely varies across players), because players will
be reacting to streaks that they perceive to be shorter than those defined by us. The baseline
findings from equations (1) and (2) will provide some indication of the persistence of any
influence of past results on number selection.
       In equation (3), the player, number, and round fixed effects account for average
differences between players, average popularity of numbers, and temporal patterns in betting.
The total effect on current betting of a streak of length 1 is given by β. The total effect of a
streak of length d > 1 is $\nu_c = \beta + \delta_c + \gamma_c$, where c = d − 1. If players are not influenced by
either the GF or the HHF, we expect $\beta = \delta_c = \gamma_c = 0$ for all c. The predictions of Rabin and
Vayanos (2010) are equivalent to (i) β < 0, (ii) $\nu_c < 0$ for all c, and (iii) $\nu_d < \nu_c$ for any d > c
(because the model predicts that longer streaks induce a larger GF effect). Alternatively, if
players exhibit streak switching (i.e., if the average player does not believe that the lottery
DGP is i.i.d.), then we expect β < 0 and $\nu_c > 0$ for all sufficiently large c.
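
To make the marginal-effect formula concrete, the following Python sketch applies it to the rounded coefficients reported in column 1 of Table 1 (Haiti, 3-day window). The function name `marginal_effects` is hypothetical. Because the inputs are rounded to three decimals, the results can differ from the published marginal effects in the last digit.

```python
def marginal_effects(beta, delta, gamma):
    """Total effect by streak length d: beta for d = 1, beta + delta_c + gamma_c for d = c + 1."""
    effects = {1: beta}
    for c in sorted(set(delta) & set(gamma)):
        effects[c + 1] = beta + delta[c] + gamma[c]
    return effects

# Rounded coefficients from Table 1, column 1, Panel A.
beta = -0.012                    # Winner
delta = {1: -0.031, 2: -0.037}   # Hot 1, Hot 2
gamma = {1: 0.020, 2: 0.012}     # Winner x Hot 1, Winner x Hot 2

nu = marginal_effects(beta, delta, gamma)
for d, effect in nu.items():
    print(d, round(effect, 3))

# Prediction (iii): the effect becomes more negative as the streak lengthens.
monotone = all(nu[d + 1] < nu[d] for d in range(1, len(nu)))
print(monotone)
```

The computed values for streaks of length 1 and 2 match Panel B of Table 1; the value for length 3 comes out at −0.037 against the reported −0.036, a gap that is presumably due to rounding of the published coefficients.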
       Estimates based on equation (3) provide average effects for streaks of a given length.
To allow for complete flexibility in the estimated response to any combination of past wins,
we also estimate a fully non-parametric model for the previous six rounds (S = 6). In
this model, the dependent variable is $Played_{inr}$, and the independent variables are dummy
variables for all observed combinations of wins during the previous 6 rounds. As always, we
include player, number, and round fixed effects.
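
One way to picture the regressors in the fully non-parametric model is as a label for the exact set of lags at which a number won. The sketch below is illustrative only: the function names and the draw history are hypothetical, and `draws[t]` is assumed to hold the set of numbers drawn in round t.

```python
def win_pattern(draws, n, r, S=6):
    """Lags l in 1..S at which number n was drawn, as a sorted tuple."""
    return tuple(l for l in range(1, S + 1) if n in draws[r - l])

def pattern_label(pattern):
    """Human-readable name for the dummy variable attached to a pattern."""
    if not pattern:
        return "No recent win"
    return "Win in lag " + ", ".join(str(l) for l in pattern)

# Hypothetical history for rounds 0..5; the current round is r = 6.
draws = [{12, 1, 2}, {3, 4, 5}, {7, 8, 9}, {12, 30, 31}, {7, 40, 41}, {12, 50, 51}]
print(pattern_label(win_pattern(draws, 12, 6)))
print(pattern_label(win_pattern(draws, 7, 6)))
print(pattern_label(win_pattern(draws, 99, 6)))
```

With S = 6 there are at most 2^6 − 1 = 63 non-empty patterns, and each pattern that is actually observed receives its own dummy variable, so no functional form is imposed on how wins at different lags combine.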
         For all models we report standard errors clustered at the player level. We do not
impose balance on the panel, taking as given the extensive margin decision to participate
in the lottery in any particular round (however, in a robustness check described below, we
partially account for game entry and exit).

5        Results
For our baseline estimates of equations (1) and (2) in Haiti, we use a set of 84 dummy
variables covering every round in the previous 6 weeks (R = 84). For Denmark, we use a set
of 10 dummy variables covering the previous 10 weeks (R = 10). Panel A of Figure 4 plots
the coefficients on the dummy variables representing a win in each lagged period, along with
95% confidence intervals, for equation (1) estimated with the Haiti data. Panel B shows the
equivalent plot for equation (2). Panels C and D provide the analogous plots for Denmark.
         In Haiti, the average effect of a number being drawn in lagged rounds 2-84 indicates
a surprisingly persistent GF response. Winning never leads to an increase in betting, on
average. Players avoid numbers that have won recently, but the effect is attenuated as the
win fades into the past. The deterrent effect of a recent win is statistically significant for
approximately 60 rounds (30 days).12 A win in lagged round 2, 3, or 4 reduces the number
of times a number is selected by 0.032–0.035, a reduction of over a third from the mean of
0.094. The effects are even larger in magnitude when we restrict attention to only the most
recent win (Panel B). The exception to the pattern of diminishing effect size in Haiti is from
a win in the immediately preceding round. The point estimate for a win in the preceding
round is approximately −0.01 in Panel A, less than a third of the magnitude of the effect of
a win in lagged rounds 2–4. This attenuation may be driven by the 6.3% of ‘hot hand types’
in Haiti (Figure 3, Panel A).
    12
     How do players keep track of numbers’ winning histories so far back? Some physical lottery stalls post
the recent winning numbers. It is also a popular service to receive the winning numbers by text message,
which provides a convenient archive of recent wins.

         For Denmark, we also see evidence in the baseline analysis of a GF effect after recent
wins (panels C and D of Figure 4). The average effect of a win in the previous 1-2 rounds
is to reduce the amount bet by 0.013-0.022 (DKK). These are much smaller effects than we
find in Haiti, representing less than one percent of the mean bet of 2.76 DKK. The deterrent
effect of a win disappears after two rounds. On average, System Lotto players do not react
systematically to wins that occurred three or more weeks previously.

[Figure 4: four panels, each plotting the estimated coefficient with 95% C.I. (vertical axis) against the lag in rounds (horizontal axis; lags 0-80 for Haiti, 0-10 for Denmark). A. Haiti, effect of any past win. B. Haiti, effect of most recent win. C. Denmark, effect of any past win. D. Denmark, effect of most recent win.]

Figure 4: Effect of prior wins on amount bet on a number
Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti and Denmark. Each panel
represents one regression. Figures show OLS coefficients with 95% confidence intervals from regressions of the amount bet at
the player-number-round level on a set of binary variables that describe the winning history of the number over the lagged
rounds displayed on the horizontal axes. All regressions include player, number, and round fixed effects, with standard errors
clustered at the player level. The Boloto game in Haiti is played twice per day (60 rounds = 30 days). The System Lotto game
in Denmark is played once per week (4 rounds = 28 days). The relative magnitudes of estimated coefficients are much larger
in Haiti, where the mean value of the dependent variable is 0.094, than they are in Denmark, where the mean of the dependent
variable is 2.762.

       Table 1 shows estimates of equation (3) for both countries. Columns 1, 2, and 3
report the findings for streaks defined over the previous 3 days, 7 days, and 30 days, using
the Haiti data. Panel A reports coefficient estimates, and Panel B reports the estimated
marginal effects (νc ). Across all streak lengths and specifications, there are no positive
marginal effects of a winning streak on the probability that a number is bet. All 13 of the
estimated effects in Panel B, columns 1–3 are negative and statistically different from zero.
In column 1, the negative effect of a win streak on the probability that a number is selected
is increasing in streak length. Betting on a number falls by 0.012 (12.6% of the mean),
0.023 (24.2%), and 0.036 (37.9%) after
streaks of length 1, 2, and 3, respectively. The finding that the deterrent effect increases
in streak length is consistent with Rabin and Vayanos (2010) when the DGP is i.i.d. (and
participants are aware of that). When we define streaks over periods of 7 or 30 days, the GF
again dominates (columns 2 and 3), but the marginal effects do not increase monotonically
in streak length. This could be due to using an overly long window for streak definition, as
discussed in Section 4. As we saw in Figure 4, the magnitude of the GF effect is smaller for
more distant wins, indicating a weakening over time of the salience of past wins.
       Column 4 of Table 1 reports estimates from specification (3) using the Danish data.
As in Haiti, all of the estimated marginal effects are negative. Only the effects of streaks of
length 1 and 2 are statistically different from zero, and the magnitude of the deterrent effect
does not increase in streak length. Effect sizes are again much smaller in Denmark than in
Haiti. In Haiti the magnitudes represent reductions in betting that are 13-38% of the mean,
while in Denmark the statistically significant effects represent reductions on the order of one
percent of the mean (specifically, about 0.8-1.2%).
Table 1: The Effects of Winning Streaks on Betting
 Dependent variable: Amount of money bet by player i on number n in round r
                                      HAITI                            DENMARK
                             Lag used to define streaks             Specification
                        3 days       7 days       30 days        Ours         Modified SGT
                        (1)          (2)          (3)            (4)          (5)
 Panel A: Estimated Coefficients
 Winner                 -0.012***    -0.013***    -0.016***      -0.033***    -0.061***
                        (0.001)      (0.001)      (0.001)        (0.010)      (0.010)
 Hot 1                  -0.031***    -0.026***    -0.022***      -0.009       -0.007*
                        (0.001)      (0.001)      (0.001)        (0.006)      (0.004)
 Hot 2                  -0.037***    -0.034***    -0.033***      -0.005       0.003
                        (0.001)      (0.001)      (0.001)        (0.007)      (0.005)
 Hot 3                  -0.044***    -0.039***    -0.040***      -0.006       -0.015**
                        (0.003)      (0.001)      (0.001)        (0.010)      (0.006)
 Hot 4                               -0.035***    -0.038***      -0.022       0.010
                                     (0.005)      (0.001)        (0.019)      (0.019)
 Hot 5                                            -0.037***
                                                  (0.003)
 Winner × Hot 1         0.020***     0.011***     0.010***       0.020**      0.051***
                        (0.001)      (0.001)      (0.001)        (0.008)      (0.009)
 Winner × Hot 2         0.012***     0.021***     0.014***       0.020**      0.053***
                        (0.003)      (0.002)      (0.001)        (0.009)      (0.010)
 Winner × Hot 3                      0.036***     0.025***       0.032**      0.019
                                     (0.004)      (0.002)        (0.016)      (0.019)
 Winner × Hot 4                                   0.023***       0.006        0.106**
                                                  (0.003)        (0.038)      (0.045)
 Winner × Hot 5                                   0.024***
                                                  (0.006)
 Observations           1.42e+08     1.42e+08     1.42e+08       1.04e+07     1.01e+07
 R2                     0.053        0.053        0.053          0.347        0.485
 Mean of dep. variable  0.095        0.095        0.095          2.762        2.756
 Panel B: Marginal Effects
 Streak length 1        -0.012***    -0.013***    -0.016***      -0.033***    -0.061***
                        (0.001)      (0.001)      (0.001)        (0.010)      (0.010)
 Streak length 2        -0.023***    -0.028***    -0.028***      -0.021**     -0.018**
                        (0.001)      (0.001)      (0.001)        (0.010)      (0.008)
 Streak length 3        -0.036***    -0.027***    -0.034***      -0.017       -0.005
                        (0.004)      (0.002)      (0.002)        (0.012)      (0.009)
 Streak length 4                     -0.016***    -0.030***      -0.007       -0.058***
                                     (0.004)      (0.002)        (0.017)      (0.015)
 Streak length 5                                  -0.030***      -0.049       0.055
                                                  (0.004)        (0.039)      (0.038)
 Streak length 6                                  -0.030***
                                                  (0.005)

Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti and Denmark. Winner is
an indicator variable for the number being drawn in the previous round. Hot X is an indicator for the number being drawn X
times in rounds r − 2 to r − S. S is noted in the header for columns 1-3, and S = 6 for columns 4-5. Regressions in columns
1-4 include player, number, and round fixed effects. The regression in column 5, which matches a main specification in Suetens,
Galbo-Jørgensen and Tyran (2016), includes number fixed effects, a lagged dependent variable, and an indicator for jackpot
roll-over weeks. Relative to SGT, the only modification in column 5 is that hotness enters non-parametrically here, and linearly
in SGT. Standard errors in all regressions are clustered at the player level.

       Table 2 shows estimates from the fully non-parametric model, in which the independent
variables are dummy variables for all observed combinations of wins during the previous 6
rounds. Columns 1-3 are for Haiti, and Columns 4-6 are for Denmark. There is no pattern
of recent wins that increases average betting on a number in Haiti. Out of 37 coefficients in
column 1, 35 are statistically different from zero, and all of those are negative. Longer streaks
are almost universally associated with the largest reductions in betting in Haiti, confirming
that the pattern in column 1, Panel B, Table 1 is not an artifact of an outlier or single event.
       Columns 4-6 of Table 2 show the non-parametric estimates for Denmark. There are
six statistically significant effects for combinations that include a win in the previous round
(lag 1). All are negative, except for the weakly positive effect of winning in periods {1, 3,
5, 6}. This appears to be spurious, as the point estimates for wins in lags {1, 2, 3, 5}, {1,
2, 4, 5}, and {1, 2, 3, 4, 6} are all negative (if imprecise), and we know from column 4 of
Table 1 that the average effects of streaks of any length are never positive and statistically
significant.
       The central takeaway from our main results in both Haiti and Denmark is that players
tend to bet in accordance with the GF. There is no evidence of HHF betting or streak
switching on average. We find that the GF effect is stronger for longer streaks, but only
when using the shortest recall period in Haiti. These results are robust to the inclusion of a
lagged dependent variable to account for entry and exit related to wins by preferred numbers
(Appendix A), and to defining the dependent variable as the extensive margin decision to
play a number (Appendix B).

5.1     Reconciling our Results with SGT

SGT use the System Lotto data from Denmark to test predictions very similar to those
tested here. They claim to find evidence of streak switching. Specifically, they conclude
that “players tend to bet less on numbers that have been drawn in the preceding week, as
suggested by the ‘gambler’s fallacy’, and bet more on a number if it was frequently drawn in
the recent past, consistent with the ‘hot-hand fallacy’” (Suetens, Galbo-Jørgensen and Tyran,
2016). Why do our analyses arrive at different conclusions about streak switching from the
same data set?
       There are a few differences in the empirical specifications used by us and SGT, all of
which likely matter to some degree. But the fundamental difference arises from SGT’s decision
to impose linearity in the possible effect of Hotness on the amount bet on a number. In
specification (3), the hotness of a number enters our regression equation non-parametrically.
SGT force that relationship to be linear. Because the large majority of streaks are short
(Table 2, column 6), the slope coefficient on Hotness in SGT is determined primarily by
changes that occur between streaks of length 0, 1, and 2. In Table 1, column 4, Panel B,
the coefficient on a streak of length 1 is -0.033, and that on a streak of length 2 is -0.021.
Extrapolating this line linearly, one would eventually find that long streaks lead to hot
hand betting, which looks like streak switching. But as our non-parametric analysis reveals,
such extrapolation would be spurious. Gamblers in Denmark do not react systematically to
winning streaks longer than 2.
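
The extrapolation argument can be illustrated with simple arithmetic in Python. The sketch draws a straight line through the two marginal effects quoted above from Table 1, column 4, Panel B. It is a two-point illustration of the logic only, not a re-estimation of the regression in SGT, and the function name `extrapolated` is hypothetical.

```python
# Marginal effects for Denmark from Table 1, column 4, Panel B.
nu_1 = -0.033  # streak of length 1
nu_2 = -0.021  # streak of length 2

# Slope of the line through the two points: change per extra unit of streak length.
slope = nu_2 - nu_1

def extrapolated(d):
    """Effect at streak length d implied by extending the line linearly."""
    return nu_1 + slope * (d - 1)

# Streak length at which the extrapolated line crosses zero.
crossing = 1 - nu_1 / slope

for d in range(1, 6):
    print(d, round(extrapolated(d), 3))
print(round(crossing, 2))
```

The line crosses zero between streak lengths 3 and 4, so a linear fit implies hot hand betting for streaks of length 4 or more. The non-parametric estimates in the same column (−0.017, −0.007, and −0.049 for lengths 3, 4, and 5) remain negative.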
        To further verify this interpretation, we re-estimate SGT’s main specification, with
the only difference that we relax their linearity assumption.13 To match their model we omit
player and round fixed effects from equation (3), include a lagged dependent variable, and
include a control for rollover jackpots. Column 5 of Table 1 shows the results. The implied
“slope” of the line between streaks of length 1 and 2 is steeper than in our preferred specifi-
cation (compare columns 4 and 5, panel B). It is easy to see how linear extrapolation of these
coefficients would generate the appearance of streak switching. Yet, any such appearance
would be spurious. There are three statistically significant marginal effects in column 5, and
all are negative. The coefficient on streaks of length 5 is positive, but not statistically differ-
ent from zero (and is identified by only a single instance in which a number achieves a streak
of 5). Relaxing linearity in the SGT model removes any indication of streak switching.14

  13
      Here we focus on SGT’s analysis of “active players.” In other models they impose balance on the panel
by assigning zeroes for all numbers during non-played rounds. We do not estimate such models for Haiti or
for Denmark.
   14
      See Appendix C for a discussion of other differences between our approach and the SGT analysis.

Table 2: The Effects of Winning Streaks on Betting: Non-Parametric

 Dependent variable:              Amount of money bet by player i on number n in round r
                                                  HAITI                                           DENMARK
                                               Standard       Number of                           Standard       Number of
                                  Coefficient  error          occurrences           Coefficient   error          occurrences
                                  (1)          (2)            (3)                   (4)           (5)            (6)
 Win in lag 1                     -0.012***    0.001          1867                  -0.032***     0.010          61
 Win in lag 2                     -0.037***    0.001          1870                  -0.030***     0.009          65
 Win in lag 3                     -0.034***    0.001          1870                  -0.009        0.007          60
 Win in lag 4                     -0.032***    0.001          1867                  -0.003        0.006          63
 Win in lag 5                     -0.027***    0.001          1864                  -0.002        0.006          65
 Win in lag 6                     -0.026***    0.001          1852                  0.002         0.006          60
 Win in lag 1,   2                -0.016***    0.002          52                    -0.012        0.013          16
 Win in lag 1,   3                -0.023***    0.002          56                    -0.030**      0.012          21
 Win in lag 1,   4                -0.027***    0.003          54                    -0.033**      0.013          17
 Win in lag 1,   5                -0.020***    0.002          58                    -0.024*       0.014          15
 Win in lag 1,   6                -0.030***    0.002          58                    -0.008        0.013          21
 Win in lag 2,   3                -0.036***    0.002          51                    -0.014        0.012          15
 Win in lag 2,   4                -0.036***    0.002          55                    -0.004        0.013          18
 Win in lag 2,   5                -0.042***    0.002          52                    -0.026**      0.012          19
 Win in lag 2,   6                -0.037***    0.002          59                    -0.007        0.012          16
 Win in lag 3,   4                -0.040***    0.001          50                    -0.013        0.012          15
 Win in lag 3,   5                -0.036***    0.002          55                    0.012         0.011          20
 Win in lag 3,   6                -0.039***    0.002          54                    -0.010        0.010          23
 Win in lag 4,   5                -0.032***    0.001          49                    0.001         0.011          14
 Win in lag 4,   6                -0.035***    0.002          58                    0.022**       0.010          17
 Win in lag 5,   6                -0.029***    0.001          49                    -0.002        0.011          15
 Win in lag 1,   2,   3           -0.012       0.009          1                     -0.018        0.020          4
 Win in lag 1,   2,   4           0.005        0.014          2                     -0.019        0.017          6
 Win in lag 1,   2,   5           -0.047***    0.011          2                     -0.013        0.020          3
 Win in lag 1,   2,   6                                                             -0.016        0.034          1
 Win in lag 1,   3,   4           -0.050***    0.005          3                     -0.021        0.023          3
 Win in lag 1,   3,   5                                                             -0.021        0.019          4
 Win in lag 1,   3,   6           -0.047***    0.009          1                     0.013         0.033          2
 Win in lag 1,   4,   5           -0.039***    0.007          3                     -0.003        0.019          6
 Win in lag 1,   4,   6                                                             -0.022        0.018          5
 Win in lag 1,   5,   6           -0.085***    0.009          1                     -0.049**      0.023          3
 Win in lag 2,   3,   4           -0.045***    0.005          1                     -0.041*       0.022          3
 Win in lag 2,   3,   5           -0.045***    0.013          2                     0.004         0.023          3
 Win in lag 2,   3,   6           -0.056***    0.012          2                     -0.015        0.020          3
 Win in lag 2,   4,   5           -0.056***    0.005          3                     -0.028        0.026          2
 Win in lag 2,   4,   6                                                             0.011         0.020          5
 Win in lag 2,   5,   6           -0.059***    0.005          3                     -0.002        0.018          7
 Win in lag 3,   4,   5           -0.028***    0.006          1                     0.012         0.018          5
 Win in lag 3,   4,   6           -0.043***    0.014          2                     -0.043**      0.019          4
 Win in lag 3,   5,   6           -0.053***    0.005          3                     0.053         0.034          1
 Win in lag 4,   5,   6           -0.041***    0.005          1                     0.000         0.018          5
 Win in lag 1,   2,   3,   5                                                        -0.028        0.023          3
 Win in lag 1,   2,   4,   5                                                        -0.017        0.034          1
 Win in lag 1,   2,   4,   6                                                        0.003         0.030          1
 Win in lag 1,   3,   5,   6                                                        0.065*        0.035          1
 Win in lag 1,   4,   5,   6                                                        -0.013        0.038          1
 Win in lag 2,   3,   4,   5                                                        0.015         0.044          1
 Win in lag 2,   3,   4,   6                                                        0.003         0.025          2
 Win in lag 2,   3,   5,   6                                                        -0.097***     0.035          1
 Win in lag 3,   4,   5,   6                                                        -0.015        0.037          1
 Win in lag 1,   2,   3,   4, 6                                                     -0.049        0.040          1
 Observations                     142462600                                         10379916
 R2                               .053                                              .347

Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti and Denmark. All regressions
include player, number, and round fixed effects, with standard errors clustered at the player level. Column 3 reports the count
of times over 73,000 observed number-rounds in Haiti that a number was a winner in the listed combination of lags. Column 6
reports the equivalent counts over 1,008 observed number-rounds in Denmark. Columns 1-3 are left blank for combinations of
lagged wins that are never observed in Haiti.
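
The lag-combination indicators and the counts in Columns 3 and 6 can be illustrated with a minimal sketch. It assumes that each row of the table refers to the exact set of lags, within a six-round window, at which a number won; the window length is inferred from the table rows rather than stated in the notes. The function names and the toy draw history are hypothetical and are not taken from the authors' code or data.

```python
from collections import Counter

def lag_combination(draws, number, r, max_lag=6):
    """Return the tuple of lags (1..max_lag) at which `number` won before round r."""
    return tuple(
        lag for lag in range(1, max_lag + 1)
        if r - lag >= 0 and number in draws[r - lag]
    )

def count_combinations(draws, numbers, max_lag=6):
    """Count how often each lag combination occurs over all number-rounds."""
    counts = Counter()
    # Only use rounds that have a full window of max_lag earlier rounds.
    for r in range(max_lag, len(draws)):
        for n in numbers:
            counts[lag_combination(draws, n, r, max_lag)] += 1
    return counts

# Toy draw history (hypothetical): each entry is the set of winners in one round.
draws = [{1, 2}, {2, 3}, {1, 3}, {2, 4}, {1, 2}, {3, 4}, {1, 4}, {2, 3}]
counts = count_combinations(draws, numbers=range(1, 5))

# Number 2 in round 6 won in lags 2, 3, 5, and 6.
print(lag_combination(draws, 2, 6))
```

In a regression, each observed combination would enter as its own binary variable, so a combination that never occurs in the data (a blank row for Haiti) has no coefficient to estimate.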
6    Conclusion
This paper uses large natural experiments in different settings to test predictions of the
Rabin and Vayanos (2010) model of streak switching. In our context, the DGP is i.i.d.,
but players may not believe that it is, leaving open the possibility of complex biases. We
find broad support for the foundational conditions that make streak switching possible, but
we reject the idea that the average player believes the lottery is not i.i.d. In both Haiti
and Denmark, the average player bets in accordance with the GF by avoiding numbers that
recently won, and never bets the hot hand. We also find evidence of ideological attachment
to the HHF by a small share of players. In Haiti (Denmark), 6.3% (15.7%) of players bet
recent winners more frequently than would be predicted by chance. This identification of two
distinct player types in high-frequency administrative data may provide new foundations for
models in which players of finite types make (potentially biased) choices that have important
influence on aggregate outcomes.
       Perhaps the most striking aspect of these results is their qualitative similarity in
Haiti and Denmark, despite the many differences between these countries. This suggests
that ideological attachment to the GF, which is a precondition for streak switching, is a
deeply rooted aspect of human cognition. While applications from finance in rich countries
motivated the formulation of streak switching models such as Rabin and Vayanos (2010), we
speculated in the introduction that streak switching may help us understand the evolution
of beliefs about climate change, racial bias, and online dating. If baseline attachment to
the GF is as universal as is implied by our findings, it seems all the more likely that streak
switching could be present in these other choice domains, where agents may be even less
likely to believe that the DGP is i.i.d. (and where it may indeed not be). Applying the
streak switching framework to the study of choice and belief formation in other real-world
contexts is a promising avenue for future research.

References
Benjamin, Daniel J. 2019. “Errors in probabilistic reasoning and judgment biases.” In Handbook of
  Behavioral Economics: Applications and Foundations 1. Vol. 2, 69–186. Elsevier.

Bernstein, Rachel L. 2015. “In Pursuit of the Transformational Sum: Lottery and Savings in Haiti.”
  University of California, Davis.

Bhatia, Pooja. 2010. “Dream Ticket.” The National, Friday, April 2: 3–5.

Camerer, Colin F. 1989. “Does the Basketball Market Believe in the ‘Hot Hand,’?” The American Economic
  Review, 79(5): 1257–1261.

Clotfelter, Charles T, and Philip J Cook. 1993. “Notes: The “gambler’s fallacy” in lottery play.”
  Management Science, 39(12): 1521–1525.

Deeg, K, E Lyon, A Leiserowitz, E Maibach, and J Marlon. 2019. “Who is changing their mind
  about global warming and why?” Yale University and George Mason University. New Haven, CT: Yale
  Program on Climate Change Communication.

Edwards, Ward. 1961. “Probability learning in 1000 trials.” Journal of Experimental Psychology,
  62(4): 385.

Farrell, Lisa, Roger Hartley, Gauthier Lanot, and Ian Walker. 2000. “The demand for lotto: the
  role of conscious selection.” Journal of Business & Economic Statistics, 18(2): 228–241.

Judson, Ruth A, and Ann L Owen. 1999. “Estimating dynamic panel data models: a guide for
  macroeconomists.” Economics letters, 65(1): 9–15.

Kahneman, Daniel, Paul Slovic, and Amos Tversky. 1982. Judgment under
  uncertainty: Heuristics and biases. Cambridge university press.

Lien, Jaimie W, and Jia Yuan. 2015. “The cross-sectional “Gambler’s Fallacy”: Set representativeness
  in lottery number choices.” Journal of Economic Behavior & Organization, 109: 163–172.

Miller, Joshua B, and Adam Sanjurjo. 2018. “Surprised by the hot hand fallacy? A truth in the law
 of small numbers.” Econometrica, 86(6): 2019–2047.

Nickell, Stephen. 1981. “Biases in dynamic models with fixed effects.” Econometrica: Journal of the
  econometric society, 1417–1426.

Papachristou, George, and Dimitri Karamanis. 1998. “Investigating efficiency in betting markets:
  Evidence from the Greek 6/49 Lotto.” Journal of Banking & Finance, 22(12): 1597–1615.

Rabin, Matthew. 2002. “Inference by believers in the law of small numbers.” The Quarterly Journal of
  Economics, 117(3): 775–816.

Rabin, Matthew, and Dimitri Vayanos. 2010. “The gambler’s and hot-hand fallacies: Theory and
  applications.” The Review of Economic Studies, 77(2): 730–778.

Roger, Patrick, and Marie-Hélène Broihanne. 2007. “Efficiency of betting markets and rationality of
  players: evidence from the French 6/49 lotto.” Journal of Applied statistics, 34(6): 645–662.

Schetzer, Alana. 2017. “Dating burnout: The fallout from serial online dating disappointment.” SBS
  Australia. 2 May 2017.

Scoggins, John F. 1995. “The lotto and expected net revenue.” National Tax Journal, 61–70.

Suetens, Sigrid, and Jean-Robert Tyran. 2012. “The gambler’s fallacy and gender.” Journal of
  Economic Behavior & Organization, 83(1): 118–124.

Suetens, Sigrid, Claus B Galbo-Jørgensen, and Jean-Robert Tyran. 2016. “Predicting lotto
  numbers: a natural experiment on the gambler’s fallacy and the hot-hand fallacy.” Journal of the European
  Economic Association, 14(3): 584–607.

Terrell, Dek. 1994. “A test of the gambler’s fallacy: Evidence from pari-mutuel games.” Journal of risk
  and uncertainty, 8(3): 309–317.

Tesler, Michael. 2020. “The Floyd protests will likely change public attitudes about race and policing.
  Here’s why.” Washington Post. 5 June 2020.

Wang, Tong V, Rogier Jan Dave Potter van Loon, Martijn J Van den Assem, and Dennie
 Van Dolder. 2016. “Number preferences in lotteries.” Judgment and Decision Making, 11(3): 243–259.

Online Appendix
    “The Gambler’s Fallacy Prevails in Lottery Play”
                             Brian Dillon and Travis J. Lybbert

A     Robustness: Including a lagged dependent variable
Many lottery players have favorite numbers, which they play often. It is conceivable that
intermittent play combined with strong number preferences could be responsible for the
smaller magnitude single-lag effects in Figure 4, relative to the effects of wins that occurred
two or more rounds in the past. The idea is that some players may choose to play in a
particular round specifically because one of their favored numbers was drawn in the previous
round. While this behavior has the flavor of hot hand play, other interpretations are possible.
A player who sees his favorite number win might simply be reminded of the game and the fun
of gambling. Or, he may choose to interpret a win by his favorite number as a signal that it
is time to play again, even if he does not believe that the odds of winning are mutable. The
player fixed effects in our main analysis do not fully account for these possible behaviors.
Player fixed effects control for time invariant number preferences and for players that never
change their bets, but they do not control for the extensive margin decision to play a favorite
number in response to recent events.
       To account for this potential mechanism, we re-estimate all of our main specifications
with the inclusion of a lagged dependent variable as an additional independent variable.
Specifically, when the dependent variable is $Played_{inr}$, we define the lagged dependent variable
as the value of $Played_{inq}$, where $q < r$ is the last round in which player $i$ placed a bet
(not necessarily the round immediately prior to $r$). In an unbalanced panel with a short $T$
for many players, including this lagged dependent variable could bias all coefficients, though
to our knowledge the exact form of the bias in this circumstance is not known (Nickell, 1981).
We provide this analysis for robustness, but are cautious in our interpretations due to the
potential for Nickell bias, and for this reason prefer the results in the main tables, which do
not include a lagged dependent variable.
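
As a minimal sketch of how such a lagged dependent variable might be constructed for a single player, the function below assigns to each number the value it took in the player's most recent active round, which need not be the immediately preceding round. The function name, the input layout (a mapping from active rounds to the set of numbers bet), and the toy data are assumptions for illustration only, not the authors' implementation.

```python
def lagged_played(bets_by_round, numbers):
    """Build player-number-round rows with a lagged dependent variable.

    bets_by_round: dict mapping round index -> set of numbers the player bet on,
                   containing only rounds in which the player was active.
    The lag for round r is taken from the last active round q < r.
    """
    rows = []
    prev_bets = None  # no earlier active round yet
    for r in sorted(bets_by_round):
        for n in numbers:
            rows.append({
                "round": r,
                "number": n,
                "played": int(n in bets_by_round[r]),
                # Undefined (None) in the player's first active round.
                "lag_played": None if prev_bets is None else int(n in prev_bets),
            })
        prev_bets = bets_by_round[r]
    return rows

# Hypothetical player active in rounds 3, 4, and 9 only.
bets = {3: {7, 11}, 4: {7}, 9: {11}}
rows = lagged_played(bets, numbers=[7, 11])
for row in rows:
    print(row)
```

In round 9 the lag is drawn from round 4, the player's last active round, rather than from round 8. Rows from a player's first active round have no lag and would drop out of the estimation sample.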
       Figure S1, Table A, and Table A contain results analogous to those in Figure 4,
Table 1, and Table 2 in the main paper, augmented with the inclusion of a lagged dependent
variable. In all cases, the coefficient on the lagged dependent variable is highly statistically
significant, and lies between 0 and 1 (not reported). Of greater interest to us is the stability
of the estimated coefficients on variables representing the recent winning history of a number.
Those estimates are broadly consistent with the main tables. Level effects change with the
inclusion of the lagged dependent variable, but the signs, relative magnitudes, and patterns
of statistical significance are qualitatively similar to our main results. Player attachments
to specific numbers may be responsible for some periodic entry and exit from these lottery
games, but not so much that they alter our main conclusions about the gambler’s fallacy
and the lack of streak switching.

[Figure S1. Four panels, each plotting the estimated coefficient with 95% C.I. (vertical axis) against the lag (horizontal axis): A. Haiti, effect of any past win; B. Haiti, effect of most recent win; C. Denmark, effect of any past win; D. Denmark, effect of most recent win.]
Figure S1: Effect of prior wins on amount bet on a number, with lagged dependent variable
Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti and Denmark. Each panel
represents one regression. Figures show OLS coefficients with 95% confidence intervals from regressions of the amount bet at
the player-number-round level on a set of binary variables that describe the winning history of the number over the lagged
rounds displayed on the horizontal axes. All regressions include player, number, and round fixed effects, as well as the value of
the dependent variable from the last period in which the player was active, with standard errors clustered at the player level.
The Boloto game in Haiti is played twice per day (60 rounds = 30 days). The System Lotto game in Denmark is played once
per week (4 rounds = 28 days). The relative magnitudes of estimated coefficients are much larger in Haiti, where the mean
value of the dependent variable is 0.094, than they are in Denmark, where the mean of the dependent variable is 2.762.
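
The comparison of relative magnitudes in the notes can be made concrete by scaling a coefficient by the mean of the dependent variable. The sketch below uses the two means reported in the notes; the coefficient value is hypothetical, chosen only to lie within the range of the plotted axes, and is not an estimate from the paper.

```python
def relative_effect(coefficient, mean_dep_var):
    """Express a coefficient as a share of the dependent variable's mean."""
    return coefficient / mean_dep_var

MEAN_HAITI = 0.094    # mean of the dependent variable, from the figure notes
MEAN_DENMARK = 2.762  # mean of the dependent variable, from the figure notes

hypothetical_coef = -0.02  # illustrative value only, not an estimate from the paper

rel_haiti = relative_effect(hypothetical_coef, MEAN_HAITI)
rel_denmark = relative_effect(hypothetical_coef, MEAN_DENMARK)
print(f"Haiti: {rel_haiti:.1%} of the mean, Denmark: {rel_denmark:.1%} of the mean")
```

Under this assumption, the same absolute coefficient is roughly a fifth of the mean in Haiti but under one percent of the mean in Denmark.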
