The Gambler's Fallacy Prevails in Lottery Play - Brian Dillon
The Gambler’s Fallacy Prevails in Lottery Play∗†
Brian Dillon∗‡ Travis J. Lybbert‡
July 19, 2021

Abstract
We use large natural experiments in Haiti and Denmark to test recent theoretical predictions about how agents react to random events. Using player-level administrative data from lotteries, we find that the average player avoids recent winners (the gambler’s fallacy). A small share of players exhibit the hot hand fallacy, and bet recent winners. We find no evidence of ‘streak switching,’ in which beliefs switch from the gambler’s fallacy to the hot hand fallacy as winning streaks grow. The consistency of findings across these different settings suggests that the cognitive underpinnings of the gambler’s fallacy are deeply rooted in human cognition.

Keywords: Gambler’s fallacy; hot hand fallacy; lottery; law of small numbers; Haiti; Denmark.
JEL codes: D91, D84, G41.

∗† We are grateful to Hilary Wething and Ben Glasner for excellent research assistance, and to Chris Barrett, Dan Benjamin, Supreet Kaur, Alex Rees-Jones, and Bruce Wydick for comments on an earlier draft. Any errors are our responsibility.
∗‡ Cornell University. Email: bmd28@cornell.edu.
‡ University of California, Davis. Email: tlybbert@ucdavis.edu.
1 Introduction
People struggle to form correct statistical intuitions. Two common mistakes, the gambler’s fallacy (GF) and the hot hand fallacy (HHF), demonstrate the difficulty of understanding stochastic processes. The GF, the belief that an event drawn from an independent, identically distributed (i.i.d.) process is less likely to be drawn immediately after it wins (the expectation of reversals), betrays a mistaken sense that small samples should look like large samples (Kahneman et al., 1982; Rabin, 2002). The HHF is the belief in too much serial correlation, namely that an event is more likely to be drawn if it was drawn in the recent past (the expectation of persistence). A large literature has documented the prevalence of these biases and examined their consequences for choice and belief formation (see Benjamin (2019)).

The GF and the HHF may appear to be mutually exclusive, because they suggest opposite reactions to recent events. However, theory allows for a possible relationship between these biases (Edwards, 1961; Camerer, 1989; Rabin, 2002; Rabin and Vayanos, 2010). Consider an agent predisposed to the GF. If they observe an outcome drawn repeatedly from a random process, they believe it is less likely to be drawn again. If the streak continues, at some point the agent sees this as so implausible that they believe the data generating process (DGP) must favor that outcome. They now expect persistence. Such “streak switching” is one case in a broader model of belief formation developed by Rabin and Vayanos (2010). Streak switching requires uncertainty about the DGP.

Streak switching has largely been motivated by applications in finance (Rabin and Vayanos, 2010; Suetens, Galbo-Jørgensen and Tyran, 2016). Yet its central features (an inclination to expect reversals, combined with a tendency to over-interpret the signal when reversals do not happen) align with anecdotal evidence from many settings.
Strange weather events can be dismissed as anomalies, but frequent aberrations tend to make people more concerned about climate change (Deeg et al., 2019). Increased media coverage of police killings of black men has coincided with a rise in the share of whites who believe that black people in the US face “a lot of discrimination” (Tesler, 2020), possibly because attributing violence to rogue officers is easier after news of a single event (a short streak) than after repeated coverage of many such events (a long streak). Unjustified optimism about online dating can lead individuals to dismiss initial negative experiences, but a string of bad experiences may lead the same users to lower their expectations so far that they give up on dating services altogether (Schetzer, 2017). These are complex scenarios, consistent with various models of belief formation. Yet, they highlight the difficulty of forming accurate statistical intuitions, and suggest that streak switching may be present in many domains.

We test for foundational behaviors related to the GF, the HHF, and streak switching in large administrative data sets from a mobile phone-based lottery in Haiti and an online lottery in Denmark. Lotteries represent massive natural experiments, with known DGPs and repeated choices, outside the realm of finance. Our empirical analysis proceeds in three steps, each of which tests an assumption or prediction of the Rabin and Vayanos model. First, we use randomization inference to estimate the share of players who choose “hot” numbers (those that were winners in the previous round) more frequently than the random rate. This addresses the fundamental assumption in Rabin and Vayanos that the agent has a default GF bias. Second, we test whether the gambler’s fallacy predominates when there are no streaks or short streaks. Third, we use a semi-parametric approach to test for streak switching, i.e., to allow for heterogeneous reactions to winning streaks of different lengths. Leveraging the large scale and individually identifiable nature of the data, all of our regressions control for player, number, and round fixed effects.

In Haiti, we find that 6.3% of players choose recent winners more often than suggested by chance. In Denmark, the share of these hot-hand types is higher, at 15.7%. However, in both settings the average player reduces betting on recent winners, indicating that the dominant average tendency is to succumb to the gambler’s fallacy. In Haiti, betting on a number falls by over a third after a recent win.
The largest magnitude effect is for numbers that won two rounds ago, not one round ago, because the immediate deterrent effect of winning is attenuated by the bets of the hot-hand types. In Denmark, the average player also avoids recent winners. The effect is statistically significant but much smaller in magnitude than in Haiti, equivalent to a reduction of less than one percent of the mean bet. This could be due to differences in the relative sizes of player choice sets in the two lotteries (see Section 3.4). In both countries, the deterrent effect of a win decays gradually with time, but persists for several weeks.
We find no evidence in either country of a switch to HHF betting after longer streaks. In one of the main specifications for Haiti we find the opposite, namely that the deterrent effect of a streak increases monotonically in streak length. These findings are consistent with the predictions of Rabin and Vayanos (2010) when the DGP is known to be i.i.d. Popular narratives around the lottery suggest that many players believe lottery odds are mutable and not i.i.d. (see Bhatia (2010) and Section 2), which would satisfy the conditions for possible streak switching. While these narratives surely reflect the beliefs of some lottery players, they are not borne out in the average response to wins. The average player seems to believe that draws are i.i.d., but that small and large samples should look similar (hence the default GF bias) (Rabin, 2002).

These findings contribute to the literature in three ways. First, we advance a long line of work on belief formation that has used aggregate lottery or casino data to understand how players react en masse to winning streaks (Clotfelter and Cook, 1993; Terrell, 1994; Scoggins, 1995; Papachristou and Karamanis, 1998; Farrell et al., 2000; Roger and Broihanne, 2007). With aggregate lottery data, it is impossible to distinguish within-player changes in betting from changes in the composition of the player pool. Some recent work has used individually identifiable lottery data to examine the role of number preferences and recent wins in shaping choice (Suetens and Tyran, 2012; Lien and Yuan, 2015; Wang et al., 2016; Suetens, Galbo-Jørgensen and Tyran, 2016). Only one of these studies, Suetens, Galbo-Jørgensen and Tyran (2016), is concerned with streak switching.
Suetens, Galbo-Jørgensen and Tyran (2016) find suggestive evidence of streak switching in lottery data from Denmark; in Section 5.1 we explain why we arrive at a different conclusion using the same Denmark data set.[1]

Second, we present concrete evidence of two distinct player types in lottery play. The typical player in both countries is susceptible to the GF, yet a small but non-negligible share of players subscribe to the HHF. Future work on belief formation would benefit from incorporating these two types and modeling how their relative shares in the population influence aggregate outcomes in settings of choice under uncertainty.[2]

[1] A more detailed comparison with Suetens, Galbo-Jørgensen and Tyran (2016), hereafter SGT, is in Appendix C. In Section 3 we explain why the Haiti data set, which is novel to this paper, is somewhat more conducive than the Denmark data to testing the predictions of Rabin and Vayanos (2010).
[2] Throughout the analysis we ignore the possibility of streak selection bias identified by Miller and Sanjurjo (2018). Their central insight is that when streaks are defined as consecutive successes from a binary process, the conditional probability that a streak continues is smaller than the unconditional probability of success. This bias can lead to incorrect rejection of the HHF in some settings. There are two reasons why streak selection bias is not relevant for our analysis. The first and foremost is that we do not define streaks as consecutive successes. Second, even if we ignore that fact, the lottery DGP represents many simultaneous Bernoulli trials from a single process. Players effectively observe multiple sequences from the “same coin.” Miller and Sanjurjo (2018) show in their Appendix A.2 that the bias is vanishingly small when the agent observes multiple sequences.

Finally, our third contribution lies in the parallel analyses of lottery choices across two very different settings. While the Haiti and Denmark games are not identical (see Section 3.4), they share many of the same broad features. By many economic metrics, these two countries are as different as can be. The consistency of findings across these settings suggests that the cognitive underpinnings of the gambler’s fallacy are deeply rooted in human cognition.

2 Conceptual Background
The model of Rabin and Vayanos (2010) explains how the HHF “might arise as a consequence of the GF” (p. 731). The agent in the model is dogmatically predisposed to the GF, observes a sequence of draws from a DGP, and uses Bayesian inference to form beliefs about parameters. When the agent is uncertain about the DGP or believes that draws are serially dependent, they expect reversals after short streaks (the GF) but persistence after long streaks (the HHF). In contrast, if the DGP is i.i.d. and the agent is told this, “they expect reversals after streaks of any length” (p. 751).

Lotteries would seem to violate the conditions required for streak switching. The DGP is fixed, and draws are i.i.d. Yet, popular narratives reveal that many players believe that lottery odds are mutable. For many Haitian players, superstitions about the lottery are part of a broader religious worldview in which divine intervention and fate have direct bearing on daily life (Bhatia, 2010). In the US, there is a cottage industry of books that peddle secrets for beating the lottery, many of which promote combinations of both GF and HHF reasoning. Because it is beliefs about the DGP, rather than the true DGP, that are relevant for choice, it is plausible that even in the lottery setting we might find evidence of streak switching.

These considerations motivate our two research questions. First, in the absence of streaks, do lottery players demonstrate a default attachment to the GF? This addresses the maintained assumption about player types that underlies the Rabin and Vayanos (2010) model. Second, do superstitions or other beliefs lead players to ignore the i.i.d. nature of lottery draws, succumb to streak switching, and bet “hot” numbers after long streaks? Evidence of streak switching in our lottery contexts would imply that the average player does not believe the lottery is i.i.d.

3 Lottery Details and Data
In this section we describe the Haiti and Denmark data sets, analyze the rates of hot-hand play, discuss the differences in the two lotteries, and explain how we define the main dependent variables.

3.1 The Boloto Mobile Phone Lottery in Haiti
The lottery is part of the rhythm of daily life in Haiti. Millions of Haitians play frequently, and some of the working poor routinely wager a large share of their daily income (Bernstein, 2015). Players often select numbers based on superstitions and dreams. Concordances known as the Tchala, which are available at every lottery stall and online,[3] translate elements of dreams into numbers. While most games are administered by physical lotto stalls, digital lotteries played on mobile phones have gained popularity in recent years, particularly among younger Haitians in urban and peri-urban areas.

We study a mobile phone lottery game called Boloto. Like all Haitian lottery games, Boloto is played twice each day, corresponding to the midday and evening numbers drawn in the New York Lottery (to ensure transparency). To participate, a player places a bet consisting of three two-digit number pairs (00-99) in a specified order. The cost of each bet is 25 Haitian gourdes (HTG), or about 0.60 USD in 2012. There is no limit to the number of bets a single player can make in each round. To bet more money on a set of numbers, a player simply places additional bets.
The payout for Boloto is a function of which number matches the draw.[4]

[3] For an online version of the Tchala see http://lisa.ht/tchala/ (Accessed 21 November 2018).
[4] Winning in the first, second, or third position pays out 250 HTG (10x), 100 HTG (4x) or 50 HTG (2x), respectively. Picking all three winners, out of order, pays 100,000 HTG (4,000x). Picking all three winners in order wins the jackpot, which pays out 2,000,000 HTG (80,000x). Payouts are independent across players, except if there are multiple jackpots, in which case the winners split the payout. Jackpot splitting is never observed (there are only a few jackpots in the data, and none are shared). We assume that the possibility of splitting a jackpot does not shape individual number choices.

[Figure 1 plots omitted. Panel A: Number of bets by round, Haiti (x-axis: date; y-axis: number of bets). Panel B: Histogram of numbers played, Haiti (x-axis: number played; y-axis: density), with labeled bars 10: 4.7%, 0: 2.7%, 11: 2.5%, 33: 2.4%, 13: 2.1%.]
Figure 1: Betting patterns in the Haiti Boloto
Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti. Panel A shows number of bets placed over 730 rounds of play (twice per day, for a year) in Haiti. Each bet in Haiti consists of three numbers from set {0, 1, ..., 99}; Panel B shows the histogram of numbers played.

The private firm that conducts the Boloto provided us with data for the universe of bets placed from February 1, 2012 to January 31, 2013.[5] For each bet we observe the player ID, the date of the game, an indicator for midday/evening round, the ordered set of three two-digit number selections, the time and date of the bet, and the winning numbers. Player IDs are unique codes linked to mobile phone accounts. Across the 730 rounds (2 per day, for a year), a total of 4,505,519 bets were placed, representing over 13.5 million number choices. We observe bets from 112,808 different players. The average player makes 39.9 bets, in 12.7 different rounds, in 2 different months, on 9.1 different days.

Panel A of Figure 1 shows the number of bets placed in Boloto, by round.[6] A weekly cycle of activity is clearly visible. Increased activity during the months July–October might be associated with summer visits from Haitians living abroad, or with positive income shocks from the September-October harvest. Panel B of Figure 1 shows the histogram of numbers played. The most popular choice, 10, represents 4.7% of all plays. The four next most popular numbers are 00, 11, 33, and 13, all of which are played at more than twice the random rate.

[5] There are 366 days in that range, because 2012 was a leap year, but only 365 days with betting (there is no data for Christmas Eve).
[6] The spike on October 17 coincides with Dessalines Day, a national holiday commemorating the assassination of Haiti’s founder.

3.2 The System Lotto Online Lottery in Denmark
System Lotto is a weekly, online lottery game in Denmark.[7] In this game, seven winning numbers are drawn each week, without replacement, from the positive integers 1, . . . , 36.[8] There are five categories of winnings corresponding to choosing different shares of the drawn numbers. The jackpot, which is won by picking all seven numbers, pays out 11.25% of total ticket revenues. Unclaimed jackpots are rolled over to the next round. Payouts in System Lotto are pari-mutuel (divided among all winners in each category).

To participate in System Lotto, players select between 8 and 31 numbers from the set {1, 2, . . . , 36}. The online system then randomly selects 7 of those numbers to be the player’s bet. Each bet costs 3 Danish krone (DKK), or about 0.48 USD at the time. Players can increase the wager on a number by purchasing more tickets and/or choosing fewer numbers per bet. They can also select more than the maximum 31 numbers per round by purchasing multiple tickets and varying their number choices across tickets.

In administrative data from 28 weeks of System Lotto play in 2005, provided by SGT, we observe at least one bet by 25,807 players. The data set includes a unique ID for each player.[9] The average player participates in 11.2 of the 28 observed weeks, makes 33.1 separate ticket purchases per round, and chooses 13.5 different numbers per round (ranging from 8 to 36). Panel A of Figure 2 shows the number of bets per round in System Lotto.
The two spikes in play correspond to periods with consecutive jackpot rollovers and hence higher potential winnings.

[7] All of the details we provide about the System Lotto game structure are drawn from SGT. See their paper for more details about the game and about lotteries in Denmark.
[8] Although numbers are drawn without replacement, the lottery is i.i.d. from the player’s perspective, because no choices are made between the draws of a single round.
[9] The data is available, for the purpose of replicating SGT, from the website of the Journal of the European Economic Association.

[Figure 2 plots omitted. Panel A: Number of bets by round, Denmark (x-axis: date; y-axis: number of bets). Panel B: Histogram of numbers played, Denmark (x-axis: number played; y-axis: density), with labeled bars 7: 3.2%, 1: 3.1%, 13: 3.1%, 19: 3.0%.]
Figure 2: Betting patterns in the Denmark System Lotto
Notes: Authors’ calculations from individually identifiable, administrative lottery data from Denmark. Panel A shows the number of bets per round across 28 weekly rounds in Denmark. Each bet in Denmark consists of 8-31 numbers from set {1, 2, ..., 36}; Panel B shows the histogram of numbers played.

Panel B of Figure 2 shows the histogram of numbers played. The most popular choice, 7, represents 3.2% of all plays. Other popular numbers are 13, 1, and 19. There is much less variation in the relative popularity of numbers in Denmark than in Haiti (comparing Panel B of Figure 1 with Panel B of Figure 2), possibly because players in Denmark must at minimum choose 8 of 36 possible options each round (22%), while players in Haiti can choose as few as 1 of 100 possible options per round (1%).

3.3 Analyzing rates of hot hand play
We use randomization inference to test the null hypothesis that each player chooses hot numbers (those that were winners in the previous round) at a rate less than or equal to that which would be consistent with randomly picking numbers. We interpret a rejection of this hypothesis as an indication that the player is to some extent susceptible to the hot hand fallacy, or has some other preference for hot numbers unrelated to beliefs about their winning probability. See Appendix D for details on the implementation of these tests.

Panel A of Figure 3 shows a player-level histogram of p-values from a test of the null hypothesis that the player chooses hot numbers at a rate less than or equal to random. While nearly two thirds (63.8%) of players in Haiti never make a hot hand bet (not shown), 6.3% choose hot numbers with a frequency that is significantly greater than the random rate (95% confidence). The rate of hot hand play is roughly constant across the year, although there are occasional spikes. In the round after the winning numbers were 5-50-55, nearly 40% of bets included a hot number. The next five rounds with the highest shares of hot hand betting occur after a win by 0 or 10, two of the most popular number choices.

[Figure 3 plots omitted: player-level histograms of p-values for the null hypothesis that the hot hand play rate is less than or equal to random (x-axis: p-value; y-axis: density), shown for players with hot hand play rate > random. First panel: among this group, 25.5% significantly > random (95% confidence); among all players, 6.3%. Second panel: among this group, 32.1% significantly > random (95% confidence); among all players, 15.7%.]
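The player-level test described in Section 3.3 can be illustrated with a small Monte Carlo sketch. The paper's actual implementation is in its Appendix D and is not reproduced here; the version below is only an illustration under stated assumptions: "random" play is taken to mean uniform, independent picks from 100 numbers, the number of picks per round and the set of hot numbers are held fixed, and the function name `hot_hand_p_value` and both example players are hypothetical.

```python
import random

def hot_hand_p_value(rounds, pool_size=100, n_sim=2000, seed=0):
    """One-sided Monte Carlo p-value for H0: hot-number rate <= random rate.

    rounds: list of (picks, hot) pairs for one player, where picks is the
    list of numbers the player chose in a round and hot is the set of
    numbers that won in the previous round.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = sum(sum(1 for n in picks if n in hot) for picks, hot in rounds)
    at_least_as_extreme = 0
    for _ in range(n_sim):
        # Redraw every pick uniformly from the pool (an assumption), keeping
        # the number of picks per round and the hot sets fixed.
        simulated = sum(
            sum(1 for _pick in picks if rng.randrange(pool_size) in hot)
            for picks, hot in rounds
        )
        if simulated >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (1 + at_least_as_extreme) / (1 + n_sim)

# Hypothetical players, 20 rounds each, with previous winners 5, 50, 55.
hot = {5, 50, 55}
chaser = [([5, 50, 55], hot)] * 20   # always bets the previous winners
avoider = [([1, 2, 3], hot)] * 20    # never bets a previous winner
p_chaser = hot_hand_p_value(chaser)
p_avoider = hot_hand_p_value(avoider)
print(p_chaser < 0.05, p_avoider)
```

In this sketch a player is counted as a hot-hand type when the p-value falls below 0.05, mirroring the 95% confidence threshold used in the text. A player who never bets a hot number can never reject the null, which is consistent with the one-sided form of the hypothesis.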
choices, if their beliefs about other players’ choices affect their own selections. Second, the online system in Denmark chooses seven of each player’s selected numbers, at random, to be their actual bet. Although this intermediate selection is unaffected by recent wins, we do not know what players believe about the subselection process. This step is like an individual mini-lottery within each round, which may itself be subject to biased beliefs. Third, System Lotto has less coverage than Boloto. The time period for the Haiti data is nearly twice as long, and involves 730 rounds of play compared to 28 in Denmark. There are almost five times as many unique players in Haiti as in Denmark (the population of Haiti was roughly twice that of Denmark during the periods that generated the data). Finally, the probability of a number winning is much higher in System Lotto than Boloto. To facilitate comparison with SGT we will treat the previous six rounds as the period of recent history for assessing GF, HHF, and streak switching in Denmark (see below). With seven unique numbers winning each week, in expectation there will be 26.16 (out of 36) numbers that are winners over any six consecutive rounds in Denmark. With so many “streaking” numbers, it is difficult for players to react similarly to all of them (and we see below that they in fact do not).

Despite these differences, an important aspect of our study is that we are able to implement similar tests in individually identifiable lottery data from two countries that differ in many respects. In 2019, Denmark ranked 11th in terms of the Human Development Index, while Haiti ranked 169th. Any commonalities that emerge between these two settings would seem to be suggestive of deeply engrained patterns in human cognition and statistical perception, rather than purely environmental factors.
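The figure of 26.16 expected winners over six rounds can be checked directly. A minimal sketch, assuming rounds are independent and that each of the 36 numbers has the same 7/36 chance of being among the seven drawn in a round; the function name is hypothetical.

```python
from fractions import Fraction

def expected_distinct_winners(pool, drawn, rounds):
    """Expected count of numbers drawn at least once over independent rounds."""
    # A given number is missed in one round with probability (pool - drawn) / pool.
    p_missed_every_round = Fraction(pool - drawn, pool) ** rounds
    # Linearity of expectation: sum the per-number hit probabilities.
    return pool * (1 - p_missed_every_round)

expected = expected_distinct_winners(pool=36, drawn=7, rounds=6)
print(round(float(expected), 2))  # 26.16
```

Each number escapes all six draws with probability (29/36)^6, or about 0.273, so roughly 73% of the 36 numbers count as recent winners at any time.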
3.5 Construction of Dependent Variables
Our analysis examines the relationship between a number’s winning history and the probability that it is bet. To construct an outcome variable for the Haitian Boloto, we first represent each number selection as 100 separate choices: 1 decision to play a number, and 99 decisions not to play all others. This allows us to take advantage of both player and number fixed effects. Let $d_{ijnrp}$ be a dummy variable equal to 1 if player $i$ in bet $j$ plays number $n$ in round $r$ in position $p$, and 0 otherwise. The position $p$ refers to the first, second, or third number in the bet. The numbers $n$ lie in the set {00, 01, 02, . . . , 99}. The round, $r$, includes both the date and the time (midday or evening) of the game. The bet indicator, $j$, captures the possibility that a player places multiple bets per round. We use $J_{ir}$ to denote the number of bets placed by player $i$ in round $r$. In our analysis the dependent variable is

$$\mathit{Played}_{inr} = \sum_{j=1}^{J_{ir}} \sum_{p=1}^{3} d_{ijnrp},$$

a count of the number of times in round $r$ that $i$ played $n$ in any position and any bet.[10] This is approximately proportional to the amount wagered on the number. At the player-number-round level, the full dataset contains a little over 143 million observations. The mean value of $\mathit{Played}_{inr}$ in the Haiti data is 0.094.

To construct an outcome variable for the Danish System Lotto, we perform a similar transformation, so that each selection of $m$ numbers is characterized as $m$ separate decisions to play a number and $36-m$ decisions not to play the others. We then define $\mathit{Played}_{inr}$ as the effective bet placed on number $n$ by player $i$ in round $r$. This value is calculated by multiplying the total amount wagered by player $i$ in round $r$ with the share of the number $n$ in $i$’s round $r$ number selections. This is equivalent to the dependent variable “Money bet” in SGT, and is roughly comparable to the definition of the dependent variable for Haiti. The mean value of $\mathit{Played}_{inr}$ in the Denmark data is 2.76.

4 Empirical Approach
We use a common set of empirical specifications to separately analyze the Haiti and Denmark data.
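Before turning to the specifications, the two dependent variables defined in Section 3.5 can be made concrete with a short sketch for a single player in a single round. The function names and the example bets are hypothetical. For Denmark, the "share of the number in the player's selections" is computed by pooling selections across tickets, which is an assumption about a detail the text does not spell out.

```python
from collections import Counter

def played_haiti(bets):
    """Played_inr for Haiti: how often each number 00-99 appears across one
    player's bets in one round, in any bet and any position."""
    counts = Counter(n for bet in bets for n in bet)
    return [counts[n] for n in range(100)]  # 100 rows per player-round

def played_denmark(total_wager, tickets):
    """Played_inr for Denmark: total wager times each number's share of all
    selections the player made in the round (pooling is an assumption)."""
    counts = Counter(n for ticket in tickets for n in ticket)
    total_picks = sum(counts.values())
    return {n: total_wager * counts[n] / total_picks for n in range(1, 37)}

# Two hypothetical Boloto bets of three ordered numbers each.
haiti = played_haiti([(10, 0, 33), (10, 10, 7)])
print(haiti[10], haiti[0], sum(haiti))

# One hypothetical System Lotto ticket with 8 selected numbers and 24 DKK wagered.
denmark = played_denmark(24.0, [[1, 7, 13, 19, 20, 21, 22, 23]])
print(denmark[7], denmark[2])
```

The Haiti variable is a count, while the Denmark variable is an amount in DKK that sums to the player's total wager, which is why the two means (0.094 and 2.76) are not directly comparable.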
To provide a baseline characterization of how the amount bet on a number is related to its recent success, we first estimate OLS regressions of the following form:

$$\mathit{Played}_{inr} = \sum_{l=1}^{R} \beta_l \, \mathit{Winner}_{n,r-l} + \eta X_{inr} + \epsilon_{inr} \qquad (1)$$

where $\mathit{Played}_{inr}$ is as defined in the previous section; $\mathit{Winner}_{n,r-l}$ is a binary variable indicating whether $n$ was one of the drawn numbers in round $r-l$; $X_{inr}$ includes fixed effects for players, numbers, and rounds; and $\epsilon_{inr}$ is a statistical error term. With a sufficiently large choice of $R$, a plot of the $\beta_l$ coefficients will non-parametrically trace out the time path of effects of past wins on current betting. Evidence of $\beta_l < 0$ ($\beta_l > 0$) is consistent with a GF effect (HHF effect) that persists for $l$ rounds. In practice, we set $R$ to be large enough that at the longer lags there is no evidence of an effect on betting.[11]

[10] Our findings are broadly similar if we define the dependent variable as $\mathit{PlayedDummy}_{inr} = \max_{j=1}^{J_{ir}} \{ \max_{p=1}^{3} \{ d_{ijnrp} \} \}$, which measures the extensive margin choice to play a number at the player-number-round level. See Appendix B.

Specification (1) does not account for the probability that a number drawn in round $r-l$ will be drawn again prior to round $r$, which is increasing in $l$. The effect of winning, especially winning in the distant past, may be underestimated if a number wins multiple times. To account for this, we also estimate OLS regressions based on the following specification:

$$\mathit{Played}_{inr} = \sum_{l=1}^{R} \beta_l \, \mathit{MostRecentWin}_{n,r-l} + \eta X_{inr} + \epsilon_{inr} \qquad (2)$$

which is identical to (1), except the key independent variable $\mathit{MostRecentWin}_{n,r-l}$ takes a value of 1 only if $r-l$ is the most recent round in which $n$ was drawn, and 0 otherwise. Once again, a finding of $\beta_l < 0$ ($\beta_l > 0$) is consistent with the GF (HHF).

Estimation of specifications (1) and (2) provides the average effect of lagged wins on current betting. To test predictions about players’ reactions to streaks, we need to allow for more complex interactions between past events. Lengthy winning streaks are rare in both games, but particularly so in Boloto where the probability that a specific number wins in any given round is only 0.0297. Following SGT, we define a streak as the co-occurrence of a win in the previous round with a history of winning in other recent rounds. Formally, let $\mathit{Hotness}_{nr}$ be the number of times that $n$ was a winner during rounds $r-2$ to $r-S$, for some integer $S \geq 2$; and let $\mathit{Hot}_{nrc}$ be a dummy variable equal to 1 if $\mathit{Hotness}_{nr} = c$, and 0 otherwise. For each $r$, the winning streak of number $n$ is given by $\mathit{Winner}_{n,r-1} \times \{\mathit{Winner}_{n,r-1} + \mathit{Hotness}_{nr}\}$ (e.g., the streak has length 3 if $n$ was drawn in the previous round and was drawn twice in rounds $r-2$ to $r-S$).
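The regressors defined above can be built from a record of past draws. The sketch below is illustrative only: the function names and the toy `history` are hypothetical, rounds before the start of the record are treated as having no winners, and the last line assumes that a Boloto round consists of three independent, uniform draws from 00-99, an assumption that reproduces the 0.0297 figure in the text.

```python
def winner(history, n, r, lag):
    """Winner_{n,r-lag}: 1 if n was drawn in round r - lag, else 0."""
    k = r - lag
    return int(0 <= k < len(history) and n in history[k])

def most_recent_win(history, n, r, lag):
    """MostRecentWin_{n,r-lag}: 1 only if round r - lag is the latest round
    before r in which n was drawn."""
    if not winner(history, n, r, lag):
        return 0
    return int(not any(winner(history, n, r, m) for m in range(1, lag)))

def hotness(history, n, r, S):
    """Hotness_{nr}: number of wins by n in rounds r-2 through r-S."""
    return sum(winner(history, n, r, lag) for lag in range(2, S + 1))

def streak(history, n, r, S):
    """Winner_{n,r-1} x (Winner_{n,r-1} + Hotness_{nr})."""
    w = winner(history, n, r, 1)
    return w * (w + hotness(history, n, r, S))

# Hypothetical record: history[k] is the set of numbers drawn in round k.
history = [{7, 10, 55}, {3, 10, 41}, {8, 19, 60}, {2, 10, 33}]
r = 4  # the round being bet on; rounds 0..3 are its past

print(streak(history, 10, r, S=6))        # 3: won last round plus twice before
print(winner(history, 10, r, 3))          # 1: drawn three rounds ago
print(most_recent_win(history, 10, r, 3)) # 0: it has won again since then

# Chance that a given number is among three independent uniform draws.
p_win = 1 - (1 - 1 / 100) ** 3
print(round(p_win, 4))                    # 0.0297
```

The contrast between the second and third printed values is the difference between specifications (1) and (2): number 10 counts as a winner at lag 3 in (1), but only its lag-1 win is switched on in (2).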
To semi-parametrically estimate the average response to streaks of different length, we estimate OLS regressions of the following form:

$$\mathit{Played}_{inr} = \beta \, \mathit{Winner}_{n,r-1} + \sum_{c=1}^{C} \{ \delta_c \, \mathit{Hot}_{nrc} + \gamma_c (\mathit{Winner}_{n,r-1} \times \mathit{Hot}_{nrc}) \} + \eta X_{inr} + \epsilon_{inr} \qquad (3)$$

where all variables are as defined above, $C$ is sufficiently large to include all observed streaks, and $\epsilon_{inr}$ is a statistical error term. For Denmark we follow SGT and set $S = 6$, which covers the previous six rounds / weeks. For the Boloto in Haiti, which is played more often and hence provides more flexibility in how we define recent events, we report results for $S \in \{6, 14, 60\}$, equivalent to defining streaks over the previous 3, 7, and 30 days.

[11] Because the Haitian Boloto is played twice per day and the Danish System Lotto is played once per week, $R$ is much larger for Haiti than for Denmark (i.e., it covers more rounds), but it covers a shorter time period.

There is an important tension in the choice of lag used to define streaks. The estimated effect of past wins on betting will be attenuated if we use a recall period that extends back beyond when wins are salient (which likely varies across players), because players will be reacting to streaks that they perceive to be shorter than those defined by us. The baseline findings from equations (1) and (2) will provide some indication of the persistence of any influence of past results on number selection.

In equation (3), the player, number, and round fixed effects account for average differences between players, average popularity of numbers, and temporal patterns in betting. The total effect on current betting of a streak of length 1 is given by $\beta$. The total effect of a streak of length $d > 1$ is $\nu_c = \beta + \delta_c + \gamma_c$, where $c = d - 1$. If players are not influenced by either the GF or the HHF, we expect $\beta = \delta_c = \gamma_c = 0$ for all $c$. The predictions of Rabin and Vayanos (2010) are equivalent to (i) $\beta < 0$, (ii) $\nu_c < 0$ for all $c$, and (iii) $\nu_d < \nu_c$ for any $d > c$ (because the model predicts that longer streaks induce a larger GF effect). Alternatively, if players exhibit streak switching (i.e., if the average player does not believe that the lottery DGP is i.i.d.), then we expect $\beta < 0$ and $\nu_c > 0$ for all $c$ of sufficient length.
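The mapping from the coefficients of equation (3) to the competing predictions can be sketched as follows. The coefficient values are hypothetical and are not estimates from the paper, the function names are illustrative, and "of sufficient length" is approximated by checking only the longest observed streak.

```python
def marginal_effects(beta, delta, gamma):
    """Return {c: nu_c} with nu_c = beta + delta_c + gamma_c."""
    return {c: beta + delta[c] + gamma[c] for c in sorted(delta)}

def classify(beta, nu):
    """Compare beta and the nu_c with predictions (i)-(iii) in the text."""
    ordered = [nu[c] for c in sorted(nu)]
    if beta < 0 and all(v < 0 for v in ordered):
        # Prediction (iii): nu_d < nu_c whenever d > c.
        deepening = all(b < a for a, b in zip(ordered, ordered[1:]))
        return "GF, deepening with streak length" if deepening else "GF, not monotone"
    if beta < 0 and ordered and ordered[-1] > 0:
        # Simplification: only the longest streak is checked for a sign flip.
        return "streak switching"
    return "other"

# Hypothetical coefficients for streaks of length 1 to 3 (c = 1, 2).
gf_nu = marginal_effects(-0.010, {1: -0.004, 2: -0.006}, {1: -0.006, 2: -0.014})
switch_nu = marginal_effects(-0.010, {1: 0.000, 2: 0.010}, {1: 0.000, 2: 0.020})

print(classify(-0.010, gf_nu))      # GF, deepening with streak length
print(classify(-0.010, switch_nu))  # streak switching
```

A formal assessment would rest on the standard errors of the $\nu_c$, which this sketch ignores; it only shows which sign patterns correspond to which hypothesis.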
Estimates based on equation (3) provide average effects for streaks of a given length. To allow for complete flexibility in the estimated response to any combination of past wins, we also estimate a fully non-parametric model for the previous six rounds ($S = 6$). In this model, the dependent variable is $\mathit{Played}_{inr}$, and the independent variables are dummy variables for all observed combinations of wins during the previous 6 rounds. As always, we include player, number, and round fixed effects.

For all models we report standard errors clustered at the player level. We do not impose balance on the panel, taking as given the extensive margin decision to participate in the lottery in any particular round (however, in a robustness check described below, we partially account for game entry and exit).

5 Results
For our baseline estimates of equations (1) and (2) in Haiti, we use a set of 84 dummy variables covering every round in the previous 6 weeks ($R = 84$). For Denmark, we use a set of 10 dummy variables covering the previous 10 weeks ($R = 10$). Panel A of Figure 4 plots the coefficients on the dummy variables representing a win each lagged period, along with 95% confidence intervals, for equation (1) estimated with the Haiti data. Panel B shows the equivalent plot for equation (2). Panels C and D provide the analogous plots for Denmark.

In Haiti, the average effect of a number being drawn in lagged rounds 2-84 indicates a surprisingly persistent GF response. Winning never leads to an increase in betting, on average. Players avoid numbers that have won recently, but the effect is attenuated as the win fades into the past. The deterrent effect of a recent win is statistically significant for approximately 60 rounds (30 days).[12] A win in lagged round 2, 3, or 4 reduces the number of times a number is selected by 0.032–0.035, a reduction of over a third from the mean of 0.094. The effects are even larger in magnitude when we restrict attention to only the most recent win (Panel B).

The exception to the pattern of diminishing effect size in Haiti is from a win in the immediately preceding round. The point estimate for a win in the preceding round is approximately −0.01 in Panel A, less than a third of the magnitude of the effect of a win in lagged rounds 2–4. This attenuation may be driven by the 6.3% of ‘hot hand types’ in Haiti (Figure 3, Panel A).
12 How do players keep track of numbers’ winning histories so far back? Some physical lottery stalls post the recent winning numbers. It is also a popular service to receive the winning numbers by text message, which provides a convenient archive of recent wins.

[Figure 4: four panels, each plotting the estimated coefficient with 95% C.I. (vertical axis) against the lag (horizontal axis). A. Haiti, effect of any past win. B. Haiti, effect of most recent win. C. Denmark, effect of any past win. D. Denmark, effect of most recent win.]

Figure 4: Effect of prior wins on amount bet on a number

Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti and Denmark. Each panel represents one regression. Figures show OLS coefficients with 95% confidence intervals from regressions of the amount bet at the player-number-round level on a set of binary variables that describe the winning history of the number over the lagged rounds displayed on the horizontal axes. All regressions include player, number, and round fixed effects, with standard errors clustered at the player level. The Boloto game in Haiti is played twice per day (60 rounds = 30 days). The System Lotto game in Denmark is played once per week (4 rounds = 28 days). The relative magnitudes of estimated coefficients are much larger in Haiti, where the mean value of the dependent variable is 0.094, than they are in Denmark, where the mean of the dependent variable is 2.762.

For Denmark, we also see evidence in the baseline analysis of a GF effect after recent wins (panels C and D of Figure 4). The average effect of a win in the previous 1-2 rounds
is to reduce the amount bet by 0.013-0.022 (DKK). These are much smaller effects than we find in Haiti, representing less than one percent of the mean bet of 2.76 DKK. The deterrent effect of a win disappears after two rounds. On average, System Lotto players do not react systematically to wins that occurred three or more weeks previously.

Table 1 shows estimates of equation (3) for both countries. Columns 1, 2, and 3 report the findings for streaks defined over the previous 3 days, 7 days, and 30 days, using the Haiti data. Panel A reports coefficient estimates, and Panel B reports the estimated marginal effects (ν_c). Across all streak lengths and specifications, there are no positive marginal effects of a winning streak on the probability that a number is bet. All 13 of the estimated effects in Panel B, columns 1–3 are negative and statistically different from zero.

In column 1, the negative effect of a win streak on the probability that a number is selected is increasing in streak length. Betting on a number falls by 0.011 percentage points (11.6% of the mean), 0.021 percentage points (22.1%), and 0.031 percentage points (32.6%) after streaks of length 1, 2, and 3, respectively. The finding that the deterrent effect increases in streak length is consistent with Rabin and Vayanos (2010) when the DGP is i.i.d. (and participants are aware of that).

When we define streaks over periods of 7 or 30 days, the GF again dominates (columns 2 and 3), but the marginal effects do not increase monotonically in streak length. This could be due to using an overly long window for streak definition, as discussed in Section 4. As we saw in Figure 4, the magnitude of the GF effect is smaller for more distant wins, indicating a weakening over time of the salience of past wins.

Column 4 of Table 1 reports estimates from specification (3) using the Danish data. As in Haiti, all of the estimated marginal effects are negative.
Only the effects of streaks of length 1 and 2 are statistically different from zero, and the magnitude of the deterrent effect does not increase in streak length. Effect sizes are again much smaller in Denmark than in Haiti. In Haiti the magnitudes represent reductions in betting that are 13-38% of the mean, while in Denmark the statistically significant effects represent reductions on the order of one percent of the mean (specifically, 0.8-1.2%).

Table 2 shows estimates from the fully non-parametric model, in which the independent variables are dummy variables for all observed combinations of wins during the previous 6 rounds. Columns 1-3 are for Haiti, and Columns 4-6 are for Denmark. There is no pattern
Table 1: The Effects of Winning Streaks on Betting

Dependent variable: Amount of money bet by player i on number n in round r

Columns 1-3 use the Haiti data and differ in the lag used to define streaks. Columns 4-5 use the Denmark data and differ in specification.

| | Haiti, 3 days (1) | Haiti, 7 days (2) | Haiti, 30 days (3) | Denmark, Ours (4) | Denmark, Modified SGT (5) |
|---|---|---|---|---|---|
| Panel A: Estimated Coefficients | | | | | |
| Winner | -0.012*** (0.001) | -0.013*** (0.001) | -0.016*** (0.001) | -0.033*** (0.010) | -0.061*** (0.010) |
| Hot 1 | -0.031*** (0.001) | -0.026*** (0.001) | -0.022*** (0.001) | -0.009 (0.006) | -0.007* (0.004) |
| Hot 2 | -0.037*** (0.001) | -0.034*** (0.001) | -0.033*** (0.001) | -0.005 (0.007) | 0.003 (0.005) |
| Hot 3 | -0.044*** (0.003) | -0.039*** (0.001) | -0.040*** (0.001) | -0.006 (0.010) | -0.015** (0.006) |
| Hot 4 | | -0.035*** (0.005) | -0.038*** (0.001) | -0.022 (0.019) | 0.010 (0.019) |
| Hot 5 | | | -0.037*** (0.003) | | |
| Winner × Hot 1 | 0.020*** (0.001) | 0.011*** (0.001) | 0.010*** (0.001) | 0.020** (0.008) | 0.051*** (0.009) |
| Winner × Hot 2 | 0.012*** (0.003) | 0.021*** (0.002) | 0.014*** (0.001) | 0.020** (0.009) | 0.053*** (0.010) |
| Winner × Hot 3 | | 0.036*** (0.004) | 0.025*** (0.002) | 0.032** (0.016) | 0.019 (0.019) |
| Winner × Hot 4 | | | 0.023*** (0.003) | 0.006 (0.038) | 0.106** (0.045) |
| Winner × Hot 5 | | | 0.024*** (0.006) | | |
| Observations | 1.42e+08 | 1.42e+08 | 1.42e+08 | 1.04e+07 | 1.01e+07 |
| R2 | 0.053 | 0.053 | 0.053 | 0.347 | 0.485 |
| Mean of dep. variable | 0.095 | 0.095 | 0.095 | 2.762 | 2.756 |
| Panel B: Marginal Effects | | | | | |
| Streak length 1 | -0.012*** (0.001) | -0.013*** (0.001) | -0.016*** (0.001) | -0.033*** (0.010) | -0.061*** (0.010) |
| Streak length 2 | -0.023*** (0.001) | -0.028*** (0.001) | -0.028*** (0.001) | -0.021** (0.010) | -0.018** (0.008) |
| Streak length 3 | -0.036*** (0.004) | -0.027*** (0.002) | -0.034*** (0.002) | -0.017 (0.012) | -0.005 (0.009) |
| Streak length 4 | | -0.016*** (0.004) | -0.030*** (0.002) | -0.007 (0.017) | -0.058*** (0.015) |
| Streak length 5 | | | -0.030*** (0.004) | -0.049 (0.039) | 0.055 (0.038) |
| Streak length 6 | | | -0.030*** (0.005) | | |

Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti and Denmark. Winner is an indicator variable for the number being drawn in the previous round. Hot X is an indicator for the number being drawn X times in rounds r − 2 to r − S. S is noted in the header for columns 1-3, and S = 6 for columns 4-5. Regressions in columns 1-4 include player, number, and round fixed effects. The regression in column 5, which matches a main specification in Suetens, Galbo-Jørgensen and Tyran (2016), includes number fixed effects, a lagged dependent variable, and an indicator for jackpot roll-over weeks. Relative to SGT, the only modification in column 5 is that hotness enters non-parametrically here, and linearly in SGT. Standard errors in all regressions are clustered at the player level.
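The link between the two panels of Table 1 can be checked with a few lines of arithmetic. The sketch below assumes, since equation (3) is not reproduced in this section, that the marginal effect of a streak of length c is the Winner coefficient plus the Hot (c − 1) coefficient plus the matching Winner × Hot (c − 1) interaction. The inputs are the column 1 values (Haiti, 3-day window); the variable names are illustrative only.

```python
# Panel A coefficients, Table 1, column 1 (Haiti, 3-day window).
winner = -0.012
hot = {1: -0.031, 2: -0.037}
winner_x_hot = {1: 0.020, 2: 0.012}

# Panel B marginal effects reported for the same column.
reported = {1: -0.012, 2: -0.023, 3: -0.036}


def marginal_effect(streak_length):
    """Assumed rule: Winner + Hot(c-1) + Winner x Hot(c-1); Winner alone for c = 1."""
    if streak_length == 1:
        return winner
    prior_wins = streak_length - 1
    return winner + hot[prior_wins] + winner_x_hot[prior_wins]


for c in sorted(reported):
    print(c, round(marginal_effect(c), 3), reported[c])
```

The recomputed values agree with Panel B to within 0.001, which is the gap one would expect from summing coefficients that were already rounded to three decimals.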
of recent wins that increases average betting on a number in Haiti. Out of 37 coefficients in column 1, 35 are statistically different from zero, and all of those are negative. Longer streaks are almost universally associated with the largest reductions in betting in Haiti, confirming that the pattern in column 1, Panel B, Table 1 is not an artifact of an outlier or single event.

Columns 4-6 of Table 2 show the non-parametric estimates for Denmark. There are six statistically significant effects for combinations that include a win in the previous round (lag 1). All are negative, except for the weakly positive effect of winning in periods {1, 3, 5, 6}. This appears to be spurious, as the point estimates for wins in lags {1, 2, 3, 5}, {1, 2, 4, 5}, and {1, 2, 3, 4, 6} are all negative (if imprecise), and we know from column 4 of Table 1 that the average effects of streaks of any length are never positive and statistically significant.

The central takeaway from our main results in both Haiti and Denmark is that players tend to bet in accordance with the GF. There is no evidence of HHF betting or streak switching on average. We find that the GF effect is stronger for longer streaks, but only when using the shortest recall period in Haiti. These results are robust to the inclusion of a lagged dependent variable to account for entry and exit related to wins by preferred numbers (Appendix A), and to defining the dependent variable as the extensive margin decision to play a number (Appendix B).

5.1 Reconciling our Results with SGT

SGT use the System Lotto data from Denmark to test predictions very similar to those tested here. They claim to find evidence of streak switching.
Specifically, they conclude that “players tend to bet less on numbers that have been drawn in the preceding week, as suggested by the ‘gambler’s fallacy’, and bet more on a number if it was frequently drawn in the recent past, consistent with the ‘hot-hand fallacy’” (Suetens, Galbo-Jørgensen and Tyran, 2016). Why do our analyses arrive at different conclusions about streak switching from the same data set?

There are a few differences in the empirical specifications used by us and SGT, all of which likely matter to some degree. But the fundamental difference arises from SGT’s decision to impose linearity in the possible effect of Hotness on the amount bet on a number. In
specification (3), the hotness of a number enters our regression equation non-parametrically. SGT force that relationship to be linear. Because the large majority of streaks are short (Table 2, column 6), the slope coefficient on Hotness in SGT is determined primarily by changes that occur between streaks of length 0, 1, and 2. In Table 1, column 4, Panel B, the coefficient on a streak of length 1 is -0.033, and that on a streak of length 2 is -0.021. Extrapolating this line linearly, one would eventually find that long streaks lead to hot hand betting, which looks like streak switching. But as our non-parametric analysis reveals, such extrapolation would be spurious. Gamblers in Denmark do not react systematically to winning streaks longer than 2.

To further verify this interpretation, we re-estimate SGT’s main specification, with the only difference that we relax their linearity assumption.13 To match their model we omit player and round fixed effects from equation (3), include a lagged dependent variable, and include a control for rollover jackpots. Column 5 of Table 1 shows the results. The implied “slope” of the line between streaks of length 1 and 2 is steeper than in our preferred specification (compare columns 4 and 5, panel B). It is easy to see how linear extrapolation of these coefficients would generate the appearance of streak switching. Yet, any such appearance would be spurious. There are three statistically significant marginal effects in column 5, and all are negative. The coefficient on streaks of length 5 is positive, but not statistically different from zero (and is identified by only a single instance in which a number achieves a streak of 5). Relaxing linearity in the SGT model removes any indication of streak switching.14

13 Here we focus on SGT’s analysis of “active players.” In other models they impose balance on the panel by assigning zeroes for all numbers during non-played rounds. We do not estimate such models for Haiti or for Denmark.

14 See Appendix C for a discussion of other differences between our approach and the SGT analysis.
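The extrapolation argument in Section 5.1 can be made concrete with a two-point line. The sketch below is only an illustration: it draws a straight line through the Denmark marginal effects for streaks of length 1 and 2 (Table 1, column 4, Panel B) and asks where that line crosses zero. It is not SGT's regression, which estimates a slope on Hotness from all observations, and the function name is hypothetical.

```python
# Marginal effects in DKK from Table 1, column 4, Panel B (Denmark).
effect_len1 = -0.033
effect_len2 = -0.021

# Change in the estimated effect per additional win in the streak.
slope = effect_len2 - effect_len1


def extrapolated_effect(streak_length):
    """Straight line through the length-1 and length-2 estimates."""
    return effect_len1 + slope * (streak_length - 1)


# Streak length at which the extrapolated line turns positive.
zero_crossing = 1 + (-effect_len1) / slope

# Non-parametric estimates for longer streaks from the same column.
estimated = {3: -0.017, 4: -0.007, 5: -0.049}

print("zero crossing at streak length", round(zero_crossing, 2))
for length, value in estimated.items():
    print(length, round(extrapolated_effect(length), 3), value)
```

The line implies positive effects for streaks of length 4 and above, while the non-parametric point estimates for those lengths in column 4 remain negative and statistically indistinguishable from zero.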
Table 2: The Effects of Winning Streaks on Betting: Non-Parametric

Dependent variable: Amount of money bet by player i on number n in round r

| | Haiti: Coefficient (1) | Haiti: Standard error (2) | Haiti: Number of occurrences (3) | Denmark: Coefficient (4) | Denmark: Standard error (5) | Denmark: Number of occurrences (6) |
|---|---|---|---|---|---|---|
| Win in lag 1 | -0.012*** | 0.001 | 1867 | -0.032*** | 0.010 | 61 |
| Win in lag 2 | -0.037*** | 0.001 | 1870 | -0.030*** | 0.009 | 65 |
| Win in lag 3 | -0.034*** | 0.001 | 1870 | -0.009 | 0.007 | 60 |
| Win in lag 4 | -0.032*** | 0.001 | 1867 | -0.003 | 0.006 | 63 |
| Win in lag 5 | -0.027*** | 0.001 | 1864 | -0.002 | 0.006 | 65 |
| Win in lag 6 | -0.026*** | 0.001 | 1852 | 0.002 | 0.006 | 60 |
| Win in lag 1, 2 | -0.016*** | 0.002 | 52 | -0.012 | 0.013 | 16 |
| Win in lag 1, 3 | -0.023*** | 0.002 | 56 | -0.030** | 0.012 | 21 |
| Win in lag 1, 4 | -0.027*** | 0.003 | 54 | -0.033** | 0.013 | 17 |
| Win in lag 1, 5 | -0.020*** | 0.002 | 58 | -0.024* | 0.014 | 15 |
| Win in lag 1, 6 | -0.030*** | 0.002 | 58 | -0.008 | 0.013 | 21 |
| Win in lag 2, 3 | -0.036*** | 0.002 | 51 | -0.014 | 0.012 | 15 |
| Win in lag 2, 4 | -0.036*** | 0.002 | 55 | -0.004 | 0.013 | 18 |
| Win in lag 2, 5 | -0.042*** | 0.002 | 52 | -0.026** | 0.012 | 19 |
| Win in lag 2, 6 | -0.037*** | 0.002 | 59 | -0.007 | 0.012 | 16 |
| Win in lag 3, 4 | -0.040*** | 0.001 | 50 | -0.013 | 0.012 | 15 |
| Win in lag 3, 5 | -0.036*** | 0.002 | 55 | 0.012 | 0.011 | 20 |
| Win in lag 3, 6 | -0.039*** | 0.002 | 54 | -0.010 | 0.010 | 23 |
| Win in lag 4, 5 | -0.032*** | 0.001 | 49 | 0.001 | 0.011 | 14 |
| Win in lag 4, 6 | -0.035*** | 0.002 | 58 | 0.022** | 0.010 | 17 |
| Win in lag 5, 6 | -0.029*** | 0.001 | 49 | -0.002 | 0.011 | 15 |
| Win in lag 1, 2, 3 | -0.012 | 0.009 | 1 | -0.018 | 0.020 | 4 |
| Win in lag 1, 2, 4 | 0.005 | 0.014 | 2 | -0.019 | 0.017 | 6 |
| Win in lag 1, 2, 5 | -0.047*** | 0.011 | 2 | -0.013 | 0.020 | 3 |
| Win in lag 1, 2, 6 | | | | -0.016 | 0.034 | 1 |
| Win in lag 1, 3, 4 | -0.050*** | 0.005 | 3 | -0.021 | 0.023 | 3 |
| Win in lag 1, 3, 5 | | | | -0.021 | 0.019 | 4 |
| Win in lag 1, 3, 6 | -0.047*** | 0.009 | 1 | 0.013 | 0.033 | 2 |
| Win in lag 1, 4, 5 | -0.039*** | 0.007 | 3 | -0.003 | 0.019 | 6 |
| Win in lag 1, 4, 6 | | | | -0.022 | 0.018 | 5 |
| Win in lag 1, 5, 6 | -0.085*** | 0.009 | 1 | -0.049** | 0.023 | 3 |
| Win in lag 2, 3, 4 | -0.045*** | 0.005 | 1 | -0.041* | 0.022 | 3 |
| Win in lag 2, 3, 5 | -0.045*** | 0.013 | 2 | 0.004 | 0.023 | 3 |
| Win in lag 2, 3, 6 | -0.056*** | 0.012 | 2 | -0.015 | 0.020 | 3 |
| Win in lag 2, 4, 5 | -0.056*** | 0.005 | 3 | -0.028 | 0.026 | 2 |
| Win in lag 2, 4, 6 | | | | 0.011 | 0.020 | 5 |
| Win in lag 2, 5, 6 | -0.059*** | 0.005 | 3 | -0.002 | 0.018 | 7 |
| Win in lag 3, 4, 5 | -0.028*** | 0.006 | 1 | 0.012 | 0.018 | 5 |
| Win in lag 3, 4, 6 | -0.043*** | 0.014 | 2 | -0.043** | 0.019 | 4 |
| Win in lag 3, 5, 6 | -0.053*** | 0.005 | 3 | 0.053 | 0.034 | 1 |
| Win in lag 4, 5, 6 | -0.041*** | 0.005 | 1 | 0.000 | 0.018 | 5 |
| Win in lag 1, 2, 3, 5 | | | | -0.028 | 0.023 | 3 |
| Win in lag 1, 2, 4, 5 | | | | -0.017 | 0.034 | 1 |
| Win in lag 1, 2, 4, 6 | | | | 0.003 | 0.030 | 1 |
| Win in lag 1, 3, 5, 6 | | | | 0.065* | 0.035 | 1 |
| Win in lag 1, 4, 5, 6 | | | | -0.013 | 0.038 | 1 |
| Win in lag 2, 3, 4, 5 | | | | 0.015 | 0.044 | 1 |
| Win in lag 2, 3, 4, 6 | | | | 0.003 | 0.025 | 2 |
| Win in lag 2, 3, 5, 6 | | | | -0.097*** | 0.035 | 1 |
| Win in lag 3, 4, 5, 6 | | | | -0.015 | 0.037 | 1 |
| Win in lag 1, 2, 3, 4, 6 | | | | -0.049 | 0.040 | 1 |
| Observations | 142462600 | | | 10379916 | | |
| R2 | .053 | | | .347 | | |

Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti and Denmark. All regressions include player, number, and round fixed effects, with standard errors clustered at the player level. Column 3 reports the count of times over 73,000 observed number-rounds in Haiti that a number was a winner in the listed combination of lags. Column 6 reports the equivalent counts over 1,008 observed number-rounds in Denmark. Columns 1-3 are left blank for combinations of lagged wins that are never observed in Haiti.
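The row labels in Table 2 correspond to the set of lags, within the previous six rounds, at which a number was drawn. A minimal sketch of how such a label could be derived from a draw history follows. The data layout (a list of sets of winning numbers, indexed by round), the function names, and the example history are all assumptions made for illustration, not the authors' code.

```python
def lag_combination(draw_history, number, current_round, window=6):
    """Return the lags (1 = previous round) at which `number` was drawn."""
    lags = []
    for lag in range(1, window + 1):
        past_round = current_round - lag
        if past_round >= 0 and number in draw_history[past_round]:
            lags.append(lag)
    return tuple(lags)


def row_label(lags):
    """Format a lag combination in the style of the Table 2 row labels."""
    if not lags:
        return "No win in window"
    return "Win in lag " + ", ".join(str(lag) for lag in lags)


# Hypothetical history: winning numbers in rounds 0 through 6.
history = [{3, 17}, {8, 21}, {3, 40}, {5, 9}, {3, 12}, {7, 30}, {1, 2}]

# From the viewpoint of round 7, number 3 was drawn at lags 3 and 5.
print(row_label(lag_combination(history, 3, 7)))
```

Each distinct combination observed in the data would then receive its own dummy variable, which is why combinations that never occur in Haiti have no estimate in columns 1-3.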
6 Conclusion

This paper uses large natural experiments in different settings to test predictions of the Rabin and Vayanos (2010) model of streak switching. In our context, the DGP is i.i.d., but players may not believe that it is, leaving open the possibility of complex biases. We find broad support for the foundational conditions that make streak switching possible, but we reject the idea that the average player believes the lottery is not i.i.d. In both Haiti and Denmark, the average player bets in accordance with the GF by avoiding numbers that recently won, and never bets the hot hand.

We also find evidence of ideological attachment to the HHF by a small share of players. In Haiti (Denmark), 6.3% (15.7%) of players bet recent winners more frequently than would be predicted by chance. This identification of two distinct player types in high-frequency administrative data may provide new foundations for models in which players of finite types make (potentially biased) choices that have important influence on aggregate outcomes.

Perhaps the most striking aspect of these results is their qualitative similarity in Haiti and Denmark, despite the many differences between these countries. This suggests that ideological attachment to the GF, which is a precondition for streak switching, is a deeply rooted aspect of human cognition. While applications from finance in rich countries motivated the formulation of streak switching models such as Rabin and Vayanos (2010), we speculated in the introduction that streak switching may help us understand the evolution of beliefs about climate change, racial bias, and online dating. If baseline attachment to the GF is as universal as is implied by our findings, it seems all the more likely that streak switching could be present in these other choice domains, where agents may be even less likely to believe that the DGP is i.i.d. (and where it may indeed not be). Applying the streak switching framework to the study of choice and belief formation in other real-world contexts is a promising avenue for future research.
References

Benjamin, Daniel J. 2019. “Errors in probabilistic reasoning and judgment biases.” In Handbook of Behavioral Economics: Applications and Foundations 1. Vol. 2, 69–186. Elsevier.

Bernstein, Rachel L. 2015. “In Pursuit of the Transformational Sum: Lottery and Savings in Haiti.” University of California, Davis.

Bhatia, Pooja. 2010. “Dream Ticket.” The National, Friday, April 2: 3–5.

Camerer, Colin F. 1989. “Does the Basketball Market Believe in the ‘Hot Hand,’?” The American Economic Review, 79(5): 1257–1261.

Clotfelter, Charles T, and Philip J Cook. 1993. “Notes: The “gambler’s fallacy” in lottery play.” Management Science, 39(12): 1521–1525.

Deeg, K, E Lyon, A Leiserowitz, E Maibach, and J Marlon. 2019. “Who is changing their mind about global warming and why?” Yale University and George Mason University. New Haven, CT: Yale Program on Climate Change Communication.

Edwards, Ward. 1961. “Probability learning in 1000 trials.” Journal of Experimental Psychology, 62(4): 385.

Farrell, Lisa, Roger Hartley, Gauthier Lanot, and Ian Walker. 2000. “The demand for lotto: the role of conscious selection.” Journal of Business & Economic Statistics, 18(2): 228–241.

Judson, Ruth A, and Ann L Owen. 1999. “Estimating dynamic panel data models: a guide for macroeconomists.” Economics Letters, 65(1): 9–15.

Kahneman, Daniel, Paul Slovic, and Amos Tversky. 1982. Judgment under uncertainty: Heuristics and biases. Cambridge University Press.

Lien, Jaimie W, and Jia Yuan. 2015. “The cross-sectional “Gambler’s Fallacy”: Set representativeness in lottery number choices.” Journal of Economic Behavior & Organization, 109: 163–172.

Miller, Joshua B, and Adam Sanjurjo. 2018. “Surprised by the hot hand fallacy? A truth in the law of small numbers.” Econometrica, 86(6): 2019–2047.

Nickell, Stephen. 1981. “Biases in dynamic models with fixed effects.” Econometrica: Journal of the Econometric Society, 1417–1426.

Papachristou, George, and Dimitri Karamanis. 1998. “Investigating efficiency in betting markets: Evidence from the Greek 6/49 Lotto.” Journal of Banking & Finance, 22(12): 1597–1615.

Rabin, Matthew. 2002. “Inference by believers in the law of small numbers.” The Quarterly Journal of Economics, 117(3): 775–816.

Rabin, Matthew, and Dimitri Vayanos. 2010. “The gambler’s and hot-hand fallacies: Theory and applications.” The Review of Economic Studies, 77(2): 730–778.

Roger, Patrick, and Marie-Hélène Broihanne. 2007. “Efficiency of betting markets and rationality of players: evidence from the French 6/49 lotto.” Journal of Applied Statistics, 34(6): 645–662.

Schetzer, Alana. 2017. “Dating burnout: The fallout from serial online dating disappointment.” SBS Australia. 2 May 2017.

Scoggins, John F. 1995. “The lotto and expected net revenue.” National Tax Journal, 61–70.

Suetens, Sigrid, and Jean-Robert Tyran. 2012. “The gambler’s fallacy and gender.” Journal of Economic Behavior & Organization, 83(1): 118–124.

Suetens, Sigrid, Claus B Galbo-Jørgensen, and Jean-Robert Tyran. 2016. “Predicting lotto numbers: a natural experiment on the gambler’s fallacy and the hot-hand fallacy.” Journal of the European Economic Association, 14(3): 584–607.

Terrell, Dek. 1994. “A test of the gambler’s fallacy: Evidence from pari-mutuel games.” Journal of Risk and Uncertainty, 8(3): 309–317.

Tesler, Michael. 2020. “The Floyd protests will likely change public attitudes about race and policing. Here’s why.” Washington Post. 5 June 2020.

Wang, Tong V, Rogier Jan Dave Potter van Loon, Martijn J Van den Assem, and Dennie Van Dolder. 2016. “Number preferences in lotteries.” Judgment and Decision Making, 11(3): 243–259.
Online Appendix

“The Gambler’s Fallacy Prevails in Lottery Play”

Brian Dillon and Travis J. Lybbert

A Robustness: Including a lagged dependent variable

Many lottery players have favorite numbers, which they play often. It is conceivable that intermittent play combined with strong number preferences could be responsible for the smaller magnitude single-lag effects in Figure 4, relative to the effects of wins that occurred two or more rounds in the past. The idea is that some players may choose to play in a particular round specifically because one of their favored numbers was drawn in the previous round. While this behavior has the flavor of hot hand play, other interpretations are possible. A player that sees his favorite number win might simply be reminded of the game and the fun of gambling. Or, he may choose to interpret a win by his favorite number as a signal that it is time to play again, even if he does not believe that the odds of winning are mutable.

The player fixed effects in our main analysis do not fully account for these possible behaviors. Player fixed effects control for time invariant number preferences and for players that never change their bets, but they do not control for the extensive margin decision to play a favorite number in response to recent events.

To account for this potential mechanism, we re-estimate all of our main specifications with the inclusion of a lagged dependent variable as an additional independent variable. Specifically, when the dependent variable is Played_inr, we define the lagged dependent variable as the value of Played_inq, where q < r is the last round in which player i placed a bet (not necessarily the round immediately prior to r). In an unbalanced panel with a short T for many players, including this lagged dependent variable could bias all coefficients, though to our knowledge the exact form of the bias in this circumstance is not known (Nickell, 1981). We provide this analysis for robustness, but are cautious in our interpretations due to the potential for Nickell bias, and for this reason prefer the results in the main table which do not include a lagged dependent variable.

Figure S1, Table A, and Table A contain results analogous to those in Figure 4, Table 1, and Table 2 in the main paper, augmented with the inclusion of a lagged dependent variable. In all cases, the coefficient on the lagged dependent variable is highly statistically significant, and lies between 0 and 1 (not reported). Of greater interest to us is the stability of the estimated coefficients on variables representing the recent winning history of a number. Those estimates are broadly consistent with the main tables. Level effects change with the inclusion of the lagged dependent variable, but the signs, relative magnitudes, and patterns of statistical significance are qualitatively similar to our main results. Player attachments to specific numbers may be responsible for some periodic entry and exit from these lottery games, but not so much that they alter our main conclusions about the gambler’s fallacy and the lack of streak switching.
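The lagged dependent variable described in this appendix refers to the player's last active round rather than the calendar-previous round. A minimal sketch of that lookup is given below, assuming a hypothetical layout in which each player's bets are stored as a mapping from round to the set of numbers played. The function name and the example data are illustrative and are not taken from the authors' code.

```python
from bisect import bisect_left


def lagged_played(active_rounds, bets_by_round, number, r):
    """Value of Played for `number` in the player's last active round q < r.

    active_rounds: sorted list of rounds in which the player placed a bet.
    bets_by_round: dict mapping each active round to the set of numbers bet.
    Returns None when the player has no active round before r.
    """
    idx = bisect_left(active_rounds, r)  # number of active rounds before r
    if idx == 0:
        return None
    q = active_rounds[idx - 1]
    return int(number in bets_by_round[q])


# Hypothetical player who bet in rounds 3, 4, and 9 only.
bets = {3: {7, 12}, 4: {7}, 9: {12, 30}}
active = sorted(bets)

# In round 9 the last active round is 4, where only number 7 was played.
print(lagged_played(active, bets, 7, 9))
print(lagged_played(active, bets, 12, 9))
```

Because the gap between q and r varies across players and rounds, this variable is not a standard one-period lag, which is one reason the form of any Nickell-type bias is hard to characterize here.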
[Figure S1: four panels, each plotting the estimated coefficient with 95% C.I. (vertical axis) against the lag (horizontal axis). A. Haiti, effect of any past win. B. Haiti, effect of most recent win. C. Denmark, effect of any past win. D. Denmark, effect of most recent win.]

Figure S1: Effect of prior wins on amount bet on a number, with lagged dependent variable

Notes: Authors’ calculations from individually identifiable, administrative lottery data from Haiti and Denmark. Each panel represents one regression. Figures show OLS coefficients with 95% confidence intervals from regressions of the amount bet at the player-number-round level on a set of binary variables that describe the winning history of the number over the lagged rounds displayed on the horizontal axes. All regressions include player, number, and round fixed effects, as well as the value of the dependent variable from the last period in which the player was active, with standard errors clustered at the player level. The Boloto game in Haiti is played twice per day (60 rounds = 30 days). The System Lotto game in Denmark is played once per week (4 rounds = 28 days). The relative magnitudes of estimated coefficients are much larger in Haiti, where the mean value of the dependent variable is 0.094, than they are in Denmark, where the mean of the dependent variable is 2.762.