How (In)accurate Are Demand Forecasts in Public Works Projects?

The Case of Transportation

Bent Flyvbjerg, Mette K. Skamris Holm, and Søren L. Buhl

This article presents results from the first statistically significant study of traffic forecasts in transportation infrastructure projects. The sample used is the largest of its kind, covering  projects in  nations worth U.S.$ billion. The study shows with very high statistical significance that forecasters generally do a poor job of estimating the demand for transportation infrastructure projects. For  out of  rail projects, passenger forecasts are overestimated; the average overestimation is %. For half of all road projects, the difference between actual and forecasted traffic is more than ±%. The result is substantial financial risks, which are typically ignored or downplayed by planners and decision makers to the detriment of social and economic welfare. Our data also show that forecasts have not become more accurate over the -year period studied, despite claims to the contrary by forecasters. The causes of inaccuracy in forecasts are different for rail and road projects, with political causes playing a larger role for rail than for road. The cure is transparency, accountability, and new forecasting methods. The challenge is to change the governance structures for forecasting and project development. Our article shows how planners may help achieve this.

Bent Flyvbjerg is a professor of planning at Aalborg University, Denmark. He is founder and director of the university’s research program on large-scale infrastructure planning. His latest books are Megaprojects and Risk (Cambridge University Press, , with Nils Bruzelius and Werner Rothengatter), Making Social Science Matter (Cambridge University Press, ), and Rationality and Power (University of Chicago Press, ). Mette K. Skamris Holm is a former assistant professor of planning at Aalborg University. She now works as a planner with Aalborg Municipality. Søren L. Buhl is an associate professor of mathematics at Aalborg University. He is associate statistician with the university’s research program on large-scale infrastructure planning.

Journal of the American Planning Association, Vol. , No. , Spring . © American Planning Association, Chicago, IL.

Despite the enormous sums of money being spent on transportation infrastructure, surprisingly little systematic knowledge exists about the costs, benefits, and risks involved. The literature lacks statistically valid answers to the central and self-evident question of whether transportation infrastructure projects perform as forecasted. When a project underperforms, this is often explained away as an isolated instance of unfortunate circumstance; it is typically not seen as the particular expression of a general pattern of underperformance in transportation infrastructure projects. Because knowledge is wanting in this area of research, until now it has been impossible to validly refute or confirm whether underperformance is the exception or the rule.

In three previous articles (Flyvbjerg, Holm, et al., , , ), we answered the question of project performance as regards costs and cost-related risks. We found that projects do not perform as forecasted in terms of costs: almost  out of  projects fall victim to significant cost overrun. We also investigated the causes and cures of such inaccurate cost projections (see Flyvbjerg, Bruzelius, et al., ). In this article we focus on the benefit side of investments and answer the question of whether projects perform as forecasted in terms of demand and revenue risks. We compare forecasted demand with actual demand for a large number of projects. Knowledge about cost risk, benefit risk, and compound risk is crucial to making informed decisions about projects. This is not to say that costs and benefits are or should be the only basis for deciding whether to build. Clearly, forms of rationality other than economic rationality are at work in most infrastructure projects and are balanced in the broader frame of public decision making. But the costs and benefits of infrastructure projects often run in the hundreds of millions of dollars, with risks correspondingly high. Without knowledge of such risks, decisions are likely to be flawed.

As pointed out by Pickrell () and Richmond (), estimates of the financial viability of projects are heavily dependent on the accuracy of traffic demand forecasts. Such forecasts are also the basis for socioeconomic and environmental appraisal of transportation infrastructure projects. According to the experiences gained with the accuracy of demand forecasting in the transportation sector, covering traffic volumes, spatial traffic distribution, and distribution between transportation modes, there is evidence that demand forecasting, like cost forecasting and despite all scientific progress in modeling, is a major source of
uncertainty and risk in the appraisal of transportation infrastructure projects.

Traffic forecasts are routinely used to dimension the construction of transportation infrastructure projects. Accuracy in such forecasts is a point of considerable importance for the effective allocation of scarce funds. For example, Bangkok’s U.S.$ billion Skytrain was hugely overdimensioned because the passenger forecasts were . times higher than actual traffic. As a result, station platforms are too long for the shortened trains that now operate the system, a large number of trains and cars are idly parked in the train garage because there is no need for them, terminals are too large, etc. The project company has ended up in financial trouble, and even though urban rail is probably a good idea for a congested and air-polluted city like Bangkok, overinvesting in idle capacity is hardly the best way to use resources, and especially not in a developing nation where capital for investment is scarce. Conversely, a U.K. National Audit Office () study identified a number of road projects that were underdimensioned because traffic forecasts were too low. This, too, led to multimillion-dollar inefficiencies, because it is much more expensive to add capacity to existing, fully used roads than it is to build the capacity up front. For these and other reasons, accuracy in traffic forecasts matters.

Nevertheless, rigorous studies of accuracy are rare. Where such studies exist, they are characteristically small-N research; that is, they are single-case studies or they cover a sample of projects too small or too uneven to allow systematic, statistical analyses (Brooks & Trevelyan, ; Fouracre et al., ; Fullerton & Openshaw, ; Kain, ; Mackinder & Evans, ; National Audit Office, , ; Pickrell, ; Richmond, ; Walmsley & Pickett, ; Webber, ; World Bank, ). Despite their value in other respects, with these and other studies, it has so far been impossible to give statistically satisfying answers to questions about how accurate traffic forecasts are for transportation infrastructure projects.

The objective of the present study has been to change this state of affairs by establishing a sample of transportation infrastructure projects that is sufficiently large to permit statistically valid answers to questions of accuracy. In addition, it has been a practical objective to give planners the tools for carrying out realistic and valid risk assessment of projects as regards travel demand. Existing studies almost all conclude there is a strong tendency for traffic forecasts to be overestimated (Fouracre et al., , pp.  & ; Mackinder & Evans, , p. ; National Audit Office, , app. .; Pickrell, , p. x; Thompson, , pp. –; Walmsley & Pickett, , p. ; World Bank, ). We will show that this conclusion is a consequence of the small samples used in existing studies; it does not hold for the project population. When we enlarge the sample of projects by a factor – to a more representative one, we find a different picture. Road traffic forecasts are not generally overestimated, although they are often very inaccurate, whereas forecasts of rail patronage are generally overestimated, often dramatically so.

We follow common practice and define the inaccuracy of a traffic forecast as actual minus forecasted traffic as a percentage of forecasted traffic. Traffic is measured as number of passengers for rail, and number of vehicles for roads. Actual traffic is counted for the first year of operations (or the opening year). Forecasted traffic is the traffic estimate for the first year of operations (or the opening year) as estimated at the time of decision to build the project. Thus the forecast is the estimate available to decision makers when they made the decision to build the project in question. If no estimate was available at the time of decision to build, then the closest available estimate was used, typically a later estimate, resulting in a conservative bias in our measure for inaccuracy.

We measured inaccuracy of traffic forecasts in a sample of  transportation infrastructure projects with comparable data for forecasted and actual traffic. The sample comprises a project portfolio worth approximately U.S.$ billion in actual costs ( prices). The portfolio includes  rail projects and  road projects completed between  and . The project types are urban rail, high-speed rail, conventional rail, bridges, tunnels, highways, and freeways. The projects are located in  countries on  continents, including both developed and developing nations: Brazil, Chile, Denmark, Egypt, France, Germany, Hong Kong, India, Mexico, South Korea, Sweden, Tunisia, the U.K., and the U.S. Projects were selected for the sample based on the availability and quality of data. As far as we know, this is the largest sample of transportation infrastructure projects that has been established with comparable data on forecasted and actual traffic. For a full description of the sample, data, and methods of testing for inaccuracy, please see Flyvbjerg ().

Are Rail or Road Forecasts More Accurate?

Figures  and  show the distribution of inaccuracy of traffic forecasts for the  projects in the sample split into rail and road projects. Perfect accuracy is indicated by zero; a negative figure indicates that actual traffic is that many percent lower than forecasted traffic; a positive figure indicates that actual traffic is that many percent higher than
forecasted traffic. The most noticeable attribute of Figures  and  is the striking difference between rail and road projects. Rail passenger forecasts are much more inaccurate (inflated) than are road traffic forecasts.

Tests show that of the  rail projects included in the statistical analyses, two German projects should be considered as statistical outliers. These are the two projects represented by the two rightmost columns in the rail histogram in Figure  and the two uppermost plots in the rail boxplot diagram shown in Figure . Excluding statistical outliers, we find the following results for the remaining  rail projects (results including the two statistical outliers are given in brackets):

higher than forecasted traffic (sd=., % confidence interval of . to .).

Here it would be interesting to compare toll roads with non-toll roads, but unfortunately the present data do not allow this.

We see that the risk is substantial that road traffic forecasts are wrong by a large margin, but the risk is more balanced than for rail passenger forecasts. Testing the difference between rail and road, we find at a very high level of statistical significance that rail passenger forecasts are less accurate and more inflated than road vehicle forecasts (p

[Figure: two histograms of forecast inaccuracy. Vertical axis: percentage of projects. Horizontal axis: inaccuracy (%). Upper panel: rail projects. Lower panel: road projects.]

Figure . Inaccuracies of traffic forecasts in  transportation infrastructure projects, -, split into  rail and  road projects.

[Figure: box plots of forecast inaccuracy (%) for rail projects (passengers) and road projects (vehicles).]

Figure . Inaccuracies of traffic forecasts in  transportation infrastructure projects, -.

                                                                                  Rail projects                                Road projects

Average inaccuracy (%)                                                   −. (sd=.) [−. (sd=.)]                       . (sd=.)
Percentage of projects with inaccuracies larger than ±%                             []                                           
Percentage of projects with inaccuracies larger than ±%                             []                                           
Percentage of projects with inaccuracies larger than ±%                             []                                           

Note: Figures in brackets include two statistical outliers.

Table . Inaccuracy in forecasts of rail passenger and road vehicle traffic for  transportation infrastructure projects, -.

stages (Mierzejewski, ; Zhao & Kockelman, ). The data presented above provide the empirical basis on which planners may establish risk assessment and management, and below we propose methods and procedures for doing so.

Have Forecasts Become More Accurate Over Time?

Figures  and  show how forecast inaccuracy varies over time for the projects in the sample for which inaccuracy could be coupled with information about year of decision to build and/or year of project completion. Statistical tests show there is no indication that traffic forecasts have become more accurate over time, despite claims to the contrary (American Public Transit Association, , pp. , ). For road projects (Figure ), forecasts even appear to become more inaccurate toward the end of the -year period studied. Statistical analyses corroborate this impression.

For rail projects (Figure ), forecast inaccuracy is independent of both year of project commencement and year of project conclusion. This is the case whether the two

traffic has been consistently overestimated during the -year period studied. The U.S. Federal Transit Administration (FTA) has a study underway indicating that rail passenger forecasts may have become more accurate recently (Ryan, ). According to an oral presentation of the study at the annual Transportation Research Board meeting in , of  new rail projects, % achieved actual patronage less than % of forecast patronage. This is a  percentage point improvement over the rail projects in our sample, where % of rail projects achieved actual patronage less than % of that forecasted (see above). It is also an improvement over the situation Pickrell () depicted. It is unclear, however, whether this reported improvement is statistically significant, and despite the improvement, the same pattern of overestimation continues. Ryan’s () preliminary conclusion thus dovetails with ours: “Risk of large errors still remains” (slide ). A report from the FTA study is underway.

For road projects, inaccuracies are larger towards the end of the period, with highly underestimated traffic. However, there is a difference between Danish and other road projects. For Danish road projects, we find at a very high level of statistical significance that inaccuracy varies with time (p

[Figure: two panels plotting inaccuracy (%) against year of commencement and against year of conclusion. Legend: Denmark, Other EU.]

Figure . Inaccuracy over time in forecasts of vehicle traffic in road projects (N=).

of the s to the second half of the s, inaccuracy of                      sis of data from the past. The so-called energy crises of 
Danish road traffic forecasts increased  fold, from  to                       and  and associated increases in petrol prices plus de-
% (see Figure ).                                                              creases in real wages had a profound, if short-lived, effect
     The Danish experience with increasing inaccuracy                            on road traffic in Denmark, with traffic declining for the
in road traffic forecasts is best explained by what Ascher                       first time in decades. Danish traffic forecasters adjusted and
() calls “assumption drag” (pp. , –), that is,                       calibrated their models accordingly, on the assumption
the continued use of assumptions after their validity has                        that they were witnessing an enduring trend. The assump-
been contradicted by the data. More specifically, traffic                        tion was mistaken. When during the s the effects of
forecasters typically calibrate forecasting models on the ba-                    the two oil crises and related policy measures tapered off,

[Figure: two panels plotting inaccuracy (%) against year of commencement and against year of conclusion.]

Figure . Inaccuracy over time in forecasts of vehicle traffic for Danish road projects (N=).

traffic boomed again, rendering forecasts made on s assumptions inaccurate.

We conclude that accuracy in traffic forecasting has not improved over time. Rail passenger forecasts are as inaccurate, that is, inflated, today as they were  years ago. Road vehicle forecasts even appear to have become more inaccurate over time, with large underestimations towards the end of the -year period studied. If techniques and skills for arriving at accurate traffic forecasts have improved over time, our data do not show it. This suggests

passenger forecasts are not significantly dependent on estimated number of passengers, either directly (p=.) or after taking logarithms (p=.).

For road projects, based on  cases, inaccuracies in vehicle forecasts are not significantly dependent on costs, either directly (p=.) or logarithmically (p=.). Based on  cases, inaccuracies in vehicle forecasts are significantly dependent on estimated number of vehicles, both directly (p=.) and even more strongly after taking logarithms (p

In order to arrive at a more systematic analysis of causes of inaccuracies in traffic forecasts, we identified such causes for  transportation infrastructure projects. For a number of projects we were able to identify causes of inaccuracies but not the numerical size of inaccuracies. This explains why we have more projects () in this part of our analysis than in the previous part (). Causes of inaccuracies are stated causes that explain differences between actual and forecasted traffic for the first year of operations or the opening year. For the projects on which we collected data, project managers were asked to account for the factors that would explain why actual traffic was different from forecasted traffic. For the other projects the stated causes are a mixture of this type of statement by managers supplemented by statements by researchers about what caused such differences. For these projects, the data do not allow an exact distinction between manager statements and researcher statements, though such a distinction would be desirable. A problem with using stated causes is that what people say they do is often significantly different from what they actually do. Identifying revealed causes for inaccuracy in traffic forecasting is therefore an important area for further research. For the time being, we have to make do with stated causes.

Figure  shows the stated causes for inaccuracies in traffic forecasts for rail and road, respectively. For each transportation mode and stated cause, a column shows the percentage of projects for which this cause was stated as a reason for inaccuracy.

Again the results are very different for rail and road. For rail projects, the two most important stated causes are “uncertainty about trip distribution” and “deliberately slanted forecasts.” Trip distribution in rail passenger models, while ideally based on cross-sectional data collected from users of transportation systems, is often adapted to fit national or urban policies aimed at boosting rail traffic. Here, too, it is difficult for forecasters and planners to gain acceptance for realistic forecasts that run counter to idealistic policies. But such policies frequently fail, and the result is the type of overestimated passenger forecast that we have documented above as typical for rail passenger forecasting (Flyvbjerg, Bruzelius, et al., , ch. ). As regards deliberately slanted forecasts, such forecasts are produced by rail promoters in order to increase the likelihood that rail projects get built (Wachs, ). Such forecasts exaggerate passenger traffic and thus revenues. Elsewhere we have shown that the large overestimation of traffic and revenues documented above for rail goes hand-in-hand with an equally

[Figure: bar chart of the percentage of projects, shown separately for rail and road, for each stated cause: trip generation; land use development; trip distribution; forecasting model; deliberately slanted forecast; opening delay/service reliability; design change; not identified; other.]

Figure . Stated causes of inaccuracies in traffic forecasts (N= rail projects and  road projects).
 Journal of the American Planning Association, Spring , Vol. , No. 

large underestimation of costs (Flyvbjerg, Holm, et al., , ). The result is cost-benefit analyses of rail projects that are inflated, with benefit-cost ratios that are useful for getting projects accepted and built.

     For road projects, the two most often stated causes for inaccurate traffic forecasts are uncertainties about “trip generation” and “land-use development.” Trip generation is based on traffic counts and demographic and geographic data. Such data are often dated and incomplete, and forecasters quote this as a main source of uncertainty in road traffic forecasting. Forecasts of land use development are based on land use plans. The land use actually implemented is often quite different from what was planned, however. This, again, is a source of uncertainty in forecasting.

     The different patterns in stated causes for rail and road, respectively, fit well with the figures for actual forecast inaccuracies documented above. Rail forecasts are systematically and significantly overestimated to a degree that indicates intent and not error on the part of rail forecasters and promoters. The stated causes, with “deliberately slanted forecasts” as the second to largest category, corroborate this interpretation, which corresponds with findings by Wachs (); Flyvbjerg, Holm, and Buhl (); and the U.K. Department for Transport (, pp. –). Road forecasts are also often inaccurate, but they are substantially more balanced than rail forecasts, which indicates a higher degree of fair play in road traffic forecasting. This interpretation is corroborated by the fact that deliberately slanted forecasts are not quoted as a main cause of inaccuracy for road traffic forecasts, where they are replaced by more technical factors like trip generation and land use development. This is not to say that road traffic forecasts are never politically manipulated. It is to say, however, that this appears to happen less often and less systematically for road than for rail projects. It is also not to say that road projects generally have a stronger justification than rail projects—just that they have less biased forecasts.

What Planners Can Do to Reduce Inaccuracy, Bias, and Risk in Forecasting

     The results presented above show that it is highly risky to rely on travel demand forecasts to plan and implement large transportation infrastructure investments. Rail passenger forecasts are overestimated in  out of  cases, with an average overestimation above %. Half of all road traffic forecasts are wrong by more than ±%. Forecasts have not become more accurate over the past  years. This state of affairs points directly to better risk assessment and management as something planners could and should do to improve planning and decision making for transportation infrastructure projects. Today, the benefit risks generated by inaccurate travel demand forecasts are widely ignored or underestimated in planning, just as cost risks are neglected (Flyvbjerg, Holm, et al., ).

     When contemplating what planners can do to reduce inaccuracy, bias, and risk in forecasting, we need to distinguish between two fundamentally different situations: (1) Planners consider it important to get forecasts right, and (2) planners do not consider it important to get forecasts right, because optimistic forecasts are seen as a means to getting projects started. We consider the first situation in this section and the second in the following one.

     If planners genuinely consider it important to get forecasts right, we recommend they use a new forecasting method called “reference class forecasting” to reduce inaccuracy and bias. This method was originally developed to compensate for the type of cognitive bias in human forecasting that Princeton psychologist Daniel Kahneman found in his Nobel prize-winning work on bias in economic forecasting (Kahneman, ; Kahneman & Tversky, ). Reference class forecasting has proven more accurate than conventional forecasting. For reasons of space, we present here only an outline of the method, based mainly on Lovallo and Kahneman () and Flyvbjerg (). In a different context we are currently developing what is, to our knowledge, the first instance of practical reference class forecasting in planning (U.K. Department for Transport, ).

     Reference class forecasting consists in taking a so-called “outside view” on the particular project being forecast. The outside view is established on the basis of information from a class of similar projects. The outside view does not try to forecast the specific uncertain events that will affect the particular project, but instead places the project in a statistical distribution of outcomes from this class of reference projects. Reference class forecasting requires the following three steps for the individual project:

     1. Identifying a relevant reference class of past projects. The class must be broad enough to be statistically meaningful but narrow enough to be truly comparable with the specific project.
     2. Establishing a probability distribution for the selected reference class. This requires access to credible, empirical data for a sufficient number of projects within the reference class to make statistically meaningful conclusions.
     3. Comparing the specific project with the reference class distribution in order to establish the most likely outcome for the specific project.
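
The three steps above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the reference-class figures are invented for demonstration and are not data from the study, and inaccuracy is assumed to be measured as actual minus forecast traffic in percent of forecast traffic.

```python
from statistics import quantiles


def inaccuracy_pct(actual, forecast):
    """Inaccuracy in percent of the forecast (assumed measure):
    (actual - forecast) / forecast * 100."""
    return (actual - forecast) / forecast * 100.0


def reference_class_forecast(inside_view_forecast, reference_projects):
    """Adjust an inside-view forecast with outcomes from past projects.

    reference_projects is a list of (forecast, actual) pairs. Step 1,
    choosing projects that are truly comparable, is assumed to have
    been done by the caller.
    """
    # Step 2: empirical distribution of actual/forecast ratios.
    ratios = sorted(actual / forecast for forecast, actual in reference_projects)
    lower, mid, upper = quantiles(ratios, n=4)  # quartiles of the ratios

    # Step 3: position the new project in that distribution by scaling
    # its inside-view forecast with the reference-class ratios.
    return {
        "lower_quartile": inside_view_forecast * lower,
        "median": inside_view_forecast * mid,
        "upper_quartile": inside_view_forecast * upper,
    }


# Hypothetical reference class: (forecast, actual) yearly passengers, millions.
REFERENCE_CLASS = [
    (10.0, 5.0),
    (20.0, 12.0),
    (8.0, 6.0),
    (15.0, 6.0),
    (12.0, 12.0),
]

# Hypothetical inside-view forecast of 30 million passengers per year.
result = reference_class_forecast(30.0, REFERENCE_CLASS)
print(result)
```

Reporting quartiles rather than a single adjusted figure is a deliberate choice in this sketch: it keeps the spread of past outcomes visible, so the forecast expresses risk as well as a most likely value.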
Flyvbjerg: How (In)accurate Are Demand Forecasts in Public Works Projects?                                                               

     Daniel Kahneman relates the following story about curriculum planning to illustrate reference class forecasting in practice (Lovallo & Kahneman, , p. ). We use this example because similar examples do not exist yet in the field of city planning. Some years ago, Kahneman was involved in a project to develop a curriculum for a new subject area for high schools in Israel. The project was carried out by a team of academics and teachers. In time, the team began to discuss how long the project would take to complete. Everyone on the team was asked to write on a slip of paper the number of months needed to finish and report the project. The estimates ranged from  to  months. One of the team members—a distinguished expert in curriculum development—was then posed a challenge by another team member to recall as many projects similar to theirs as possible and to think of these projects in a stage comparable to their own. “How long did it take them at that point to reach completion?” the expert was asked. After a while he answered, with some discomfort, that not all the comparable teams he could think of ever did complete their task. About % of them eventually gave up. Of those remaining, the expert could not think of any that completed their task in less than  years, nor of any that took more than . The expert was then asked if he had reason to believe that the present team was more skilled in curriculum development than the earlier ones had been. The expert said no, he did not see any relevant factor that distinguished this team favorably from the teams he had been thinking about. His impression was that the present team was slightly below average in terms of resources and potential. The wise decision at this point would probably have been for the team to break up, according to Kahneman. Instead, the members ignored the pessimistic information and proceeded with the project. They finally completed it  years later, and their efforts were largely wasted—the resulting curriculum was rarely used.

     In this example, the curriculum expert made two forecasts for the same problem and arrived at very different answers. The first forecast was the inside view; the second was the outside view, or reference class forecast. The inside view is the one that the expert and the other team members adopted. They made forecasts by focusing tightly on the case at hand, considering its objective, the resources they brought to it, and the obstacles to its completion. They constructed in their minds scenarios of their coming progress and extrapolated current trends into the future. The resulting forecasts, even the most conservative ones, were overly optimistic. The outside view is the one provoked by the question to the curriculum expert. It completely ignored the details of the project at hand, and it involved no attempt at forecasting the events that would influence the project’s future course. Instead, it examined the experiences of a class of similar projects, laid out a rough distribution of outcomes for this reference class, and then positioned the current project in that distribution. The resulting forecast, as it turned out, was much more accurate.

     Similarly—to take an example from city planning—planners in a city preparing to build a new subway would first establish a reference class of comparable projects. This could be the urban rail projects included in the sample for this article. Through analyses the planners would establish that the projects included in the reference class were indeed comparable. Second, if the planners were concerned about getting patronage forecasts right, they would then establish the distribution of outcomes for the reference class regarding the accuracy of patronage forecasts. This distribution would look something like the rail part of Figure . Third, the planners would compare their subway project to the reference class distribution. This would make it clear to the planners that unless they had reason to believe they are substantially better forecasters and planners than their colleagues who did the forecasts and planning for projects in the reference class, they are likely to grossly overestimate patronage. Finally, planners may then use this knowledge to adjust their forecasts for more realism.

     The contrast between inside and outside views has been confirmed by systematic research (Gilovich et al., ). The research shows that when people are asked simple questions requiring them to take an outside view, their forecasts become significantly more accurate. However, most individuals and organizations are inclined to adopt the inside view in planning major initiatives. This is the conventional and intuitive approach. The traditional way to think about a complex project is to focus on the project itself and its details, to bring to bear what one knows about it, paying special attention to its unique or unusual features, trying to predict the events that will influence its future. The thought of going out and gathering simple statistics about related cases seldom enters a planner’s mind. This is the case in general, according to Lovallo and Kahneman (, pp. –). And it is certainly the case for travel demand forecasting. Despite the many forecasts we have reviewed, we have not come across a single genuine reference class forecast of travel demand. If our readers have information about such forecasts, we would appreciate their feedback for our ongoing work on this issue.

     Planners’ preference for the inside view over the outside view, while understandable, is unfortunate. When both forecasting methods are applied with equal skill, the outside view is much more likely to produce a realistic estimate. That is because it bypasses cognitive and organizational biases such as appraisal optimism and strategic misrepresentation and cuts directly to outcomes. In the outside view, planners and forecasters are not required to make scenarios, imagine events, or gauge their own and others’ levels of ability and control, so they cannot get any of these things wrong. Surely the outside view, being based on historical precedent, may fail to predict extreme outcomes, that is, those that lie outside all historical precedents. But for most projects, the outside view will produce more accurate results. In contrast, a focus on inside details is the road to inaccuracy.

     The comparative advantage of the outside view is most pronounced for nonroutine projects, understood as projects that planners and decision makers in a certain locale have never attempted before—like building an urban rail system in a city for the first time, or a new major bridge or tunnel where none existed before. It is in the planning of such new efforts that the biases toward optimism and strategic misrepresentation are likely to be largest. To be sure, choosing the right reference class of comparative past projects becomes more difficult when planners are forecasting initiatives for which precedents are not easily found, such as the introduction of new and unfamiliar technologies. However, most large-scale transportation projects are both nonroutine locally and use well-known technologies. Such projects are, therefore, particularly likely to benefit from the outside view and reference class forecasting. The same holds for concert halls, museums, stadiums, exhibition centers, and other local one-off projects.

When Planners Are Part of the Problem, Not the Solution

     In the present section, we consider the situation where planners and other influential actors do not find it important to get forecasts right and where planners, therefore, do not help to clarify and mitigate risk but instead generate and exacerbate it. Here planners are part of the problem, not the solution. This situation may need some explication, because it might sound to many like an unlikely state of affairs. After all, it may be agreed that planners ought to be interested in being accurate and unbiased in forecasting. It is even stated as an explicit requirement in the AICP Code of Ethics and Professional Conduct that “A planner must strive to provide full, clear and accurate information on planning issues to citizens and governmental decision-makers” (American Planning Association, , A.), and we certainly agree with the Code. The British Royal Town Planning Institute () has laid down similar obligations for its members.

     However, the literature is replete with things planners and planning “must” strive to do, but which they don’t. Planning must be open and communicative, but often it is closed. Planning must be participatory and democratic, but often it is an instrument of domination and control. Planning must be about rationality, but often it is about power (Flyvbjerg, ; Watson, ). This is the “dark side” of planning and planners identified by Flyvbjerg () and Yiftachel (), which is remarkably underexplored by planning researchers and theorists.

     Forecasting, too, has its dark side. It is here that “planners lie with numbers,” as Wachs () has aptly put it. Planners on the dark side are busy not with getting forecasts right and following the AICP Code of Ethics but with getting projects funded and built. And accurate forecasts are often not an effective means for achieving this objective. Indeed, accurate forecasts may be counterproductive, whereas biased forecasts may be effective in competing for funds and securing the go-ahead for construction. “The most effective planner,” says Wachs (), “is sometimes the one who can cloak advocacy in the guise of scientific or technical rationality” (p. ). Such advocacy would stand in direct opposition to AICP’s ruling that “the planner’s primary obligation [is] to the public interest” (American Planning Association, , B.). Nevertheless, seemingly rational forecasts that underestimate costs and overestimate benefits have long been an established formula for project approval (Flyvbjerg, Bruzelius, et al., ). Forecasting is here mainly another kind of rent-seeking behavior, resulting in a make-believe world of misrepresentation that makes it extremely difficult to decide which projects deserve undertaking and which do not. The consequence, as even one of the industry’s own organs, the Oxford-based Major Projects Association, acknowledges, is that too many projects proceed that should not. We would like to add that many projects don’t proceed that probably should, had they not lost out to projects with “better” misrepresentation (Flyvbjerg, Holm, et al., ).

     In this situation, the question is not so much what planners can do to reduce inaccuracy and risk in forecasting, but what others can do to impose on planners the checks and balances that would give planners the incentive to stop producing biased forecasts and begin to work according to their Code of Ethics. The challenge is to change the power relations that govern forecasting and project development. Better forecasting techniques and appeals to ethics won’t do here; institutional change with a focus on transparency and accountability is necessary.

     Two basic types of accountability define liberal democracies: (1) public sector accountability through transparency and public control, and (2) private sector accountability via
competition and market control. Both types of accountability may be effective tools to curb planners’ misrepresentation in forecasting and to promote a culture that acknowledges and deals effectively with risk. In order to achieve accountability through transparency and public control, the following would be required as practices embedded in the relevant institutions:

   • National-level government should not offer discretionary grants to local infrastructure agencies for the sole purpose of building a specific type of infrastructure, for instance rail. Such grants create perverse incentives. Instead, national government should simply offer “infrastructure grants” or “transportation grants” to local governments and let local political officials spend the funds however they choose, but ensure that every dollar they spend on one type of infrastructure reduces their ability to fund another.
   • Forecasts should be made subject to independent peer review. Where large amounts of taxpayers’ money are at stake, such review may be carried out by national or state accounting and auditing offices, like the General Accounting Office in the U.S. or the National Audit Office in the U.K., who have the independence and expertise to produce such reviews. Other types of independent review bodies may be established, for instance within national departments of finance or with relevant professional bodies.
   • Forecasts should be benchmarked against comparable forecasts, for instance using reference class forecasting as described in the previous section.
   • Forecasts, peer reviews, and benchmarkings should be made available to the public as they are produced, including all relevant documentation.
   • Public hearings, citizen juries, and the like should be organized to allow stakeholders and civil society to voice criticism and support of forecasts. Knowledge generated in this way should be integrated in planning and decision making.
   • Scientific and professional conferences should be organized where forecasters would present and defend their forecasts in the face of colleagues’ scrutiny and criticism.
   • Projects with inflated benefit-cost ratios should be reconsidered and stopped if recalculated costs and benefits do not warrant implementation. Projects with realistic estimates of benefits and costs should be rewarded.
   • Professional and occasionally even criminal penalties should be enforced for planners and forecasters who consistently and foreseeably produce deceptive forecasts. An example of a professional penalty would be the exclusion from one’s professional organization for violating its code of ethics. An example of a criminal penalty would be punishment as the result of prosecution before a court or similar legal body, for instance where deceptive forecasts have led to substantial mismanagement of public funds (Garett & Wachs, ). Malpractice in planning should be taken as seriously as it is in other professions. Failure to do this amounts to not taking the profession of planning seriously.

     In order to achieve accountability in forecasting via competition and market control, the following would be required, again as practices that are both embedded in and enforced by the relevant institutions:

   • The decision to go ahead with a project should, where at all possible, be made contingent on the willingness of private financiers to participate without a sovereign guarantee for at least one third of the total capital needs. This should be required whether projects pass the market test or not; that is, whether projects are subsidized or not or provided for reasons of social justice or not. Private lenders, shareholders, and stock market analysts would produce their own forecasts or would critically monitor existing ones. If they were wrong about the forecasts, they and their organizations would be hurt. The result would be more realistic forecasts and reduced risk.
   • Full public financing or full financing with a sovereign guarantee should be avoided.
   • Forecasters and their organizations must share financial responsibility for covering benefit shortfalls (and cost overruns) resulting from misrepresentation and bias in forecasting.
   • The participation of risk capital should not mean that government gives up or reduces control of the project. On the contrary, it means that government can more effectively play the role it should be playing, namely as the ordinary citizen’s guarantor of safety, environmental quality, risk management, and a proper use of public funds.

     If the institutions with responsibility for developing and building major transportation infrastructure projects would effectively implement, embed, and enforce such measures of accountability, then the misrepresentation in transportation forecasting, which is widespread today, might be mitigated. If this is not done, misrepresentation is likely to continue, and the allocation of funds for transportation investments is likely to be wasteful.

Conclusions                                                                 Failing to do so amounts to not taking the profession of
                                                                            planning seriously.
     We conclude that the patronage estimates used by
planners of rail infrastructure development are highly, sys-                Acknowledgments
tematically, and significantly misleading (inflated). This                  The authors wish to thank Daniel Kahneman, Dan Lovallo, Don Pick-
results in large benefit shortfalls for rail projects. For road             rell, James Ryan, Martin Wachs, the JAPA editors, and four anonymous
projects the problem of misleading forecasts is less severe                 referees for their valuable help. Research for the article was supported by
and less one sided than for rail. But even for roads, for half              the Danish Transportation Council and Aalborg University, Denmark.
the projects the difference between actual and forecasted
traffic is more than ±%. On this background, planners                     Notes
and decision makers are well advised to take with a grain of                . All projects that we know of for which comparable data on forecasted
salt any traffic forecast that does not explicitly take into ac-            and actual traffic were obtainable were considered for inclusion in the
count the uncertainty of predicting future traffic. For rail                sample. This was  projects, of which  were then rejected because
                                                                            of unclear or insufficient data quality. More specifically, of the  proj-
passenger forecasts, a grain of salt may not be enough.
                                                                            ects rejected,  were rejected because inaccuracy had been estimated in
     The risks generated from misleading forecasts are typi-                ways different from and incomparable to the way we decided to estimate
cally ignored or downplayed in infrastructure planning, to                  inaccuracy;  projects were rejected because inaccuracies for these proj-
the detriment of social and economic welfare. Risks, therefore, have a doubly negative effect in this particular type of planning, since it is one thing to take on a risk that one has calculated and is prepared to take, much as insurance companies and professional investors do, while it is quite another matter, one that moves risk-taking to a different and more problematic level, to ignore risks altogether. This is especially the case when risks are of the magnitude we have documented here, with many demand forecasts being off by more than % on investments that measure in hundreds of millions of dollars. Such behavior is bound to produce losers among those financing infrastructure, be they taxpayers or private investors. If the losers or, for future projects, potential losers, want to protect themselves, then our study shows that the risk of faulty forecasts, and related risk assessment and management, must be placed at the core of planning and decision making. Our goal with this article has been to take a first step in this direction by developing the necessary data and approach.
     The policy implications of our findings are clear. First, the findings show that a major planning and policy problem, namely misinformation, exists for this highly expensive field of public policy. Second, the size and perseverance over time of the problem of misinformation indicate that it will not go away by merely pointing out its existence and appealing to the good will of project promoters and planners to make more accurate forecasts. The problem of misinformation is an issue of power and profit and must be dealt with as such, using the mechanisms of transparency and accountability we commonly use in liberal democracies to mitigate rent-seeking behavior and the misuse of power. To the extent that planners partake in rent-seeking behavior and misuse of power, this may be seen as a violation of their code of ethics, that is, malpractice. Such malpractice should be taken seriously by the responsible institutions.

ects had been estimated on the basis of adjusted data for actual traffic instead of using original, actual count data as we decided to do. All projects for which valid and reliable data were available were included in the sample. This covers both projects for which we ourselves collected the data, and projects for which other researchers in other studies did the data collection. Our own data collection concentrated on large European projects, because too few data existed for this type of project to allow comparative studies. We collected primary data on the accuracy of traffic forecasts for  projects in Denmark, France, Germany, Sweden, and the U.K. and were thus able to increase by many times the number of large European projects with reliable data for both actual and estimated traffic, allowing for the first time comparative studies for this type of project where statistical methods can be applied. Other projects were included in the sample from the following studies: Webber (); Hall (); National Audit Office (, ); Fouracre, Allport, and Thomson (); Pickrell (); Walmsley and Pickett (); Skamris (); and Vejdirektoratet (). Statistical tests showed no differences between data collected through our own surveys and data collected from the studies carried out by other researchers.
. The figures mentioned here should be interpreted with caution. Without a published report for the FTA study, it is difficult to evaluate the assumptions behind the study and thus the validity and comparability of its results. When the study report has been published, such evaluation should be possible.
. We find that the estimated quantities are better than the actual quantities as a measure for project size in the evaluation of inaccuracy, because the estimates are what is known about size at the time of decision to build (and the time of making the forecasts), and using actual quantities would result in the mixing of cause and effect.
. As in the other parts of our analyses, we include here both projects for which we ourselves collected primary data and projects for which other researchers did the data collection as part of other studies, which we then used as secondary sources. Again, our own data collection concentrated on large European projects, because data were particularly wanting for this project type. By means of a survey questionnaire and meetings with project managers, we collected primary data on causes of inaccurate traffic forecasts for  projects, while we collected secondary data for  projects from the following studies: Webber (), Hall (), National Audit Office (), Fouracre et al. (), Pickrell (), Wachs (), Leavitt et al. (), U.K. Department for Transport (), Skamris (), and Vejdirektoratet ().
. The closest we have come to an outside view on travel demand forecasts is Gordon and Wilson’s () use of regression analysis on an international cross section of light-rail projects to forecast patronage in a number of light-rail schemes in North America.
. The lower limit of a one-third share of private risk capital for such capital to effectively influence accountability is based on practical experience. See more in Flyvbjerg, Bruzelius, and Rothengatter (, pp. –).

References
American Planning Association. (). AICP code of ethics and professional conduct (adopted October , amended October ). Retrieved November, , from http://www.planning.org/ethics/conduct.html
American Public Transit Association. (). Off track: Response of the American Public Transit Association to the UMTA report “Urban rail transit projects: Forecast versus actual ridership and costs.” Washington, DC: Author.
Ascher, W. (). Forecasting: An appraisal for policy-makers and planners. Baltimore: Johns Hopkins University Press.
Brooks, J. A., & Trevelyan, P. J. (). Before and after studies for inter-urban road schemes. In Planning and Transport Research and Computation Co. Ltd., Highway planning and design: Proceedings of seminar N held at the PTRC summer annual meeting, University of Warwick, England from – July  (pp. –). London: PTRC Education and Research Services Ltd.
Flyvbjerg, B. (). The dark side of planning: Rationality and Realrationalität. In S. Mandelbaum, L. Mazza, & R. Burchell (Eds.), Explorations in planning theory (pp. –). New Brunswick, NJ: Center for Urban Policy Research Press.
Flyvbjerg, B. (). Rationality and power: Democracy in practice. Chicago: University of Chicago Press.
Flyvbjerg, B. (, December). Delusions of success: Comment on Dan Lovallo and Daniel Kahneman. Harvard Business Review, –.
Flyvbjerg, B. (). On measuring the inaccuracy of travel forecasts: Methodological considerations. Manuscript submitted for publication.
Flyvbjerg, B., Bruzelius, N., & Rothengatter, W. (). Megaprojects and risk: An anatomy of ambition. Cambridge, UK: Cambridge University Press.
Flyvbjerg, B., Holm, M. K. S., & Buhl, S. L. (). Cost underestimation in public works projects: Error or lie? Journal of the American Planning Association, (), –.
Flyvbjerg, B., Holm, M. K. S., & Buhl, S. L. (). How common and how large are cost overruns in transport infrastructure projects? Transport Reviews, (), –.
Flyvbjerg, B., Holm, M. K. S., & Buhl, S. L. (). What causes cost overrun in transport infrastructure projects? Transport Reviews, (), –.
Fouracre, P. R., Allport, R. J., & Thomson, J. M. (). The performance and impact of rail mass transit in developing countries (TRRL Research Report ). Crowthorne, UK: Transportation and Road Research Laboratory.
Fullerton, B., & Openshaw, S. (). An evaluation of the Tyneside Metro. In K. J. Button & D. E. Pitfield (Eds.), International railway economics: Studies in management and efficiency (pp. –). Aldershot, UK: Gower.
Garett, M., & Wachs, M. (). Transportation planning on trial: The Clean Air Act and travel forecasting. Thousand Oaks, CA: Sage.
Gilovich, T., Griffin, D., & Kahneman, D. (Eds.). (). Heuristics and biases: The psychology of intuitive judgment. Cambridge, UK: Cambridge University Press.
Gordon, P., & Wilson, R. (). The determinants of light-rail transit demand: An international cross-sectional comparison. Transportation Research A, A(), –.
Hall, P. (). Great planning disasters. Harmondsworth, UK: Penguin Books.
Kahneman, D. (). New challenges to the rationality assumption. Journal of Institutional and Theoretical Economics, , –.
Kahneman, D., & Tversky, A. (). Prospect theory: An analysis of decisions under risk. Econometrica, , –.
Kain, J. F. (). Deception in Dallas: Strategic misrepresentation in rail transit promotion and evaluation. Journal of the American Planning Association, (), –.
Leavitt, D., Ennis, S., & McGovern, P. (). The cost escalation of rail projects: Using previous experience to re-evaluate the CalSpeed estimates (Working Paper No. ). Berkeley: University of California, Berkeley, Institute of Urban and Regional Development.
Lovallo, D., & Kahneman, D. (, July). Delusions of success: How optimism undermines executives’ decisions. Harvard Business Review, –.
Mackinder, I. H., & Evans, S. E. (). The predictive accuracy of British transportation studies in urban areas (Supplementary Report ). Crowthorne, UK: Transportation and Road Research Laboratory.
Maldonado, J. (). Strategic planning: An approach to improving airport planning under uncertainty. Unpublished master’s thesis, Massachusetts Institute of Technology, Cambridge, MA.
Mierzejewski, E. A. (). A new strategic urban transportation planning process. University of South Florida, Center for Urban Transportation Research.
National Audit Office. (). Department of Transport: Expenditure on motorways and trunk roads. London: Author.
National Audit Office. (). Department of Transport, Scottish Development Department and Welsh Office: Road planning. London: Her Majesty’s Stationery Office.
National Audit Office. (). Department of Transport: Contracting for roads. London: Author.
National Audit Office. (). PFI: Construction performance (Report by the Comptroller and Auditor General HC  Session –). London: Author.
Pickrell, D. H. (). Urban rail transit projects: Forecast versus actual ridership and cost. Washington, DC: U.S. Department of Transportation.
Richmond, J. E. D. (). New rail transit investments: A review. Cambridge, MA: Harvard University, John F. Kennedy School of Government.
Royal Town Planning Institute. (). Code of professional conduct. Retrieved November, , from http://www.rtpi.org.uk/about-the-rtpi/codecond.pdf
Ryan, J. (, January ). Predicted and actual ridership for recent new starts projects (P–). Presentation at the rd Annual TRB Meeting, Session , Washington, DC.
Skamris, M. K. (). Large transportation projects: Forecast versus actual traffic and costs (Report No. ). Aalborg: Aalborg University, Department of Development and Planning.
Thompson, L. S. (, March–April). Trapped in the forecasts: An economic field of dreams. TR News, .
U.K. Department for Transport. (). Comparison of forecast and observed traffic on trunk road schemes. London: Author.
U.K. Department for Transport. (). Procedures for dealing with optimism bias in transport planning: Guidance document. London: Author. Available at http://www.dft.gov.uk/stellent/groups/dft_localtrans/documents/page/dft_localtrans_.hcsp