Mobile Phone Location Data for Disasters: A Review from Natural Hazards and Epidemics

Page created by Rita Guerrero
 
CONTINUE READING
Mobile Phone Location Data for Disasters: A Review from Natural Hazards and Epidemics
Mobile Phone Location Data for Disasters: A Review
                                                 from Natural Hazards and Epidemics
                                                 Takahiro Yabe1 , Nicholas K. W. Jones2 , P. Suresh C. Rao1,3 , Marta C. Gonzalez4,5 , and
                                                 Satish V. Ukkusuri1,*
                                                 1 LylesSchool of Civil Engineering, Purdue University, 550 Stadium Mall Avenue, West Lafayette, Indiana 47907
                                                 USA
                                                 2 Global Facility for Disaster Reduction and Recovery, The World Bank, 1818 H Street, N.W. Washington, DC 20433
arXiv:2108.02849v1 [physics.soc-ph] 5 Aug 2021

                                                 USA
                                                 3 Department of Agronomy, Purdue University, 550 Stadium Mall Avenue, West Lafayette, Indiana 47907 USA
                                                 4 Department of Civil and Environmental Engineering, UC Berkeley, 760 Davis Hall, University of California,

                                                 Berkeley, California 94720, USA
                                                 5 Department of City and Regional Planning, UC Berkeley, 228 Bauer Wurster Hall, Berkeley, California 94720, USA
                                                 * sukkusur@purdue.edu

                                                 ABSTRACT

                                                 Rapid urbanization and climate change trends are intertwined with complex interactions of various social, economic, and
                                                 political factors. The increased trends of disaster risks have recently caused numerous events, ranging from unprecedented
                                                 category 5 hurricanes in the Atlantic Ocean to the COVID-19 pandemic. While regions around the world face urgent demands
                                                 to prepare for, respond to, and to recover from such disasters, large-scale location data collected from mobile phone devices
                                                 have opened up novel approaches to tackle these challenges. Mobile phone location data have enabled us to observe, estimate,
                                                 and model human mobility dynamics at an unprecedented spatio-temporal granularity and scale. The COVID-19 pandemic has
                                                 spurred the use of mobile phone location data for pandemic and disaster response. However, there is a lack of a comprehensive
                                                 review that synthesizes the last decade of work leveraging mobile phone location data and case studies of natural hazards and
                                                 epidemics. We address this gap by summarizing the existing work, and pointing promising areas and future challenges for
                                                 using data to support disaster response and recovery.

                                                     With population growth in many of the developing countries and concentration of resources and opportunities in urban
                                                 areas, many cities around the world are experiencing rapid urbanization. The United Nations, Department of Economic and
                                                 Social Affairs (UN DESA) estimates that by 2050, 68% of the people in the world is projected to be living in cities, compared
                                                 to 55% in 20181 . In addition to rapid urbanization, continued anthropogenic emissions of greenhouse gases will cause further
                                                 changes in the climate system, increasing the likelihood of severe and pervasive climate related hazards, including hurricanes,
                                                 tropical cyclones, river floods, heat waves, and droughts2 . Taken together, rapid urbanization and climate change, combined
                                                 with complex interactions of various social, economic, and political factors, have increased and could further increase the risks
                                                 of disasters across the globe. For example, urbanization could lead to more population living in vulnerable locations to hazards,
                                                 and more frequent disasters could widen the economic gap due to disproportionate impacts, which could then lead to political
                                                 divide and instability. A “disaster” is a condition or event that leads to an unstable and dangerous situation for human society,
                                                 and covers a wide range of shocks, including climate related hazards such as hurricanes, non-climate related natural hazards
                                                 such as earthquakes and epidemics including COVID-19. Regions around the world need to urgently prepare for, respond to,
                                                 and to recovery from these multitude of disasters for sustainable development.
                                                     The pervasiveness of mobile devices (mobile phones, smartphones) across the globe has opened up massive opportunities to
                                                 collect large-scale location data from individual users at an unprecedented scale compared to previous approaches (see Figure 1
                                                 for number of publications on human mobility and mobile phone data). Human mobility (for a review, see Barbosa et al.3 ) is
                                                 a critical component to understanding various disaster events. Large scale natural hazards cause severe damage to housing
                                                 structures and infrastructure systems, triggering mass evacuation, displacement, and migration from affected areas. For agencies
                                                 who aim to aid those who fled their homes with essential services and supplies, locations of such movement destinations
                                                 serve as crucial input information. Infectious diseases are by definition, transmitted between humans. Understanding the
                                                 inter-regional mobility flows could assist epidemiologists predict the outbreak of the disease. Mobile phone location data are
                                                 pertinent for responding to and recovering from such disaster events.
                                                     Prior to the availability of mobile phone location data, household surveys have been the primary source of information on
Mobile Phone Location Data for Disasters: A Review from Natural Hazards and Epidemics
Figure 1. Number of research articles returned by searching “human mobility” and “mobile phone” in Google Scholar by year.
Research has substantially increased over the years and was further spurred in 2020 due to the COVID-19 pandemic. The count
for 2021 was computed on July 12th, 2021.

understanding human mobility. Household surveys, compared to mobile phone location data, are advantageous in collecting
detailed information about respondents’ socio-demographic and economic characteristics, and knowing the reasoning of why
the respondents behaved in a certain manner. Mobile phone location data, despite its drawbacks in data governance and quality
uncertainties (discussed in Section in detail), is able to provide us with location information of a massive number of samples
(often millions), a rapid manner (minimum a few days; e.g.,4 ), at a high frequency (e.g., around 50 data points each day),
longitudinal time frame (e.g., 6 months before and after the disaster event5 ), and high spatial granularity (∼100 meters in spatial
error). More recently, the coronavirus disease (COVID-19) pandemic has spurred and accelerated the use of mobile phone
location data for pandemic disaster response6 . The attention and interest towards mobile phone location data from government
agencies, researchers, and the public, has never been higher.
    Despite such interest in the analysis of mobile phone location data for disaster management, currently we lack a compre-
hensive review of literature that synthesizes the progress that has been made in the past decade. Cinnamon et al.7 reviews the
progress using mobile phone call detail record (CDR) data. Yu et al.8 and Akter et al.9 review the usage and applications of
various types of novel big data, including social media data and satellite image data. Wang et al.10 reviews the usage of various
mobile phone location data (including smartphone GPS data), and is the most recent and closest to our review. However, with
the spread of COVID-19, more types of mobile phone location data has become increasingly available in research. Pressing
social and technical issues around mobile phone location data, including the opaqueness of the data generative process and data
governance, comparisons between different types of data (e.g., CDR vs GPS from location intelligence firms vs GPS from
major tech firms), applications in COVID-19 response, and recent progress in collaborations between academia, government
agencies, and the industry, are important topics that need to be reviewed for further progress in this area.
    The increasing use of mobile phone location data in disaster management and social good, recently spurred by global
efforts in COVID-19 response, has highlighted the usefulness of these datasets for assisting response and recovery6 . However,
at the same time, concerns about personal privacy, data governance, and potential malicious uses of mobile phone location
data have been raised as well11 . To organize and understand what can be achieved using mobile phone location data for
disaster management, and also its limitations, as well as methodological, societal, and data-related issues, this article conducts
a comprehensive and interdisciplinary literature review on efforts that have used mobile phone location data for disaster
management. This review will not cover, however, the other types of data that are more frequently being used in disaster
management, including social media data (for a review, see e.g., Muniz et al.12 , Kryvasheyeu et al.13 ) and satellite imagery
data (for a review, see e.g.,14 ). In Section 2, we review the typology of mobile phone location data, Section 3 covers the
scientific progress, applications, and case studies in natural hazards and epidemics. Section 4 and 5 discusses and concludes
with opportunities and future challenges of using mobile phone location data for disaster response and recovery.

Types of Mobile Phone Location Data
Mobile phone location data can be classified into three main categories: mobile phone call detail records (CDR), smartphone
GPS location data collected by location intelligence companies, and smartphone GPS location data collected and processed by

                                                                                                                              2/18
Mobile Phone Location Data for Disasters: A Review from Natural Hazards and Epidemics
Data type                 Description                       Pros and Cons                        Providers (e.g.)
                                                                  (+) substantial coverage of the
                                Location information of cell
      Mobile phone call de-                                       population (-) Low spatial and       NCell, Orange, Voda-
                                phone towers when users
      tail records (CDR)                                          temporal resolution compared to      fone, Turkcell
                                make calls or text messages
                                                                  GPS datasets
                                                                  (+) precise location information
                                GPS data collected and ag-        of users (-) No transparency in
      Smartphone GPS lo-
                                gregated from several third       data generation process; covers      Cuebiq, Veraset, Safe-
      cation data (Location
                                party smartphone applica-         a small sample of population         graph, Unacast
      Intelligence firms)
                                tions                             compared to CDR; available for
                                                                  fewer countries
                                                                  (+) Available in standardized for-
      Smartphone GPS lo-        GPS data collected and ag-        mats across multiple countries
                                                                                                     Google, Facebook,
      cation data (Major        gregated from their own plat-     and across time (-) Outputs re-
                                                                                                     Apple, Yahoo Japan
      Tech firms)               forms                             stricted to selected metrics pro-
                                                                  duced by the tech firms

Table 1. Brief descriptions and applications of the four novel types of data: mobile phone location data, social media data,
web search query data, and satellite imagery night time light data.

major tech companies. Table 1 organizes how they are collected, the pros, cons, and examples of providers for each dataset.

Mobile Phone Call Detail Records (CDR)
During the last decade, mobile phone call detail records (CDR) have become one of the primary data sources for analyzing
human mobility patterns on the urban scale15 . Call detail records typically contain the unique ID of the user, timestamp, and
location information of the observed cell phone tower. Note that unlike smartphone GPS data introduced later, the location
information of CDRs are not the actual location of the user, thus contains typically around couple 100 meters to several
kilometers in the rural areas where cell phone towers are sparsely located. Using large-scale datasets of CDR data, a seminal
paper by Gonzalez et al. unraveled the basic laws of human mobility patterns16 . Several more papers have used CDR data to
understand spatio-temporal patterns of urban human mobility, routine behavior, and their predictability (e.g.,17, 18 ). Moreover,
human activity patterns and land use patterns have been studied using CDR data (e.g.,19 ). In addition to understanding human
behavioral laws, such data has enabled us to obtain dynamic and spatially detailed estimations of population distributions
(e.g.,20 ), social integration and segregation of mobility (e.g.,21 ), and macroscopic migration patterns (e.g.,22 ). Moreover, these
datasets have been applied to solve various urban problems such as preventing disease spread23–25 , estimating traffic flow
(e.g.,26, 27 ), and estimating socioeconomic statistics (e.g.,28 ) and impacts of shocks29 (for a full review, see30, 31 ).

Smartphone GPS Location Data from Location Intelligence Firms
More recently, we have seen an increase in the availability of mobile phone GPS location datasets collected by location
intelligence companies, such as Cuebiq (https://www.cuebiq.com/), Unacast (https://www.unacast.com/),
and Safegraph (https://www.safegraph.com/). Location intelligence companies collect location data (e.g., GPS data)
from third-party data partners such as mobile location-based application developers. Typically for each data point, a user
identifier, timestamp of observation, and the longitude and latitude information are included in the dataset. More recently, these
firms have started provided more aggregate (e.g., aggregated for each point-of-interest) data to preserve the privacy of the users.
Compared to CDR, GPS logs have higher spatial preciseness, and moreover, higher observation frequency, allowing us to
understand mobility patterns in more detail. However, often the specific sources of the location data nor the process in which
the data are collected and combined from several application services are undisclosed to the users. Therefore, using such data
requires a rigorous analysis of checking the representativeness of the mobile phone location dataset.

Smartphone Location Data from Major Tech Firms
Similar to the smartphone GPS location data collected by location intelligence firms, major tech firms such as Facebook,
Google, and Apple, also collect GPS location data from their users. The major difference in the data generative process is that
these major tech firms use data collected from their own platform, not by third party services. Often, these data are provided in
a pre-processed form, aggregated by both time and space. Facebook, through its “Data for Good” program, provides various

                                                                                                                                3/18
Figure 2. Population displacement after the Puebla Earthquake in Mexico City. Anomaly score (z score; number of
standard deviations more/less than the pre-earthquake mean) of population density during the day (left) and night (right) on
September 19th, 2017 in Mexico City. Significant displacement is observed during the night time of the day of the earthquake
(Source:38 )

types of location information products to researchers, agencies, and non-profits (https://dataforgood.fb.com). In
particular, the “Facebook Disaster Maps” provides detailed density maps of the population density and movement patterns
before, during, and after disaster events. The data is temporally aggregated (usually every 24 hours), spatially aggregated
(usually into 360,000 square meter tiles), and spatially smoothed, to anonymize and protect the users’ privacy32, 33 . The
Maps have been utilized by many significant nonprofit organizations and international agencies in disaster response, including
the International Federation of the Red Cross, the World Food Programme, the United Nations Children’s Fund (UNICEF),
NetHope, Direct Relief, and others.

Applications and Methodologies
Natural Hazards
Recently, mobile phone data has been utilized in many applications for disaster response and recovery, given its high spatial
and temporal granularity, scalability to analyze millions of individuals’ mobility, and increasing availability. In this section, the
studies using mobile phone data for natural hazard response and recovery are categorized into 3 categories of applications:
population displacement and evacuation modeling, longer-term recovery analysis, and inverse inference of damages to the built
environment. The required inputs, methodologies, obtained outputs, and case studies are presented for each application.

Population Displacement and Evacuation Modeling
The most widely studied applications of mobile phone location data in disaster response and recovery is to estimate the
population displacement and evacuation dynamics after disasters. In their seminal paper, Lu et al. used CDR to study the
predictability of displacement mobility patterns after the Haiti Earthquake in 201034 . Using data collected from 1.9 million
mobile phone users during the period from 42 days before to 341 days after the shock, the study estimated that 23% of the
population in Port-au-Prince had been displaced due to the earthquake. Despite the substantial displacement, they also found
that the destinations of the displaced people were highly correlated with their pre-earthquake mobility patterns. This finding
shed light on the possibility of predicting post-disaster mobility patterns, and had significant implications on relief operations
including the pre-positioning of distribution centers35 and evacuation shelters. Another seminal disaster event that highlighted
the use of mobile phone location data was the Gorkha Earthquake (intensity of 7.8Mw) which struck Nepal in 201536 . Wilson
et al. rapidly analyzed the displacement movements of 12 million de-identified mobile phone users after the earthquake within
nine days from the event4 . It was estimated that over 390,000 people left the Kathmandu Valley after the earthquake. These
results were released as a report with the United Nations Office for the Coordination of Humanitarian Affairs (UN OCHA) and
a range of relief agencies. This effort by Flowminder, a non-profit foundation for analyzing mobile phone location datasets, was
the first significant practical use-case of large scale mobile phone location data in disaster relief and response37 .
    Following the aforementioned two seminal works after the Haiti Earthquake and Gorkha Earthquake, several studies have
developed methods to estimate population displacement and post-disaster evacuation patterns using mobile phone location

                                                                                                                               4/18
data. A general framework for the spatio-temporal detection of behavioral anomalies using mobile phone data was proposed by
Dobra et al.39 . Using smartphone location data from before and after disasters, population displacement can be quantified by
measuring the anomaly score (z-score; the number of standard deviations more or less from the mean population on a typical
day) of the daytime and nighttime population in highly granular (1km x 1km) grid cells, as shown in Figure 238 . During the
night time after the earthquake, blue-colored clusters with z scores below -2, indicating a likelihood of less than 1% on a typical
day, can be observed in central Mexico City, showing significant decrease in nighttime population. Yabe et al. used smartphone
GPS location data collected by Yahoo Japan Corporation to analyze the evacuation rates after five earthquake events in Japan40 .
Cross-comparative analysis of five earthquakes and over 100 affected communities revealed similar relationships between
evacuation rates and seismic intensity levels, where evacuation rates significantly increased in communities that experienced
magnitudes above 5.5. Several computational frameworks have been proposed to estimate the spatial patterns of evacuation
destinations and hotspot locations using anomaly detection techniques on large-scale mobility data41 . Just after the Kumamoto
Earthquake in April 2016, population distribution and evacuation hotspot maps were produced jointly by researchers at the
University of Tokyo and Yahoo Japan Research, and were delivered to city governments for relief and response42 . Duan et al.
studied the evacuation patterns after a train collision incident in China using mobile phone location data, identifying a two-stage
evacuation process, and also behavioral changes in commuters’ travel route choices43 . Ghurye et al. study the displacement
patterns after the Rwanda Flood in 2012 using Markov Chain models and CDR44 . The study compares the observed human
behavior during a disaster with the behavior expected under normal circumstances to understand the causal effects of the disaster
event. Yin et al. combined mobile phone location data with agent based simulations (which are widely used in evacuation
analysis; e.g.,45 ) to improve the estimation accuracy of evacuation movement, proposing a hybrid approach46 .
    More computational approaches using data assimilation techniques have been explored for online, near real-time predictions
of post-disaster mobility patterns. Song et al. proposed a mobility prediction model based on a Hidden Markov Modeling
framework, and tested its validity using data collected from 1.6 million mobile phone users in Japan before, during, and after
the Great East Japan Earthquake in 201147 . Sudo et al. developed a Bayesian data assimilation framework by combining
the particle filter and Earth Mover’s Distance algorithms, that updates the urban-scale agent based mobility simulation in an
online manner using spatially aggregate mobile phone location data provided in real time48, 49 . Several online algorithms have
been proposed since these seminal works, including CityMomentum50 that uses a mixture of multiple random Markov chains,
CityCoupling51 that aims to perform cross-city predictions, and inverse reinforcement learning approaches that attempt to learn
the behavioral patterns of human mobility during disasters from large scale data52, 53 . Although these computational, online
approaches are shown to be effective in experimental and post-hoc settings, none have been utilized in real-time after real-world
disaster events.
  Policy applications: Evacuation and displacement estimation could be used for making various policy decisions during
  the response and preparation stages of the disaster risk management cycle, including quick identification of post-disaster
  needs, planning of emergency supply distribution networks, and pre-positioning of evacuation shelters and supplies.

Longer-term Analysis: Migration and Recovery
One advantage of mobile phone location data is the ability to track the movements of users over a long period of time (several
months ∼ year) with high frequency (e.g., hourly ∼ daily), which are extremely difficult to perform using household survey
data. Therefore, in the normal setting, there have been attempts to use mobile phone location data to estimate population
migration dynamics22, 54, 55 . In the disaster setting, Lu et al. studied the migration patterns in regions stressed by climate shocks
in Bangladesh using CDR56 . In addition to analyzing the short term human mobility patterns after Cyclone events (hours
∼ weeks), the study quantifies the incidence, direction, duration and seasonality of migration in Bangladesh. Acosta et al.
quantified the migration dynamics from Puerto Rico after Hurricane maria using mobile phone, showing a shift from rural to
urban areas after the disaster57 . Yabe et al. studied the population displacement and recovery patterns after five disaster events,
including Hurricanes Maria and Irma, earthquakes, floods, and tsunami using mobile phone GPS datasets from Japan and the
US (Figure 3)5 . Cross-comparative analysis of five major disasters revealed general exponential decay patterns of population
recovery, and predictability using a small number of key socio-economic factors including population density, infrastructure
recovery patterns, wealth indexes, and spatial network patterns. Marzuoli et al. used mobile phone data to analyze the recovery
dynamics of residents in South Texas after Hurricane Harvey58 . The study provided detailed statistics of population movement
and origin destination patterns for different zipcodes in Texas. In addition, the role of social networks59, 60 , hedonic behavior61 ,
and post-disaster spatial segregation62 have been tested using mobile phone location data after disasters. Apart from population
recovery and migration analysis, mobile phone location data has been used to quantify the recovery of business firms as well,
using foot traffic counts as a proxy for revenue63 . Regional and sector differences in disaster impacts have been quantified via a
Bayesian structural time series framework, using data collected from over 200 business entities in Puerto Rico before and after
Hurricanes Irma and Maria. Although mobile phone location data provide significant advantages in analyzing longer-term
phenomena (e.g., migration and recovery after disaster events) compared to household survey data, most studies focus on shorter

                                                                                                                                5/18
Figure 3. Similarity of macroscopic population recovery patterns across the five disasters. a. Location, spatial scale,
and severity of disasters that were studied. Red colors indicate the percentages of houses that were severely damaged in each
community. b-f. Macroscopic population recovery patterns after each disaster. Raw observations of displacement rates were
denoised using Gaussian Process Regression and were then fitted with a negative exponential function. D0 , D160 and τ denote
the displacement rates on day 0, day 160, and recovery time parameter of each fitted negative exponential function. Black
horizontal dashed line shows average displacement rates observed before the disaster. g. Normalized population recovery
patterns all follow an exponential decay. (Source:5 )

term displacement and evacuation analysis, leaving substantial room for research in understanding the long term recovery and
resilience of urban and rural areas to disasters.

  Policy applications: Analysis of migration and recovery estimation could be used for developing policies that focus
  on longer term recovery and mitigation of hazards. Such tasks include longer term infrastructure investment plans for
  building-back-better from disasters, strategies for harnessing community social capital for community resilience, and
  planning of urban land use master plans to prepare for longer term migration dynamics.

Inverse Inference of Damage to the Built Environment
The studies introduced in the previous two subsections studied the anomalies in human mobility patterns disrupted by shocks
(e.g., hurricanes, earthquakes, tsunami) inflicted to the built environment. However, several studies have approached the
problem in an inverse manner, by using anomalies observed in the mobile phone location data and human mobility dynamics to
inversely estimate the damage to and recovery of the built environment, which have traditionally been estimated using hazard
simulations and structural mechanics (e.g.,64 ). Andrade et al. propose a novel metric “reach score” that quantifies the amount
of movement of mobile phone users, and finds that the reach score has significant correlation with the damage inflicted to
infrastructure systems by the earthquake at the canton level in Ecuador65 . Pastor-Escuredo et al. show that by analyzing the
anomalous patterns in mobile phone communications, we are able to conduct infrastructure impact assessment due to flooding
events, using retrospective data collected from a flood in Mexico66 . Finally, Yabe et al. propose a machine learning algorithm
that combines mobile phone location data with terrain information to conduct a rapid and accurate estimate of the inundated
areas during a flood event67 . These studies show the potential of using mobile phone location data to infer the abnormal states
of the built environment. Mobile phone location data has several advantages compared to conventional methods in data quality,
including satellite imagery which are often observed sparse in time (e.g., once a day at most), and social media data which are
more sparsely observed. While the application potential of these studies are promising, we lack comprehensive analysis of its
real-time feasibility and accuracy under different types of events.

                                                                                                                           6/18
Policy applications: Detecting anomalies in human behavior and mobility patterns could provide rapid assessment of
  damage inflicted to the built environment under data scarcity, and be applied for various downstream tasks including the
  preparation of real-time flood inundation maps and identifying dysfunctional mobile phone tower locations.

Epidemics
Over the past decade, mobile phone location data have been utilized in modeling the outbreaks and spread of infectious diseases,
including cholera, malaria, and Ebola. Many studies have used mobile phone CDR to extract mobility inter-regional fluxes, and
have integrated such network dynamics with disease models to predict the spread of diseases and social dynamics68 . Moreover,
the coronavirus disease 2019 (COVID-19) has spurred the use of mobile phone location data for disease modeling. In this
section, we review the methods and case studies of the use of mobile phone location data for epidemic modeling.

Mobility Network Estimation for Epidemiological Modeling
The majority of the research have used mobile phone location data (mainly CDR) to extract the intra-regional mobility (origin
destination) patterns, and integrates such insights into epidemiological models (e.g., SIR, SEIR models) to predict disease
outbreaks. The seminal work on this topic performed by Wesolowski et al. used CDR from Kenya to quantify the importation
routes that contribute to malaria epidemiology on regional spatial scales24 . The identification of the sources and sinks of
imported infections due to human mobility showed significant potential in improving malaria control policies. Combined with
rapid risk mapping, mobile phone location data based approaches could aid the design of targeted interventions to maximally
reduce the number of cases exported to other regions while employing appropriate interventions to manage risk in places that
import them69 . A review and comparison of using survey based travel data and mobile phone data revealed that survey data
produces lower estimates of travel, however, provided demographic information and motivations of travelers, which could be
further utilized for modeling. On the other hand, mobile phone data provides a refined spatio-temporal description of travel
patterns, although it lacks demographic information about the travelers70 . Bengtsson et al. estimated the mobility network using
movements of 2.9 million anonymous mobile phone users (CDR) in Haiti during the 2010 cholera outbreak. The prediction
accuracy of the outbreak were compared with gravity model estimates, and it was shown that mobility networks generated
from mobile phone data had comparable accuracy with gravity models, however, mobile phone data was advantageous since
it required no model parameter calibration, unlike gravity models71 . Finger et al. used a mobile phone CDR dataset of over
150,000 users in Senegal to extract human mobility fluxes Qi j (t) ∀i, j across regions i and j, where Qi j (t) represents the
community-level average fraction of time that users living in region i spend in region j during day t. By directly incorporating
the mobility fluxes into a spatially explicit, dynamic epidemiological framework, they identified mass gatherings to be a key
driver of the cholera outbreak23 .
     Similar studies have been conducted in other regions, on other types of diseases, with several improvements to the epidemic
modeling methodologies. For example, Wesolowski et al. used seasonal fluctuations in travel patterns estimated from mobile
phone data to characterize seasonal fluctuations in risk across Kenya for rubella disease73, 74 . Moreover, seasonal asymmetric
mobility patterns were used to refine the epidemiological models in Kenya, Namibia, and Pakistan75 . Panigutti et al. assessed
the stochasticity in the epidemic modeling outcomes, and showed that model estimates are become more adequate when
epidemics spread between highly connected and heavily populated locations76 . Similarly, Tizzoni et al. compared the use
of mobile phone location data and census information in epidemiological models, and found that phone data matches the
commuting patterns reported by census well but tends to overestimate the number of commuters, leading to a faster diffusion of
simulated epidemics (shown in Figure 4)72 . Rubrichi et al. used mobile phone data and epidemiological models to evaluate
the effects of various spatial-based targeted disease mitigation strategies77 . In other regional and disease contexts, Vogel et al.
estimated the Ebola outbreak in Western Africa by using CDR in a simulation framework78 , Ihantamalala et al. estimated the
sources and sinks of malaria parasites in Madagascar79 , and Kiang et al. improved forecasts of Dengue fever in Thailand by
integrating human mobility data80 .
     During the coronavirus pandemic (COVID-19), we saw a rapid increase of the use of mobile phone smartphone GPS
location data to estimate mobility networks for epidemiological modeling81 . Schlosser et al. analyzed the structural changes
in the mobility network during mobility restrictions in Germany, and found that long-distance travel trips were reduced
disproportionately, enabling the flattening of the epidemic curve and delaying the spread to geographically distant regions82 .
Lai et al. used mobility data-driven travel networks combined with an SEIR model to evaluate the effects of various non-
pharmaceutical interventions on the spread of COVID-19 in China83 .

  Policy applications: Estimating human mobility networks in spatial and temporal granularity can be used to not only
  understand migration patterns, but as crucial input for epidemiological models (e.g., SIR, SEIR models), which can be
  used to predict the outbreak of the diseases, and the effects of various policies (e.g., lockdown, inter-regional mobility
  restrictions) in containing the spread.

                                                                                                                              7/18
Figure 4. Epidemic invasion trees. Full invasion trees for R0 = 3 are shown for Portugal (top row) and France (bottom row)
in the cases of the census network (a, d), the mobile phone network (b, e) and the radiation network. (Source:72 )

Monitoring and Forecasting of Non-Pharmaceutical Intervention Effects
While mobile phone location data has been shown to be an adequate data source to estimate the inter-regional mobility networks
which are crucial inputs for epidemiological models, they can also be used to evaluate the effects of non-pharmaceutical
interventions, including regional and national lockdowns and inter-regional travel restrictions, in restricting human behavior.
Prior to the COVID-19 pandemic, Peak et al. used mobile phone CDR data to evaluate the effects of a lockdown in Sierra
Leone during the Ebola epidemic84 . As many countries adopted non-pharmaceutical interventions (NPIs) during the COVID-19
pandemic, mobile phone location data (mainly GPS data) were used to evaluate the effects of such orders85 . Researchers from
academia, industry, and government agencies (e.g.,86 ) have utilized large-scale mobility datasets to estimate the effectiveness
of control measures in various countries. Such analyses were conducted and often frequently updated to monitor mobility
reduction situations87 . Kraemer et al. used mobile phone-generated mobility data from Wuhan and detailed case data including
travel history to show that especially during the early stages of the outbreak, the spatial distribution of the COVID-19 cases
were explained well by mobility data88 . Pepe et al. quantified three different aggregated mobility metrics (origin-destination
movements between provinces, radius of gyration, and average degree of a spatial proximity network) during the lockdown in
Italy using mobile phone location data89, 90 .
    In our past work in Japan as shown in Figure 5, human mobility metric including the social contact index was quantified
before, during, and after non-compulsory lockdowns91 . Analysis showed that even after non-compulsory orders, mobility
significantly dropped (70% reduction) and the effective reproduction number had decreased to below 1. Such analysis has been
conducted in the United States as well, assessing the effects of state-level interventions on mobility reduction92 , significant
geographical variations in social distancing metrics93 , and income inequality in social distancing94 . Similar studies on mobility
monitoring during non-pharmaceutical interventions were performed in Sweden95 , the United Kingdom96 , Italy97 , France98 ,
Spain99 , Switzerland100 , Finland101 , Taiwan102 , and Hong Kong103 .
    In addition to monitoring the effects of non-pharmaceutical interventions, there has been an increasing number of studies
focusing on forecasting and providing early warning of outbreaks. Kogan et al. used multiple sources of data including mobile
phone location data, social media data, and web search data, to prodive early warning signals of COVID-19 outbreaks. The
study showed that combining disparate health and behavioral data may help identify disease activity changes weeks before
observation using traditional epidemiological monitoring104 . Similarly, Yabe et al. used mobility data and web search data
provided by Yahoo Japan Corporation to develop risk indexes for microscopic geographical areas, and showed that such
metrics could predict local outbreaks two weeks beforehand105 . Chang et al. integrated human mobility network data into a
metapopulation SEIR model to simulate the spread of COVID-19, and identified specific points-of-interest which are if closed,

                                                                                                                             8/18
Figure 5. Macroscopic mobility dynamics. (A)-(C) show the population distributions on 3 different dates at same times
(12PM), each on the same day of week (Mondays). We observe substantial decrease in the population density at stations and
cities. (D) shows the amount of contacts an individual potentially encounters outside home for each time period. (E) shows the
non-linear relationship between the mobility metrics and R(t). (Source:91 )

could be effective in suppressing the disease spread106 . The use of mobile phone location data has become prevalent in the
field of economics, for example, Chetty et al. developed a platform to track the impacts of COVID-19 on businesses and
communities in real-time, using various types of data including Google Mobility Report data107 .

  Policy applications: Quantifying various human mobility metrics (e.g., stay-at-home rates, average travel distance, social
  co-location index) in near-real time, in high spatial and temporal granularity, can be used to assess the effects of various
  non-pharmaceutical interventions on human behavior.

Discussion
Opportunities
Increasing Availability of Data Products
As reviewed in Section 3, many studies have already utilized the various kinds of mobile phone location data for disaster
management. However, these were often enabled by direct partnerships or collaborations between researchers and private
companies who own the data, making the data extremely difficult to access for researchers outside the agreement. Due to
the increased attention and interest on mobile phone location data during the COVID-19 pandemic, there has been several
notable efforts where mobile phone location data, in their anonymized forms, are being made openly available for the public
use. For example, the PlaceKey community (https://www.placekey.io/) have contributed to this effort by providing
a semi-open platform where researchers can freely access aggregated mobile phone location data for analysis. The data are
spatially and temporally aggregated to point-of-interests, and also made sure that a substantial small number of visit counts are
masked, so that the individual users are unidentifiable. There are cases where researchers have led the efforts in anonymizing
the data and making the mobility data open source. The team of researchers from The Robert Koch Institute and Humboldt
University of Berlin have developed a dataset which contains mobility data collected from mobile phones in Germany during
the first half of 2020 (January-July), and mobility data from March 2019, which can be used to study changes in mobility during
the COVID-19 pandemic in 2020 (https://www.covid-19-mobility.org/).
    In addition to these efforts, various organizations including major tech firms have made significant contributions in
publishing aggregate statistics of mobility (e.g., social distancing, travel distance) during the COVID-19 for various regions
around the world. The Google COVID-19 Community Mobility Reports, which contained the time series data of travelled
distance in various cities around the world, was used by practitioners to monitor the effects of non-pharmaceutical policies on

                                                                                                                            9/18
mobility restrictions108 . A similar report on mobility patterns was also issued by Apple109 . Camber Systems developed the
county-level social distancing tracker based on aggregated and anonymous location data to understand how populations are
engaging in social distancing over time (https://covid19.cambersystems.com/). The COVID-19 Mobility Data
Network (CMDN) is a network of infectious disease epidemiologists at universities working with technology companies to use
aggregated mobility data to support the COVID-19 response. The CMDN developed the Facebook Data for Good Mobility
Dashboard, which visualizes the aggregate mobility trends, computed from Facebook mobility data, at the regional levels for
various countries around the world (https://visualization.covid19mobility.org/).

Data for Development
With the availability of various types of novel datasets including social media data, mobile phone location data (call detail
records, GPS), web search query data, and satellite imagery data, there has been significant efforts to utilize big data analytics
for tackling challenges in development110 . Several open data challenges have been initiated by collaborations between academia
and industry data providers, such as the Data4Development Challenge held by Orange, which provided mobile phone data
from Ivory Coast for analysis111 . Large tech firms, including Google, Facebook, Apple, and Microsoft, have all boosted
their efforts in utilizing the enormous amount of collected data for development and disaster management. Google.org, the
is the charitable arm of Google, has committed roughly US$100 million in investments and grants to nonprofits annually to
tackle various issues including disaster response, improving accessibility to education, and more recently, recovering from
COVID-19 impacts (https://www.google.org/). International agencies have also accelerated their engagement in
utilizing such big data sources for development projects. The World Bank has initiated the Development Data Partnership
(https://datapartnership.org/), which is a partnership between international organizations and companies, created
to facilitate the use of third-party data in research and international development. The Partnership includes more than 20
private companies, including location intelligence companies such as Google, Cuebiq, Safegraph, and CARTO, and social
media companies including Twitter and Facebook. To assist the utilization of these datasets, recently, the Global Facility for
Disaster Reduction and Recovery (GFDRR) - a partnership hosted within the World Bank - has undertaken efforts on using
GPS location data collected from smartphones to analyze post-disaster population displacement for disaster relief and urban
planning policy making. GFDRR has published working papers and publications on several case studies using smartphone
location data accessed through the Development Data Partnership initiative, including the population displacement patterns
and income inequality in Mexico City after the Puebla Earthquake38 and socioeconomic gaps in mobility reduction during the
COVID-19 pandemic in Colombia, Mexico, and Indonesia112 .

Open Source Toolkits for Mobility Analytics
To assist policy makers and non-data experts to leverage the increasing availability of mobile phone location datasets,
there has also been several efforts to develop open source toolkits for mobility data analytics. scikit-mobility is a
Python-based library that enables various operations and analyses on large-scale mobility data113 . Compared to previous
Python based mobility analysis libraries such as Bandicoot114 and movingpandas115 , scikit-mobility is most
comprehensive, containing functions for pre-processing, stop detection, computation of mobility metrics (e.g., displacements,
characteristic distance, origin-destination matrix), trajectory synthesis, visualizations, and privacy risk quantification. There
exists several libraries to conduct trajectory analysis in the R ecosystem, however, none of the libraries are optimized for
human mobility data, thus lacks functions for generating synthetic trajectories and producing advanced visualizations (for a
review, see116 ). OSMnx is a powerful library for acquiring, constructing, analyzing, and visualizing complex street networks
from OpenStreetMap117 . In combination with human mobility data, OSMnx enables users to perform various spatial analysis
including route estimation and point-of-interest visit estimation. More recently, the GFDRR developed an open-source location
data analytics toolkit in Python MobilKit in collaboration with Purdue University and MindEarth (a non-profit based in
Switzerland https://www.mindearth.org/), which extends the functions in scikit-mobility to conduct post-
disaster mobility analysis (https://github.com/GFDRR/mobility_analysis). To enable non-experts to use the
softwares, the codes are optimized using Dask118 for parallel computing, so that analysis on massive mobility datasets can be
conducted under constrained resources, on local laptop or desktop computers.

Challenges
Despite the enormous opportunities in using mobile phone location data for disaster management as reviewed in the previous
sections, the rise of novel mobile phone location data produced by location intelligence companies and the wide spread use
of the data by various stakeholders, pose new challenges in utilizing the dataset in an inclusive, transparent, and sustainable
manner. Here, we touch upon the main key challenges that we face in the usage of mobile phone location data, related to
assuring the data quality, governance, and open research directions in developing advanced analysis techniques.

                                                                                                                            10/18
Understanding the Data Generative Process
One of the key drawbacks of using the more recently available smartphone GPS location data is the lack of our understanding
in how these data are collected and processed. Several studies have conducted investigations on the representativeness of these
datasets (e.g.,5 ) using raw data, by quantifying the correlation between the number of mobile phone users estimated to be living
in each geographical region, and the census population information. This metric, however, is far from comprehensive, and
we have pressing demand for a more thorough investigation on various aspects of socio-demographic and socio-economic
characteristics, and to ensure that the observation samples in the data are not biased towards a specific population group of
wealth, region, ethnicity, gender, etc. This procedure becomes even more difficult when only aggregate information, such as
the total number of daily users in a specific region or the daily number of visitors to a specific point-of-interest, are provided
by the data providers. In addition to the uncertainties in the sample representativeness, the data collection procedure is not
transparent. For example, some softwares and applications collect location data when the device detects substantial movement,
therefore, only a very small number of points would be observed if the user stays at one location (e.g., home) during the entire
day. Other algorithms collect location information in extremely high frequency (e.g., every minute), irrespective of the amount
of movement. This is partly the reason why we observe such a large variance (i.e. truncated power law) in the number of
observation points per user5 . In the absence of methods and algorithms for correcting the bias in the data, the trustworthiness of
the data products and analysis will be undermined. A more open discussion between data users – researchers and practitioners –
and data providers to further understand the process of dataset generation, and a standardized way of quantifying and reporting
the representativeness biases and the potential errors present within the dataset are essential for more inclusive, fair, and
trustworthy data products for disaster response.

Data Governance
As we experience an increase and universal accessibility to large scale mobile phone location data, the protection of personal
privacy has never been more important11 . Previous studies have revealed that a very few number of data points could reveal the
identity of the user with high accuracy, highlighting the importance of anonymization techniques119 . Following such public
concerns, data providers have started to provide processed data, aggregated by space and time. For example, the Disaster Maps
data in the Facebook Data for Good program aggregates population density and flow into each day, into 6 kilometer size grid
cells, and further applies spatial smoothing algorithms to anonymize the data. This process, although effective in anonymizing
the data and protecting the users’ privacy, comes with a price in the data granularity and uncertainties in the data quality, as
explained in the previous Section. To address this issue and to balance out the data quality with privacy protection, the concept
and techniques of differential privacy are gaining attention. Differential privacy is a criterion, which tools are devised to satisfy.
It enables the collection, analysis, and sharing of statistical estimates using personal data while protecting the privacy of the
individuals in the dataset120 . Techniques such as differential privacy may serve as one baseline to ensure the safety of personal
privacy, but we are still amidst the search for a holistic framework that integrates technical solutions, ethical guidelines, and
regulations on the use of mobile phone location data.

Translating Analysis into Disaster Risk Management Policy and Operations
Mobility data has been shown to be effective in various applications that can be used for policy inputs, including conducting
rapid post disaster damage and needs assessment, business disruptions and recovery monitoring, and dynamic population
mapping. Key areas of opportunity include: (i) increasing situational awareness of emergency response managers through timely
information on the number and geographic location of displaced persons; (ii) improving damage and needs assessments through
quantification of foregone economic activity in business sectors; (iii) informing finance and policy support for post-disaster
recovery efforts by quantifying business recovery rates in affected districts. While there has been many successful cases of
translating the mobility data analysis into policy decision making, such as the population displacement maps after the Gorkha
Earthquake4 , there is still a limit to the number of organizations and research groups that are capable of conducting such an
end-to-end, analysis-to-policy translation. As many regions face an increasing likelihood of experiencing disaster events due to
urbanization and climate change, there is a pressing demand for expanding these mobility data-driven solutions across regions
and disaster events. One attempt to localize these data-driven solutions is to develop open source toolkits (as introduced in
Section 4.1.3) to increase the capacity of local stakeholders and data scientists to conduct such analysis. In addition to the
analytics tools, stakeholders require an effective scheme to share experiences, knowledge, and know-how across different
regions and stakeholders. To foster strong uptake of insights derived from human mobility data, further methodological research
is needed to address key challenges such as quantifying the representativeness and socio-economic bias of human mobility
datasets; and accounting for impact of network outages during disaster events on observed population numbers. Exploring a
way to expand the mobility data analytics into various local contexts is a critical operational challenge.

                                                                                                                               11/18
Future Research Directions
Cross-Comparative Analysis across Events
Mobile phone location data, with its global coverage and spatio-temporal scale in data size, allows us to conduct cross
comparisons across locations, disaster types and time scales, as shown in previous studies (e.g.,5, 40 ). Comparing the response
and recovery dynamics across different disaster events across regions, allows us to extract essential dynamics that govern
the disaster recovery process, as demonstrated in Yabe et al., where general patterns of recovery of population displacement
(i.e., negative exponential decay) were discovered5 . In addition, such parsimonious models allow us to show and explain the
differences/variability that exist across the regions, using socio-demographic and -economic factors. Such type of transferable
and universal models are much needed in the disaster science literature. Using such insights, we are able to build parsimonious
models of disaster response and recovery, which were difficult to do before using conventional household survey data.

Modeling the Disaster Recovery Process
As pointed out in Section 3.1.2, although we have a large collection of case studies that conduct displacement analysis using
mobile phone location data, the literature is still limited in studies that perform analysis and modeling of the long-term recovery
process after natural hazards. This is partly due to the lack of availability in long term data (over 6 months) in the same
region and data provider. As a result, one limitation in natural hazard response and recovery process modeling is the lack
of a standardized parsimonious model that captures the dynamics of population movement and recovery, similar to what the
SIR, SEIR models achieve in epidemiological modeling. Mobile phone location data, either independently or by fusing with
other data types, allows us to model correlations and interdependencies across various systems that compose cities. Modeling
these interdependencies will allow us to build dynamic and causal models that show how the social, built environment and
economic forces contribute to the response and recovery of communities after disasters. Recently, a system dynamics modeling
approach that captures the interdependent dynamics between social and technical systems has been proposed and tested using
the case study of recovery after Hurricane Maria in Puerto Rico121 . Despite its capability in replicating the recovery process
and understanding the system interdependencies that play a role in disaster recovery, we still lack parsimonious models that
capture the various aspects of disaster recovery including population migration.

Fusion with Other Data Sources
While we have seen a rapid increase of the usage of mobile phone location data, there are several other types of data that have
been used frequently in disaster management, including satellite imagery (for a review article, see14 ) and social media data (for
review articles, see12, 122 ). Satellite imagery, despite its low frequency of data collection, enables the observation of damages to
the natural and built environments in a detailed spatial scale. On the other hand, social media data contains rich information on
the peoples’ opinions, ideas, and sentiments at a high temporal granularity. Moreover, combining mobile phone location data
with household surveys could allow us to analyze both the post-disaster mobility patterns as well as the motivations behind such
behavior. More recently, credit card transaction data has become more available for research purposes (e.g.,123 ). Using credit
card data, we are able to understand the economic impacts of disasters and epidemics at a spatially and temporally granular
level. Combining these datasets with mobile phone location data and human mobility analytics (e.g., application in poverty
estimation124 ) could enable a more holistic understanding of the social, physical, and economic dimensions of the disaster
response and recovery dynamics.

Conclusions
Due to rapid urbanization, climate change, and complex interactions with various social, economic, and political factors, the
risks of disasters – natural hazards and pandemics – are continuing the increase across the globe. The comprehensive literature
review of research and efforts that have used mobile phone location data for (natural and pandemic) disaster management
throughout the past decade has shown that such data enables the implementation of various rapid, high-precision, large-scale
approaches to assist disaster management (response, recovery, and preparation), compared to conventional approaches using
household surveys. More specifically, applications in natural hazard response include population displacement and recovery
analytics, quantifying economic disruptions, and inferring the physical damage inflicted to critical infrastructure systems
through behavioral changes. Dynamic and high-resolution origin destination matrices are critical inputs for epidemiological
models, and have already been applied to predict the spread of a wide range of communicable diseases. The COVID-19
pandemic spurred the use of mobile phone location data for pandemic disaster response, showcasing the usefulness of the data
in pandemic response and recovery.
    With both the increase in demand and supply of location-based intelligence platforms and applications both from public
and private entities, we anticipate that the availability of location and human mobility datasets to continue its increasing trend.
The review of available data products, data-sharing ecosystems (e.g., Development Data Partnership of The World Bank), and
open source toolkits for the analysis of human mobility data for disaster response and recovery applications highlighted the

                                                                                                                              12/18
You can also read