Role of Participatory Health Informatics in Detecting and Managing Pandemics: Literature Review - Thieme Connect
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Published online: 2021-04-21 © 2021 IMIA and Georg Thieme Verlag KG Role of Participatory Health Informatics in Detecting and Managing Pandemics: Literature Review Elia Gabarron1*, Octavio Rivera-Romero 2*, Talya Miron-Shatz3, 4, Rebecca Grainger5, Kerstin Denecke6 1 Norwegian Centre for E-health Research, University Hospital of North Norway, Tromsø, Norway 2 Department of Electronic Technology, Universidad de Sevilla, Spain 3 Faculty of Business Administration, Ono Academic College, Israel 4 Winton Centre for Risk and Evidence Communication, Cambridge University, England 5 Department of Medicine, University of Otago, Wellington, New Zealand 6 Institute for Medical Informatics, Bern University of Applied Sciences, Bern, Switzerland Summary IEEE Xplore, ACM (Association for Computing Machinery) Digital vented reaching firm conclusions about the utility of PHI in de- Objectives: Using participatory health informatics (PHI) to detect Library, and Cochrane Library. Data from studies fulfilling the tecting and monitoring these disease outbreaks. For others, e.g., disease outbreaks or learn about pandemics has gained interest eligibility criteria were extracted and synthesized narratively. COVID-19, social media and online search patterns corresponded in recent years. However, the role of PHI in understanding and Results: Out of 417 citations retrieved, 53 studies were to disease patterns, and detected disease outbreak earlier than managing pandemics, citizens’ role in this context, and which included in this review. Most research focused on influenza-like conventional public health methods, thereby suggesting that PHI methods are relevant for collecting and processing data are still illnesses or COVID-19 with at least three papers on other epi- can contribute to disease and pandemic monitoring. unclear, as is which types of data are relevant. This paper aims to demics (Ebola, Zika or measles). The geographic scope ranged clarify these issues and explore the role of PHI in managing and from global to concentrating on specific countries. Multiple Keywords detecting pandemics. processing and analysis methods were reported, although often Epidemics, public health surveillance, social media Methods: Through a literature review we identified studies that missing relevant information. The majority of outcomes are explore the role of PHI in detecting and managing pandemics. reported for two application areas: crisis communication and Yearb Med Inform 2021: Studies from five databases were screened: PubMed, CINAHL detection of disease outbreaks. http://dx.doi.org/10.1055/s-0041-1726486 (Cumulative Index to Nursing and Allied Health Literature), Conclusions: For most diseases, the small number of studies pre- 1 Introduction time delays. In the last 20 years, the use of journalistic and other unofficial online in- PHI is a multidisciplinary field that uses information technology as provided through Detecting the spread of infection in an formation sources for epidemic intelligence the web, smartphones, or wearables to in- epidemic allows health systems and has gained interest. News aggregators such crease participation of individuals in their governments to implement timely public as HealthMap [2] or MediSys [3] collect care process and to enable them in practicing health interventions. The term ‘epidemic online news and information from websites self-care and shared decision-making [5]. PHI intelligence’ refers to “all the activities or blogs to provide an overview of the deals with the resources, devices, and methods related to early identification of potential worldwide disease situation for the purpose required to support active participation and health threats, their verification, assessment of real-time surveillance. Data are also col- engagement of the stakeholders, such as social and investigation in order to recommend lected from search engines or messages on media [5]. Goals to be achieved through PHI public health measures to control them” social media, such as Twitter. These data, include improving and maintaining health and [1]. Traditional epidemic intelligence sys- which are now being utilized as a form well-being; improving the healthcare system tems mainly use clinical epidemiological of participatory health informatics (PHI), and health outcomes; sharing experiences; data, such as reports from hospitals and may provide a complementary source of achieving life goals; and gaining self-edu- healthcare providers, which often lead to information to traditional sources such as cation [6]. Beyond eliciting epidemic intelli- health system, thereby helping to detect and gence, participatory health is used to engage predict the volume and spread of infection or inform citizens of disease outbreaks or * Contributed equally and share first authorship in epidemics [4]. governmental activities related to an outbreak. IMIA Yearbook of Medical Informatics 2021
Role of Participatory Health Informatics in Detecting and Managing Pandemics: Literature Review The COVID-19 pandemic has highlight- OR Crimean-Congo haemorrhagic fever studies or did not report results (i.e., study ed the potential role of PHI in pandemics. OR Ebola virus disease OR Hendra virus protocols, opinion, frameworks or review For example, during the COVID-19 crisis infection OR Influenza OR Lassa fever papers); or were published in languages Chinese central government agencies used OR Marburg virus disease OR Meningitis other than English. social media to promote citizen engagement OR MERS-CoV OR Monkeypox OR The selected articles were divided among [4, 7]. Other studies during the COVID-19 Nipah virus infection OR coronavirus three authors (EG, KD, OR) for data ex- pandemic have used online data to assess OR 2019-nCoV OR Covid OR Plague OR traction. We extracted: 1) Disease/epidemic, citizens‘ risk perceptions or attitudes and Rift Valley fever OR SARS OR Smallpox settings and country; 2) Objective, data opinions related to the pandemic [8, 9]. OR Tularaemia OR Yellow fever OR Zika source, type of information provided, user Given these research developments, we virus disease. group, epidemic and considered region; 3) aim to examine which methods and features • Keywords related to participatory Data preprocessing, analysis techniques and of PHI are considered, and which roles PHI health: Social media OR social network features; 4) Outcome and reasons that limit plays in assessing, managing and controlling site OR online social network OR online the outcome. Data were abstracted into a pandemics. Furthermore, we aimed to iden- community OR Facebook OR Twitter OR Microsoft Excel spreadsheet standardized tify and summarize the research about what YouTube OR Instagram OR WhatsApp for this review, piloted and refined with 10 roles citizens play in this process. Specifi- OR mHealth OR mobile health OR preliminary papers. The selected articles cally, we aim to use a literature search and e-health OR ehealth OR mobile applica- were included in the narrative synthesis. synthesis to address the following research tions OR apps. questions: • Keywords related to treatment / inter- • Which epidemics have been studied by ventions: Management OR Detection means of a PHI approach to disease sur- veillance? OR surveillance OR Infoveillance OR infodemic. 3 Results • Which tools of PHI and which methods 3.1 Sample are used to analyze citizens‘ contribu- The database search retrieved 461 records, tions? • Does citizen input correspond with epi- 2.2 Eligibility with 417 records remaining after duplicate We uploaded all search references to Rayyan removal; 53 papers met the inclusion criteria demic data? after full text review and were included in • What are the barriers to the use of social (https://rayyan.qcri.org) and removed dupli- cates. To assess the eligibility of the articles, the qualitative synthesis (Figure 1). A sum- media for pandemic detection and man- mary of the included studies can be found agement? in a first step, all titles and abstracts were divided among two reviewers (EG, OR) in Appendix 2. where each reviewer looked at half of the papers. After title and abstract screening, in 2 Methods a second step, full text of all potentially eli- 3.2 Which Epidemics Have Been gible articles was obtained, and articles were We undertook a literature review to identify reviewed to confirm their eligibility by two Studied? studies that help in answering the listed ques- reviewers independently (EG, OR). Conflicts The 53 papers considered various epidemics; tions above. The Preferred Reporting Items were discussed with a third reviewer (KD) influenza-like illnesses (27/53; 51%) and for Systematic Reviews and Meta-Analyses until consensus was reached. COVID-19 (11/53; 19%) were the most (PRISMA) criteria guided the conduct and frequently studied epidemics (Appendix 3). reporting of the review [10]. Almost all studies analyzed retrospectively collected social media data where contrib- 2.3 Inclusion and Exclusion Criteria utors were unaware of the usage of their Articles were included if they: a) focused on content for the purpose of epidemic surveil- 2.1 Search Strategy epidemics, disease outbreaks or any of the lance. In all these papers, the data sets were The full search was carried out on June 10th, 20 pandemic diseases recognized by WHO created based on predefined keywords such 2020 (see Appendix 1). The search covered (https://www.who.int/emergencies/diseas- as example tweets collected using keywords PubMed; ACM Digital Library; IEEE es/en/); b) addressed the role or features of like flu, influenza etc. (e.g., [11,12]). Only Xplore; CINAHL and Cochrane library social media, mobile health, or other PHI; c) one study was conducted prospectively [13] using the following keywords: were primary studies reporting results; and with data actively generated by users. • Keywords related to epidemics or d) were in the English language. Articles About a quarter (14/53; 26%) of the pa- pandemics: Epidemics (MeSH) OR Pan- were excluded if they: did not deal with pers had a global scope, with 10/53 (19%) demics (MeSH) OR Disease Outbreaks epidemics, outbreak diseases or pandemics; focused on the USA and Canada, and 7/53 (MeSH) OR Chikungunya OR Cholera did not deal with PHI; were not primary (13%) on China. Australia and Malaysia IMIA Yearbook of Medical Informatics 2021
Gabarron et al. A preprocessing stage that prepares data to be analyzed is required when automated analyses techniques are used. Preprocessing techniques clean data and transform them to the predictable and analyzable format required by analysis algorithms. Examples of common related techniques in natural language processing are lowercasing, stem- ming, lemmatization, stop word removal, and normalization. Preprocessing is crucial when data sources present a dynamic and specific language like those used in social networks. Reporting the data preprocessing techniques used in automated analysis is rel- evant for research reproducibility. Thirteen studies (13/53; 25%) did not require a data preprocessing stage for two main reasons [11, 13, 18, 25, 26, 28, 30, 39, 41, 58, 59, 61, 62]: they were based on quantitative data such as web search index, or authors followed a manual approach to analyze the collected data. Although a preprocessing stage was required in the remaining included studies, four of them (4/53; 7.5%) did not report any information regarding this stage [14, 17, 29, 60]. A total of 36 studies reported data preprocessing. Among those, data prepro- cessing and filtering was reported in 19 studies (19/40; 48%). Several approaches to Fig. 1 PRISMA Flowchart of the paper selection procedure. filtering data were reported. Nineteen of the studies reporting preprocessing information included a data preparation stage (19/40; 48%). Information regarding data cleaning were considered a region in three papers 3.3 Which Tools of PHI Were was the most frequently reported. However, (6%). Japan, Madagascar, the Netherlands, many of these studies reported vaguely about and Italy were the focus of two papers each Examined, and Which Methods data preparation that did not include details (4%). Greece or the UK were target locations Were Used? on specific techniques and tools used. Data of one paper (2%). Just over one in ten papers Social media was the data source in the ma- aggregation was implemented in 14 studies (6/53, 11%) did not specify a region. jority of the included studies (49/53; 92%). (14/40; 35%) in which data from several The objectives of the 53 studies were Among those, Twitter was the most common- sources such as Google Trends data, Twitter, classified into six main categories: analyzing ly used channel (34/53; 64%), followed by Web search indexes, or official Centre for contents related to epidemics (i.e., reactions, Sina Weibo (10/53, 19%) (Table 1). Other data Disease control (CDC) data were combined opinion, attitudes, quality of the information, sources were official disease outbreak data on a weekly or daily basis. See Appendix 6 sentiment analysis, distribution patterns, (15/53; 28%), Google Trends (6/53; 11%), for further details. etc.); detecting disease; disease monitoring and Internet search engines (6/53; 11%). Regarding the analysis techniques, the or disease surveillance; comparing number More than two thirds of the included stud- most common analyses in the included of posts with official disease numbers; pre- ies provided information about the content studies were the correlation analysis (23/53; dicting future epidemics; and tracing con- of social media posts (37/53; 70%). The next 43%), followed by spatiotemporal analysis tacts (Appendix 4). The three most common most common types of provided information using various techniques (22/53; 42%), and objectives were analyzing content related to were the reporting of spatiotemporal data classification problem (14/53; 26%) (Table 2). the epidemic (20/53; 38%), detecting disease from social media, found in 14 papers (26%); When it comes to features used in the (12/53; 23%), and monitoring or surveillance and Internet search queries in 12 papers analysis, spatiotemporal features were the (10/53; 19%). (23%) (Appendix 5). most common data used in the analysis of IMIA Yearbook of Medical Informatics 2021
Role of Participatory Health Informatics in Detecting and Managing Pandemics: Literature Review Table 1 Data source used by the studies included in the review (n=53) Some papers mentioned a positive effect of using social media for crisis communica- tion. Posting of latest news and information Data source References using this data source Number (%) of on how the government is handling the papers citing this data source pandemic can affect citizens’ engagement [7, 59], thereby demonstrating that posting Social media [7,11,12,14-58] 49 (92%) can be valuable for citizen education [18]. Twitter [11,12,15,16,20,23,25–30,32–38,40,42– 34 (64%) Other papers question the usefulness or 52,55,57,59] effectiveness of social media for commu- Sina Weibo [7,17–19,22,24,31,39,53,54] 10 (19%) nication on an epidemic [29, 32, 52] since Baidu [17,18,31,39] 4 (7.5%) there are discrepancies between the interest Social media (in general) [17,21,56,59] 4 (7.5%) or concerns of the population in general Coosto (social media monitoring tool) [32,41] 1 (2%) and the provided information by public health authorities. Misinformation and false Facebook [32] 1 (2%) alarms were mentioned in two papers [48, WeChat [14] 1 (2%) 58] (Table 3). Wikipedia [28] 1 (2%) YouTube [58] 1 (2%) Official disease outbreak data (CDC, [12,17,18,20,21,24,29,31,38,39,41,46,59–61] 15 (28%) WHO, laboratory,… etc) 3.5 Barriers to the Use of Social Google Trends [11,12,20,39,41,55] 6 (11%) Media for Pandemic Detection and Internet search / search engines [17,23,26,28,61,62] 6 (11%) Management News, online news [18,20,26,56,59] 5 (9%) In the included papers, we identified four groups of issues that impact the outcome of Surveys (any type) [13,21] 2 (4%) PHI for detecting or retrieving knowledge on Spinn3r (Web and social media [60] 1 (2%) pandemics. These are usage of app data, data indexing service) collection, behavior of individuals, and anal- ysis and interpretation of social media data. * Note: some studies included more than one data source When apps are used for disease surveil- lance, privacy issues and concerns about personal confidentiality can hamper the data usage. Authorizing social media apps to use the included studies. Thirty-four studies 3.4 Does Citizen Input Correspond personal data including personal informa- used time stamp or time series in the tion, activity status data, and spatiotemporal analysis (34/53; 64%). Most of the studies With Epidemic Data? data is still not acceptable [14]. Another used the location information to filter data Table 3 summarizes the reported outcomes barrier relates to data collection. Specifical- out in the collection stage, but only 16 by disease. Several papers reported correla- ly, not all data generated on social media is of them compared results from different tions between social media data related to available for analysis. Several research pa- geolocation (areas, cities, or countries) COVID-19, influenza-like illnesses, mea- pers used the Twitter API which only allows (16/53; 30%). sles, MERS-CoV, MRSA and the plague a collection of a subset of the data posted on Twenty-three of the included studies and official case numbers [11, 30, 41, 53, Twitter, which may have resulted in leaving analyzed features extracted from post 61, 62]. Results regarding the timeliness of relevant tweets unconsidered [12, 33, 34]. contents. Seven of those studies analyzed the detected events, i.e., whether PHI can Only one paper considered popular tweets, words or keywords (7/53; 13%). Six stud- help in detecting outbreaks ahead of official i.e., those that were re-tweeted many times ies used word frequencies or the Term statistics were rare and contradictory. For [52], which means that these tweets are not Frequency-Inverse Document Frequency conjunctivitis and COVID-19, two papers representative of all tweets. (TF-IDF) values (6/53; 11%). Five studies (3.8%) reported that social media data can Another bias may arise from censorship used “bag of words” in their analysis (5/53; detect outbreaks as early as, or earlier than, in countries like China, thus limiting the 9%). Other features such as linguistic the official reporting mechanisms [53, 62]. completeness of the data [22]. Data col- characteristics were used in a single study. For Ebola, one paper concluded it is un- lection from Google Trends only provides Further details about features used in the likely that online surveillance provides an relative and not absolute values, thus hin- analyses are reported in Appendix 7. alert more than a week before the official dering the possibility of further refining and announcement [47]. processing them [61]. Finally, data related to IMIA Yearbook of Medical Informatics 2021
Gabarron et al. Table 2 Analysis techniques used in included papers (n=53) disease surveillance, such as geolocation or other user data, might be unavailable Analysis Algorithm or technique (References using this technique) Number of or imprecise (e.g., when using search logs papers citing this from Google or access rates of Wikipedia technique pages) [28, 62]. Correlation analysis Spearman ([11,18,31,33,39,54]) 23 (43%) The third group of problems is related Pearson ([19,20,37,38,43,44,48,49,55,60]) to the behavior of individuals, or citizens´ unspecified ([12,22,25,30,50,51,61]) input. When a disease (e.g., influenza) be- Spatiotemporal analysis comes a hot topic, people do not post about Time series analysis FDR [23], DBNM [24], KLD [47], JI [51], GCt [53], DFt [53], ARIMAX 17 (32%) it [15]. In contrast, when celebrities are con- [55], ESA [57] cerned about a disease, there are more people unspecified ([13,14,30,35,38,43,49,50,61,62]) posting about it [62] which may generate Seasonal analysis SARIMA [17], SDA [17], LiR [17], BCP [28], SH-ESD [42], STL [54] 5 (9%) false concern. Public awareness of disease unspecified [48] surveillance methods using social media Classification could influence behavior and consequently Relevance classification SVM ([15,16,27,35,36,38,44,49,53]) 13 (25%) lead to false reporting [16]. LR ([35,46,55]) The fourth group of issues concerns NB ([36,42]) the analysis and interpretation of data, i.e., LiR ([46]) data processing for the purpose of disease ME ([48]) surveillance. First of all, when considering RF [53], DT ([53]), ET ([53]), KNN ([53]), MP ([53]) free text as a data source, misspellings, Emotions classification NB [47] 1 (2%) abbreviations, and use of slang hamper Detection processing [27]. Second, social media data Topic Identification LDA ([11,27,44,54]) 8 (15%) is dynamic: new words can appear (e.g., new BTM ([33]) slang referring to a disease). This requires RF ([54]) retraining classifiers so that new vocabulary Km ([57]) and new anomalies in social signals can be unspecified ([40,51]) learned [16, 27]. Symptom Recognition LDA ([21]) 1 (2%) LR ([21]) Language detection Unspecified ([32]) 1 (2%) Prediction/Estimation LiR ([12,21,22,37,38,46]) 8 (15%) 4 Discussion MRM ([30,46,61]) 4.1 Summary of Findings Sentiment analysis NB ([34]) 5 (9%) This literature review and synthesis con- ME ([34]) DLMC ([34]) firms that PHI has been used to address a SSA ([47]) wide variety of public health issues relating unspecified ([7,47,50]) to pandemics. Most literature has focused on influenza or influenza-like illness or, Other analysis in 2020, COVID-19. The vast majority of Qualitative analysis Thematic analysis ([59]) 2 (4%) Classification ([31]) studies have used data from social media posts or web search patterns with a wide Link analysis, Influence GNA ([60]) 1 (2%) variety of data analysis techniques. For analysis, and/or Communities identification most diseases, the small number of studies identified means that firm conclusions * Note: some studies reported more than one type of information about the utility of PHI in detecting and FDR: False Discovery Rate; DBNM: Dynamic Bayesian Network Model; KLD: Kullback-Leibler Divergence; monitoring these disease outbreaks cannot JI: Jaccard Index; GCt: Granger Causality Test; DFt: Dickey-Fuller tests; ARIMAX: AutoRegressive Integrated Moving Average model with eXogenous covariates; ESA: Exponential Smoothing Algorithm; SARIMA: Seasonal be reached. In comparison, the extensive AutoRegressive Integrated Moving Average; SDA: Seasonal Decomposition Analysis; LiR: Linear Regression; literature on influenza and COVID-19 (in BCP: Bayesian Change Point; SH-ESD: Seasonal-Hybrid Extreme Studentized Deviate; STL: Seasonal-Trend spite of the fact that the literature search decomposition based on Loess; SVM: Supported Vector Machine; LR: Logistic Regression; NB: Naïve Bayes; ended in June 2020) provides valuable ME: Maximum Entropy; RF: Random Forest; DT: Decision Tree; ET: Extra Tree; KNN: K nearest neighbors; MP: Multilayer Perceptron; LDA: Latent Dirichlet Allocation; BTM; Biterm Topic Model; Km: K-means; MRM: insights into the potential for PHI to pro- Multivariable Regression Model; DLMC: Dynamic Language Model Classifier; SSA: Stanford Sentiment Analyzer; vide additional, more timely or efficient GNA: Girvan-Newman Algorithm. pandemic monitoring. IMIA Yearbook of Medical Informatics 2021
Role of Participatory Health Informatics in Detecting and Managing Pandemics: Literature Review Table 3 Reported outcomes related to PHI per epidemic in the reviewed papers tributed across the population. For example, a recent secondary analysis of survey data Epidemic Outcome revealed that women were more likely to post on COVID-19 than men; that black, Conjunctivitis PHI (Google search data) enable earlier outbreak detection. [62] Latino, and other non-white males were COVID-19 PHI (number of tweets) correlates positively with daily case numbers [19,22,39,54] more likely to post on the topic than whites, PHI (Reports of symptoms and diagnosis of COVID-19 on social media) enabled to predict daily case and that people age 65 and above were counts up to 14 days ahead of official statistics [53] more likely to post than younger people Value for communication and information provision: Media richness negatively predicts citizen [68]. However, existing analysis tools take engagement through government social media, but dialogic loop facilitates engagement. Information this into account, and use the frequency relating to the latest news about the crisis and the government’s handling of the event positively of posting as a variable in and of itself. affects citizen engagement through government social media [7] Research has yet to study the association Dengue Social media user trust in information shared by health professionals and others in their online social between differential frequency of posting, networks [40] and outbreak detection. Ebola PHI did not enable to earlier outbreak detection [26]. It should also be noted that PHI is varied Analysis of emotions in social media microblogging data (Twitter) may be utilized as a source of and, as such, some types of PHI are better evidence for disease outbreak detection and monitoring [47]. at answering specific questions than oth- Influenza, Seasonal Outbreak detection: ers. For example, search data on the CDC flu, Influenza-like Partially positive correlation between local influenza-like illness percentages and tweet rates [16] website was better and faster at detecting illnesses, Avian Positive correlation between the number of tweets, search volume or frequent daily discussions influenza trends on a national, but not state influenza and daily case numbers [20,24,25,27,31,35,36,38,43,44,49,50,56,60]. level [69]. There is a need to clarify which PHI (Twitter) enabled earlier detection of outbreaks [42]. Differences in the degree of sensitivity questions are best suited to be answered by exist between social media: A high sensitivity of 92% was found for Google and a low sensitivity which source of PHI. The same holds true of 50% was calculated for Twitter. Wikipedia had the lowest sensitivity of 33% [28]. for analysis techniques. As individuals gen- PHI led to false alarms: Twitter flu surveillance erroneously indicated a typical flu season during erate increasingly more PHI, and its use for 2011-2012. detecting and managing pandemics persists, Measles Value for communication and information provision: Social media provides insight into the opinions newer, more refined tools and analyses are regarding the pandemic that are at a certain moment salient among the public [59] required to assess how PHI best assists in There is a positive correlation between the weekly number of social media messages and the weekly promoting health and, along with that, what number of online news articles [59] its characteristics are. MERS-CoV High correlations between social media data and the number of confirmed MERS cases. High Finally, recent work analyzing Tweets to correlations between social media data and the number of quarantined cases [11] capture public sentiment about COVID-19, Methicillin-resistant identified five dominant themes: health care PHI enabled rapid identification of potential MRSA outbreaks [41] Staphylococcus environment, emotional support, business aureus (MRSA) economy, social change, and psychological stress [70]. These are not captured in elec- Plague (bubonic Statistically significant positive correlations were found between Google trends search data and tronic health records, and yet they provide plague, pneumonic confirmed, suspected, and probable cases [30,61] plague) invaluable insights into population needs and concern which should receive public Zika Value for communication and information provision: Social media is unlikely to be useful or effective health attention. Since some papers claim for communication on an epidemic; There are discrepancies between what the general public was that early alerts cannot be achieved from most interested in, or concerned about, and what public health authorities provided [29,32,52] social media, more information needs to be collected on various diseases to understand to what degree patterns generalize. 4.2 Epidemics and PHI probably facilitated fast developments 4.3 PHI Tools, Methods and in relation to the COVID-19 emergency. Although most of the articles included Likewise, PHI research on COVID-19 may Citizens´ Input in our review focused on influenza-like be applicable for detecting and managing Analysis of social media and web searches illnesses and COVID-19, other epidem- future epidemics. shows that posting and search frequencies ics have also been considered in PHI PHI is an imperfect source of data with have consistent positive correlations with research (i.e., Ebola, Zika, Dengue, etc.). its own biases such that its content and official disease incidence numbers in the PHI research on previous pandemics has frequency of posting are not equally dis- cases of influenza, COVID-19 and for the IMIA Yearbook of Medical Informatics 2021
Gabarron et al. related MERS Co-V. Most recent studies The analysis of social media posts can data characteristics, several processing in COVID-19 suggest that analysis of these also be useful for assessing the effectiveness stages are required to prepare data to be data may even predict an increase in case of government or health authority communi- used by analysis models. However, artifi- numbers ahead of official health system cation with their populations. Again, issues cial intelligence supporting participatory generated data [53]. Previous literature of how representative social media users are health is still in its infancy [78]. Although synthesis has also concluded that online of the wider population remain unresolved. the most common application of artificial social networks generate data that can At present, each analysis requires a bespoke intelligence in participatory health is the track pandemic development [63] and approach to data collection and analysis. secondary analysis of social media data other disease outbreaks [64]. As these It would seem likely that this use of social [78], there are several challenges that must data are created by citizens in their daily media and web browsing data will comple- be addressed [78]. Additionally, a combina- lives and do not incur additional data col- ment traditional research approaches with tion of epidemiologic expertise, analytical lection costs, they therefore represent an the advantages of more immediacy of data. expertise, and advanced computational attractive additional source of information skills are required to interpretate data for to complement traditional disease surveil- pandemic surveillance [77]. lance data. The timeliness of PHI provides 4.4 Barriers of Using Social Media other significant benefits, such as noting a certain level of awareness of and concern During Pandemics 4.5 Limitations with COVID-19, which epidemiological We found several barriers of using social media for detecting or retrieving knowledge One limitation of our work is that the records do not convey. Further studies on pandemics: data privacy and concerns data collection ended on June 10th, 2020, validating these observations are a high about personal confidentiality, data collec- while the COVID-19 pandemic continued priority, particularly as the COVID-19 tion (technical limitation like the Twitter to evolve. Thus, for example, we did not pandemic unfolds. API data sample, lack of data completeness, include a study published in August 2020 Since citizen behavior and input may censorship, or potential inaccuracies), on how media coverage influences Google change as the pandemic evolves, these cor- behavior trends, and complexity of data search trends, so that they cannot be as- relations between infection incidence and analysis and interpretation. sumed to only reflect people’s health [66]. secondary data sources may not remain Data privacy and personal confidenti- Likewise, we did not include a study on stable. On the other hand, this might reflect ality are two of the most relevant issues of natural language processing that revealed psychological responses to the pandemic using social media for participatory health changes in large mental health groups (e.g., which are related to, but not the same purposes [72, 73]. Included studies reported SuicideWatch and Depression) on Reddit as, the actual prevalence of COVID-19. privacy and confidentiality barriers showing during the COVID-19 pandemic [67]. This Furthermore, lack of correlation between that there are still unresolved ethical, legal, novel, illuminating work was published disease incidence and media trends might and technological questions. Therefore, new after our inclusion date. Regardless, we result from adaptation and familiarity, models of responsible and transparent data believe it was important to end the search leading, for example, to fewer information searches (e.g., on Wikipedia). collection and treatment addressing these at that date so that COVID-19 was still in- The requirements for data cleaning questions are needed, especially in public cluded in the review and so that the results and analytic methods, which may need health emergencies [74]. Limitations in of the review are released when COVID-19 to rapidly evolve, may present additional data collection from social media sources is still of relevance, so they can be utilized barriers to this approach to disease sur- is an issue that is commonly reported in by health officials and researchers. veillance being widely adopted in multiple the scientific literature [75, 76]. The social geographic settings. The availability of influence that an individual’s posts on so- standardized surveillance approaches and cial media may have on others’ behaviors In order to move the field of PHI forward for efficient development of effective algo- is also reported as a relevant aspect to be detecting and managing future pandemics rithms have previously been identified as considered in digital surveillance systems we recommend: barriers to use of social media in surveil- using social media [77]. Simple methods Finding the best way to deal with the lance of illicit drug use [65]. Furthermore, are commonly used to collect data from current barriers to fuller impact of PHI the uncertainties about how representative social media sources, resulting in a dataset data (i.e., privacy issues, commercial data are and if sufficient population cover- including noise (data not related to specific practices, governmental practices, etc.) age is reached remain unresolved. pandemic). Then, a filtering stage is re- Clarifying which questions are best suited Our synthesis of the outcomes of using quired to select efficiently the data sample. to be answered by which source of PHI PHI in pandemics suggests that analysis of Both manual and automated filtering are Creating more refined tools and analyses social media posting is useful in assessing commonly used to classify collected data. is required to assess how PHI best assists disease-related informational needs, such Most automated methods are based on in promoting health during pandemics] as reducing vaccination hesitancy [71]. artificial intelligence. Due to social media IMIA Yearbook of Medical Informatics 2021
Role of Participatory Health Informatics in Detecting and Managing Pandemics: Literature Review 5 Conclusions and Rivera-Romero O, Franco-Martín M, De la Torre- Díez I. Suicide Risk Assessment Using Machine Ebola in West Africa: Evidence from the Online Big Data Platform. Int J Environ Res Public Health Recommendations for Future Learning and Social Networks: a Scoping Review. J Med Syst 2020 Nov 9;44(12):205. 2016 Aug 4;13(8):780. 19. Zhao Y, Cheng S, Yu X, Xu H. Chinese Public‘s Research 6. Syed-Abdul S, Gabarron E, Lau AY, Househ M. Chapter 1 - An introduction to participatory health Attention to the COVID-19 Epidemic on Social Media: Observational Descriptive Study. J Med This paper explored the role of PHI in through social media. In: Participatory Health Internet Res 2020 May 4;22(5):e18825. Through Social Media. Academic Press; 2016 Jan 20. Samaras L, García-Barriocanal E, Sicilia MA. managing and detecting pandemics. We Comparing Social media and Google to detect 1. p. 1-9. conclude that PHI provides an unmediated, 7. Chen Q, Min C, Zhang W, Wang G, Ma X, Evans R. and predict severe epidemics. Sci Rep 2020 Mar authentic, and readily available source of Unpacking the black box: How to promote citizen 16;10(1):4747. information that can be highly useful in the engagement through government social media 21. Daughton AR, Chunara R, Paul MJ. Comparison during the COVID-19 crisis. Comput Human of Social Media, Syndromic Surveillance, and detection and management of pandemics. Microbiologic Acute Respiratory Infection Data: Our findings highlight the ways in which Behav 2020 Sep;110:106380. 8. Lohiniva AL, Sane J, Sibenberg K, Puumalainen T, Observational Study. JMIR Public Health Surveill social media can be used as a form of Salminen M. Understanding coronavirus disease 2020 Apr 24;6(2):e14986. participatory health, to manage and detect (COVID-19) risk perceptions among the public to 22. Li J, Xu Q, Cuomo R, Purushothaman V, Mackey T. pandemics. They also illuminate the barri- enhance risk communication efforts: a practical Data Mining and Content Analysis of the Chinese approach for outbreaks, Finland, February 2020. Social Media Platform Weibo During the Early ers to fuller impact of such data: some of COVID-19 Outbreak: Retrospective Observational these barriers stem from privacy issues, Euro Surveill 2020 Apr;25(13):2000317. 9. World Health Organization. Coronavirus Infoveillance Study. JMIR Public Health Surveill some from commercial practices such as (COVID-19) events as they happen. Geneva: 2020 Apr 21;6(2):e18700. providing relative but not absolute rankings World Health Organization; 2020:1-0. Available 23. Yom-Tov E, Borsa D, Cox IJ, McKendry RA. Detecting disease outbreaks in mass gatherings of trends, while others are rooted in govern- from: https://www.who.int/emergencies/diseases/ using Internet data. J Med Internet Res 2014 Jun mental practices such as censorship. There novel-coronavirus-2019/interactive-timeline/ 18;16(6):e154. 10. Liberati A, Altman DG, Tetzlaff J, Mulrow C, is a series of questions that future studies Gøtzsche PC, Ioannidis JP, et al. The PRISMA 24. Huang J, Zhao H, Zhang J. Detecting flu trans- could aim to answer: What are the issues statement for reporting systematic reviews and mission by social sensor in China. In: 2013 IEEE that hamper citizens’ contribution and the International Conference on Green Computing and meta-analyses of studies that evaluate health care Communications and IEEE Internet of Things and value of their contribution, and what can interventions: explanation and elaboration. J Clin IEEE Cyber, Physical and Social Computing 2013 facilitate their contribution? To what extent Epidemiol 2009 Oct;62(10):e1-34. Aug 20. IEEE; 2013. p. 1242-7 11. Shin SY, Seo DW, An J, Kwak H, Kim SH, Gwack are citizens invited to contribute to outbreak J, et al. High correlation of Middle East respiratory 25. Lee B, Yoon J, Kim S, Hwang BY. Detecting social detection and crisis communication using signals of flu symptoms. In: 8th International Con- syndrome spread with Google search and Twitter ference on Collaborative Computing: Networking, PHI? Given that citizen input is instru- trends in Korea. Sci Rep 2016 Sep 6;6:32920. Applications and Worksharing (CollaborateCom) mental in early detection of diseases and is 12. Comito C, Forestiero A, Pizzuti C. Improving 2012 Oct 14. IEEE; 2012. p. 544-5. influenza forecasting with web-based social data. crucial in detection of mental distress re- In: 2018 IEEE/ACM International Conference on 26. Yom-Tov E. Ebola data from the Internet: An oppor- sulting from diseases, governments should tunity for syndromic surveillance or a news event? Advances in Social Networks Analysis and Mining strive to invite such input in a standardized, In: Proceedings of the 5th international conference (ASONAM) 2018 Aug 28; IEEE; 2018. p. 963-70. on digital health 2015; May 18. p. 115-9. anonymized manner. 13. Moberley S, Carlson S, Durrheim D, Dalton C. 27. Kagashe I, Yan Z, Suheryani I. Enhancing Seasonal Flutracking: Weekly online community-based Influenza Surveillance: Topic Analysis of Widely surveillance of influenza-like illness in Australia. Used Medicinal Drugs Using Twitter Data. J Med 2017 Annual Report. Commun Dis Intell 2019;43. Internet Res 2017;19:e315 References 14. Wang S, Ding S, Xiong L. A New System for Surveillance and Digital Contact Tracing for 28. Sharpe JD, Hopkins RS, Cook RL, Striley CW. Evaluating Google, Twitter, and Wikipedia as Tools 1. Kaiser R, Coulombier D, Baldari M, Morgan D, COVID-19: Spatiotemporal Reporting Over Net- for Influenza Surveillance Using Bayesian Change Paquet C. What is epidemic intelligence, and how work and GPS. JMIR Mhealth Uhealth 2020 Jun Point Analysis: A Comparative Analysis. JMIR is it being improved in Europe? Euro Surveill 2006 10;8(6):e19457. Public Health Surveill 2016 Oct 20;2(2):e161. Feb 2;11(2):E060202.4 15. Wakamiya S, Kawai Y, Aramaki E. After the boom 29. Khatua A, Khatua A. Immediate and long-term 2. Freifeld CC, Mandl KD, Reis BY, Brownstein JS. no one tweets: microblog-based influenza detection effects of 2016 Zika Outbreak: A Twitter-based HealthMap: global infectious disease monitoring incorporating indirect information. In: Proceedings study. In: 2016 IEEE 18th International Confer- through automated classification and visualization of the Sixth International Conference on Emerging ence on e-Health Networking, Applications and of Internet media reports. J Am Med Inform Assoc Databases: Technologies, Applications, and Theory Services (Healthcom) 2016 Sep 14. IEEE; 2016. 2008 Mar-Apr;15(2):150-7. 2016 Oct 17. p. 17-25. p. 1-6. 3. Rortais A, Belyaeva J, Gemo M, Van der Goot E, 16. Allen C, Tsou MH, Aslam A, Nagel A, Gawron 30. Al-Mohrej A, Agha S. Are Saudi medical students Linge JP. MedISys: An early-warning system for JM. Applying GIS and Machine Learning Meth- aware of middle east respiratory syndrome coro- the detection of (re-) emerging food-and feed- ods to Twitter Data for Multiscale Surveillance of navirus during an outbreak? J Infect Public Health borne hazards. Food Research International 2010 Influenza. PLoS One 2016 Jul 25;11(7):e0157734. 2017 Jul-Aug;10(4):388-95. Jun 1;43(5):1553-6. 17. Chen Y, Zhang Y, Xu Z, Wang X, Lu J, Hu W. 31. Gu H, Chen B, Zhu H, Jiang T, Wang X, Chen L, 4. Samaras L, García-Barriocanal E, Sicilia MA. Avian Influenza A (H7N9) and related Internet et al. Importance of Internet surveillance in public Syndromic surveillance using web data: a sys- search query data in China. Sci Rep 2019 Jul health emergency control and prevention: evidence tematic review. Innovation in Health Informatics 18;9(1):10434. from a digital epidemiologic study during avian 2020:39-77. 18. Liu K, Li L, Jiang T, Chen B, Jiang Z, Wang Z, influenza A H7N9 outbreaks. J Med Internet Res 5. Castillo-Sánchez G, Marques G, Dorronzoro E, et al. Chinese Public Attention to the Outbreak of 2014 Jan 17;16(1):e20. IMIA Yearbook of Medical Informatics 2021
Gabarron et al. 32. Barata G, Shores K, Alperin JP. Local chatter or (SOMA’10), 2010 Jul 25. p. 115-22. Study. JMIR Public Health Surveill 2019 Mar international buzz? Language differences on posts 46. Ofoghi B, Mann M, Verspoor K. Ealry discovery 8;5(1):e13142. about Zika research on Twitter and Facebook. PLoS of salient health threats: a social media emotion 61. Deiner MS, McLeod SD, Wong J, Chodosh J, One 2018 Jan 5;13(1):e0190482. classification technique. Pac Symp Biocomput Lietman TM, Porco TC. Google Searches and 33. Mackey T, Purushothaman V, Li J, Shah N, Nali 2016;21:504-15. Detection of Conjunctivitis Epidemics Worldwide. M, Bardier C, et al. Machine Learning to Detect 47. Mowery J. Twitter Influenza Surveillance: Quan- Ophthalmology 2019 Sep;126(9):1219-1229. Self-Reporting of Symptoms, Testing Access, and tifying Seasonal Misdiagnosis Patterns and their 62. Al-Garadi MA, Khan MS, Varathan KD, Mujtaba Recovery Associated With COVID-19 on Twitter: Impact on Surveillance Estimates. Online J Public G, Al-Kabsi AM. Using online social networks to Retrospective Big Data Infoveillance Study. JMIR Health Inform 2016 Dec 28;8(3):e198. track a pandemic: A systematic review. J Biomed Public Health Surveill 2020 Jun 8;6(2):e19509. 48. Wakamiya S, Kawai Y, Aramaki E. Twitter-Based Inform 2016 Aug;62:1-11. 34. Byrd K, Mansurov A, Baysal O. Mining Twitter Influenza Detection After Flu Peak via Tweets With 63. Bernardo TM, Rajic A, Young I, Robiadek K, Data for Influenza Detection and Surveillance. Indirect Information: Text Mining Study. JMIR Pham MT, Funk JA. Scoping review on search In: 2016 IEEE/ACM International Workshop Public Health Surveill 2018 Sep 25;4(3):e65. queries and social media for disease surveillance: on Software Engineering in Healthcare Systems 49. Comito C, Forestiero A, Pizzuti C. Twitter-based a chronology of innovation. J Med Internet Res (SEHS) May 14. IEEE; 2016. p. 43-9. Influenza Surveillance: An Analysis of the 2013 Jul 18;15(7):e147. 35. Broniatowski DA, Paul MJ, Dredze M. National 2016-2017 and 2017-2018 Seasons in Italy. In: 64. Kazemi DM, Borsari B, Levine MJ, Dooley B. and local influenza surveillance through Twitter: Proceedings of the 22nd International Database Systematic review of surveillance by social media an analysis of the 2012-2013 influenza epidemic. Engineering & Applications Symposium 2018 Jun platforms for illicit drug use. J Public Health (Oxf) PLoS One 2013 Dec 9;8(12):e83672. 18. p. 175-82. 2017 Dec 1;39(4):763-76. 36. Collier N, Son NT, Nguyen NM. OMG U got flu? 50. Kanhabua N, Nejdl W. Understanding the diversity 65. Sousa-Pinto B, Anto A, Czarlewski W, Anto JM, Analysis of shared health messages for bio-sur- of tweets in the time of outbreaks. In: Proceedings Fonseca JA, Bousquet J. Assessment of the Impact veillance. J Biomed Semantics 2011 Oct 6;2 Suppl of the 22nd International Conference on World of Media Coverage on COVID-19–Related Google 5(Suppl 5):S9. Wide Web 2013 May 13; p. 1335-42. Trends Data: Infodemiology Study. J Med Internet 37. Chew C, Eysenbach G. Pandemics in the age 51. Gui X, Wang Y, Kou Y, Reynolds TL, Chen Y, Mei Res 2020 Aug 10;22(8):e19611. of Twitter: content analysis of Tweets during Q, et al. Understanding the Patterns of Health In- 66. Low DM, Rumker L, Talkar T, Torous J, Cecchi G, the 2009 H1N1 outbreak. PLoS One 2010 Nov formation Dissemination on Social Media during Ghosh SS. Natural Language Processing Reveals 29;5(11):e14118. the Zika Outbreak. AMIA Annu Symp Proc 2018 Vulnerable Mental Health Support Groups and 38. Zhang K, Arablouei R, Jurdak R. Predicting prev- Apr 16;2017:820-9. Heightened Health Anxiety on Reddit During alence of influenza-like illness from geo-tagged 52. Shen C, Chen A, Luo C, Zhang J, Feng B, Liao COVID-19: Observational Study. J Med Internet tweets. In: Proceedings of the 26th International W. Using Reports of Symptoms and Diagnoses Res 2020 Oct 12;22(10):e22635. Conference on World Wide Web Companion 2017 on Social Media to Predict COVID-19 Case 67. Campos-Castillo C, Laestadius LI. Racial and Eth- Apr 3; p. 1327-34. Counts in Mainland China: Observational Info- nic Digital Divides in Posting COVID-19 Content 39. Garazzino S, Montagnani C, Donà D, Meini veillance Study. J Med Internet Res 2020 May on Social Media Among US Adults: Secondary A, Felici E, Vergine G, et al. Italian SITIP-SIP 28;22(5):e19421. Survey Analysis. J Med Internet Res 2020 Jul SARS-CoV-2 paediatric infection study group. 53. Han X, Wang J, Zhang M, Wang X. Using Social 3;22(7):e20472. Multicentre Italian study of SARS-CoV-2 in- Media to Mine and Analyze Public Opinion Relat- 68. Caldwell WK, Fairchild G, Del Valle SY. Surveil- fection in children and adolescents, preliminary ed to COVID-19 in China. Int J Environ Res Public ling Influenza Incidence With Centers for Disease data as at 10 April 2020. Euro Surveill 2020 Health 2020 Apr 17;17(8):2788. Control and Prevention Web Traffic Data: Demon- May;25(18):2000600. 54. Broniatowski DA, Dredze M, Paul MJ, Dugas A. stration Using a Novel Dataset. J Med Internet Res 40. He Z, Zhang CJ, Huang J, Zhai J, Zhou S, Chiu JW, Using Social Media to Perform Local Influenza 2020 Jul 3;22(7):e14337. Sheng et al. A New Era of Epidemiology: Digital Surveillance in an Inner-City Hospital: A Retro- 69. Hung M, Lauren E, Hon ES, Birmingham WC, Epidemiology for Investigating the COVID-19 spective Observational Study. JMIR Public Health Xu J, Su S, et al. Social Network Analysis of Outbreak in China. J Med Internet Res 2020 Sep Surveill 2015 Jan-Jun;1(1):e5. COVID-19 Sentiments: Application of Artifi- 17;22(9):e21685. 55. Corley CD, Cook DJ, Mikler AR, Singh KP. Adv cial Intelligence. J Med Internet Res 2020 Aug 41. Yousefinaghani S, Dara R, Poljak Z, Bernardo TM, Exp Med Biol 2010;680:559-64. 18;22(8):e22590. Sharif S. The Assessment of Twitter‘s Potential for 56. Odlum M, Yoon S. What can we learn about the 70. Wilson SL, Wiysonge C. Social media and vac- Outbreak Detection: Avian Influenza Case Study. Ebola outbreak from tweets? Am J Infect Control cine hesitancy. BMJ Global Health 2020 Oct Sci Rep 2019 Dec 3;9(1):18147. 2015 Jun;43(6):563-71. 1;5(10):e004206. 42. Nagel AC, Tsou MH, Spitzberg BH, An L, Gawron 57. Li HO, Bailey A, Huynh D, Chan J. YouTube as a 71. Rivera-Romero O, Konstantinidis S, Denecke K, JM, Gupta DK, et al. The complex relationship source of information on COVID-19: a pandemic Gabarrón E, Petersen C, Househ M, et al. Ethical of realspace events and messages in cyberspace: of misinformation? BMJ Glob Health 2020 Considerations for Participatory Health through case study of influenza and pertussis using tweets. May;5(5):e002604. Social Media: Healthcare Workforce and Policy J Med Internet Res 2013 Oct 24;15(10):e237. 58. Mollema L, Harmsen IA, Broekhuizen E, Clijnk R, Maker Perspectives: Contribution of the IMIA 43. Aslam AA, Tsou MH, Spitzberg BH, An L, De Melker H, Paulussen T, et al. Disease detection Participatory Health and Social Media Working Gawron JM, Gupta DK, et al. The reliability of or public opinion reflection? Content analysis of Group. Yearb Med Inform 2020 Aug;29(1):71-6. tweets as a supplementary method of seasonal tweets, other social media, and online newspapers 72. Golinelli D, Boetto E, Carullo G, Nuzzolese AG, influenza surveillance. J Med Internet Res 2014 during the measles outbreak in The Netherlands in Landini MP, Fantini MP. Adoption of Digital Nov 14;16(11):e250. 2013. J Med Internet Res 2015 May 26;17(5):e128. Technologies in Health Care During the COVID-19 44. Abd-Alrazaq A, Alhuwail D, Househ M, Hamdi 59. Corley CD, Cook DJ, Mikler AR, Singh KP. Text Pandemic: Systematic Review of Early Scien- M, Shah Z. Top Concerns of Tweeters During the and structural data mining of influenza mentions tific Literature. J Med Internet Res 2020 Nov COVID-19 Pandemic: Infoveillance Study. J Med in web and social media. Int J Environ Res Public 6;22(11):e22280. Internet Res 2020 Apr 21;22(4):e19016. Health 2010 Feb;7(2):596-615. 73. Almeida BD, Doneda D, Ichihara MY, Barral-Netto 45. Culotta A. Towards detecting influenza epidemics 60. Bragazzi NL, Mahroum N. Google Trends Pre- M, Matta GC, Rabello ET, et al. Personal data by analyzing Twitter messages. In Proceedings dicts Present and Future Plague Cases During the usage and privacy considerations in the COVID-19 of the 1st Workshop on Social Media Analytics Plague Outbreak in Madagascar: Infodemiological global pandemic. Ciência & Saúde Coletiva 2020 IMIA Yearbook of Medical Informatics 2021
Role of Participatory Health Informatics in Detecting and Managing Pandemics: Literature Review Jun 5;25:2487-92. Inform 2016 Aug;62:1-11. Correspondence to: 74. Castillo-Sánchez G, Marques G, Dorronzoro E, 76. Salathé M, Bengtsson L, Bodnar TJ, Brewer DD, Kerstin Denecke Rivera-Romero O, Franco-Martín M, De la Torre- Brownstein JS, Buckee C, et al. Digital epidemi- Bern University of Applied Sciences Díez I. Suicide Risk Assessment Using Machine ology. PLoS Comput Biol 2012;8(7):e1002616. Institute for Medical Informatics Learning and Social Networks: a Scoping Review. 77. Denecke K, Gabarron E, Grainger R, Konstan- Quellgasse 1 J Med Syst 2020 Nov 9;44(12):205. tinidis ST, Lau A, Rivera-Romero O, et al. Artificial 2502 Biel / Switzerland 75. Al-Garadi MA, Khan MS, Dewi K, Varathan GM, Intelligence for Participatory Health: Applications, E-mail: kerstin.denecke@bfh.ch Al-Kabsi AM. Using online social networks to Impact, and Future Implications. Yearb Med In- track a pandemic: A systematic review. J Biomed form 2019 Aug;28(1):165-73. IMIA Yearbook of Medical Informatics 2021
You can also read