Persistence and Attrition among Participants in a Multi-Page Online Survey Recruited via Reddit's Social Media Network

                social sciences
Article
Persistence and Attrition among Participants in a Multi-Page
Online Survey Recruited via Reddit’s Social Media Network
Dirk H.R. Spennemann

                                          School of Agricultural, Environmental and Veterinary Sciences, Charles Sturt University, P.O. Box 789,
                                          Albury, NSW 2640, Australia; dspennemann@csu.edu.au

                                          Abstract: Participant attrition is a major concern for the validity of longer or complex surveys. Unlike
                                          paper-based surveys, which may be discarded even if partially completed, multi-page online surveys
                                          capture responses from all completed pages until the time of abandonment. This can result in different
                                          item response rates, with pages earlier in the sequence showing more completions than later pages.
                                          Using data from a multi-page online survey administered to cohorts recruited on Reddit, this paper
                                          analyses the pattern of attrition at various stages of the survey instrument and examines the effects of
                                          survey length, time investment, survey format and complexity, and survey delivery on participant
                                          attrition. The participant attrition rate (PAR) differed between cohorts, with cohorts drawn from
                                          Reddit showing a higher PAR than cohorts targeted by other means. Common to all was that the
                                          PAR was higher among younger respondents and among men. Changes in survey question design
                                          resulted in the greatest rise in PAR irrespective of age, gender or cohort.

                                          Keywords: Reddit; survey methodology; social media; online surveys; participant attrition

         
         
Citation: Spennemann, Dirk H.R. 2022. Persistence and Attrition among Participants in a Multi-Page Online Survey Recruited via Reddit’s Social Media Network. Social Sciences 11: 31. https://doi.org/10.3390/socsci11020031

Academic Editor: Nigel Parton

Received: 13 December 2021
Accepted: 14 January 2022
Published: 18 January 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Soc. Sci. 2022, 11, 31. https://doi.org/10.3390/socsci11020031    https://www.mdpi.com/journal/socsci

                         1. Introduction
                              Surveys conducted online have become popular due to their ease of administration,
                         low cost of dissemination and distributed data entry, as well as for their geographic reach.
                         Like any other type of survey, online surveys suffer from participant attrition (‘survey
                         break off,’ ‘drop out’), i.e., the phenomenon that participants will abandon the survey once
                         they get distracted or bored, no longer perceive the questions to be relevant, or simply run
                         out of the amount of time that they had set aside for it. In paper-based surveys, this may
                         lead to incomplete surveys or, more likely, to the survey form not being returned at all.
                         This directly affects the unit response rate (i.e., the proportion of fully completed surveys).
                         Online surveys, where the questions are delivered as a set of discrete pages (screenfuls) with
                         the respondent actively moving from one to the next, will save the response that has been
                         submitted on the previous page. This allows for partial responses to be recorded even when
                         respondents abandon the effort part way through completion. Thus, while the entire survey
                         may not have been answered, the sets of questions on the saved discrete pages will have been,
                         leading to different item response rates (Edwards 2002).
                              A number of studies have commented on participant attrition in online surveys
                         (Monroe and Adams 2012), but only a few studies have been carried out to examine the
                         underlying patterns (Hochheimer et al. 2016; Hochheimer et al. 2019; Zhou and Fishbach
                         2016).
                              Participant attrition may introduce a bias in the survey responses and thus their
                         implied representativeness, either within the same survey cohort (Liu and Wronski 2018;
                         Zhou and Fishbach 2016) or between survey cohorts of different years (longitudinal studies)
                         (Khadjesari et al. 2011). Factors that may cause participant attrition are questions that are
                         deemed irrelevant to the respondent (Zhou and Fishbach 2016) as well as the complexity
                         and length of the survey instrument (Hoerger 2010; Kato and Miura 2021; Liu and Wronski
                         2018; Mirta and Michael 2009; Robb et al. 2017). In addition to outright attrition, studies
                         have shown that responses provided in the later part of a survey can be of lower quality,
                         with respondents giving faster and more uniform answers (i.e., with less reflection) than
                         to questions earlier in the survey (Mirta and Michael 2009).
                              This paper reports on participant attrition in a multi-page online survey (examining
                         the perceptions of risk in outdoor recreation activities) as administered to cohorts recruited
                         on the social media network Reddit.

                         Reddit as a Sampling Universe
                                Reddit is a social media network, touting itself as the ‘front page of the internet.’
                         It is, in essence, an array of multi-channel discussion board-type groupings and online
                         communities (sub-Reddits), where users (‘redditors’) congregate, express opinions, ask
                         questions and share images, videos and links to other social media and websites (Amaya
                         et al. 2019; Gaffney and Matias 2018; Shatz 2017). These sub-Reddits can be topical and
                         thematic (e.g., covering activities, hobbies, TV shows, etc.), geographic and country-specific
                         (e.g., Brazil, Kenya), event-specific (e.g., COVID-19) or generic (e.g., AskReddit). While the
                         primary language on Reddit is English, there are sub-Reddits in all other languages and
                         scripts supported by ASCII-based standard character encodings. The members of the online
                         community drive the nature and extent of content as well as the volume and frequency of
                         discussion threads and the posted comments. The nature of the content ranges from semi-
                         professional advice in Q&A formats to flippant postings or internet memes. The content
                         of each sub-Reddit is managed by sub-Reddit-specific moderators, who are guided by a
                         general system-wide Reddit code of conduct (Almerekhi et al. 2020), which is also enforced
                         by an artificial intelligence-driven auto-moderation bot (Jhaver et al. 2019). Moderators
                         have the discretion to add sub-Reddit-specific usage rules and codes of conduct (Moore
                         and Chuang 2017; Squirrell 2019). This is not the place to discuss the politics of content
                         moderation, or lack thereof, in some of the sub-Reddits, which has recently attracted some
                         media attention (Copland 2020; Gaudette et al. 2020; Potter 2021).
                                The breadth and specificity of the various Reddit communities make them an attractive
                         target for data mining, primarily the mining of discussion comments, such as in the fields of
                         public health (Balsamo et al. 2021; Bunting et al. 2021; Lu et al. 2019; Okon et al. 2020; Wang
                         et al. 2015), private finance (Glenski et al. 2019), education (Staudt Willet and Carpenter
                         2020), or public information and disinformation management (Achimescu and Chachev
                         2021; Balalau and Horincar 2021; Dosono et al. 2017; Duguay 2021). The use of Reddit as
                         a data set, however, is not without its critics, as the sociodemographic characteristics of
                         participants on Reddit are not comparable to the general population (Amaya et al. 2019).
                         The participation is skewed towards younger males from more affluent English-speaking
                         backgrounds (Amaya et al. 2019; Shatz 2017).
                                Descriptive demographic data exist for the representativeness of the overall Reddit
                         universe as it manifested itself in early 2019. The percentage of U.S. citizens using Reddit
                         decreases with age, with 22% of U.S. adults aged 18–29 years using Reddit, compared to 6% of
                         people over 50 (Tankovska 2021d). Reddit users are twice as likely to be male, tend to be
                         better educated and to have a higher income, are more than three times as likely to be
                         Caucasian or Hispanic than African American, and are less likely to reside in rural areas
                         (Tankovska 2021a, 2021b, 2021c, 2021e, 2021h). While the uptake of Reddit as a platform has
                         increased, the pattern of demographics has not changed much since 2013, except that the
                         dominance of male participants has decreased from a ratio of 3:1 in 2013 to 2:1 in
                         2019 (Duggan and Smith 2013).
                                The frequency of access has implications for survey responses and the latency of posts.
                         In mid-2020, 52% of Reddit users reputedly accessed the site on a daily basis, while 82%
                         accessed Reddit at least weekly (Tankovska 2021f). The consumption of Reddit was lower
                         in other countries, such as Finland (35% daily, 77% weekly) (Tankovska 2020a), Sweden
                         (36% daily, 71% weekly) (Tankovska 2020b), Norway (41% daily, 78% weekly) (Tankovska
                         2020c), and Denmark (49% daily) (Tankovska 2020d).

                              In terms of geographical reach, U.S. citizens were the primary consumers of Reddit,
                         with 49.3% of the desktop traffic originating in the U.S.A. This was followed by Canada
                         and the United Kingdom (both 7.8%), Australia (4.3%) and Germany (3.1%) (Tankovska
                         2021g). The main reason for consumption of Reddit posts (by U.S. residents) was to ‘get
                         entertainment’ (72%) followed by ‘news’ (43%) and other (17%) (Tankovska 2020b).
                              Reddit usage varies with the time of day and the day of the week (Shatz 2017). Given
                         the diversity of Reddit communities, usage patterns cannot be generalized but will depend
                         on the specific sub-Reddit(s) targeted. Factors that are known to influence this are related
                         to global geographic location (time zones), age structure and socio-economics (Moore and
                         Chuang 2017; Shatz 2017).
                              Limited data exist on the nature and depth of engagement on Reddit, which may
                         have an influence on survey participation. Analyses of discussion comments showed that
                         older users, as well as women, tend to provide more detailed comments in a discussion
                         thread (Finlay 2014), while others favor the apparent anonymity that allows for voicing
                         contentious opinions (Kilgo et al. 2018).

                         2. Methodology
                               Between 2019 and 2021, the author carried out a survey of perceptions of, and
                         attitudes towards, risk in outdoor recreation activities. The findings of that survey will be
                         discussed elsewhere. The purpose of this paper is to examine a number of methodological
                         aspects associated with the administration of the survey.
                               This section describes the purpose of the main study (merely to provide context), the
                         survey instrument, the sampling frames and modes of administration, and the limitations.

                         2.1. Purpose of the Study
                               Adventure recreation encompasses a broad range of outdoor activities that require
                         physical and mental participation, as well as an element of risk of injury and misadventure.
                         Examples are SCUBA diving, mountain biking, mountaineering or hang gliding. Adventure
                         recreation includes internal motivations such as fear, control, skill development, and a
                         sense of achievement, as well as external motives such as social-based factors defined as
                         friends, image, escape, and competition with others or the environment (Buckley 2012).
                               While there is an abundance of literature on motivations for participation in
                         outdoor recreation and adventure tourism (Albayrak and Caber 2017; Buckley 2012; Caber
                         and Albayrak 2016; Holm et al. 2017; Pomfret 2011; Yang et al. 2017), the vast majority
                         of research into the motivations for participation and perception of risk in adventure
                         recreation has drawn on participants during activities or their instructors (Ewert et al.
                         2013; Maria Gstaettner et al. 2017). While valid in their own right, these are predefined
                         samples that do not consider the range of motivations exhibited by the general public,
                         nor do they explore the barriers to participation. Moreover, these studies rarely
                         included or explored social determinants for participation. A systematic literature
                         survey (Yang et al. 2017), for example, showed that most of the surveys limit themselves to
                         querying gender without considering aspects of spousal status and responsibility of care
                         for children. Possible determinants such as education, occupation and ethnicity are also
                         rarely explored (Naidoo et al. 2015).
                               The research project explores the attitudes towards personal motivation and perceived
                         risk among a broad range of participants, both those who have and those who have not
                         (yet) participated in outdoor adventure recreation. This survey specifically looks at samples
                         of the general population.
                               The study was approved for general distribution by Charles Sturt University’s Human
                         Research Ethics Committee for the period from 6 February 2019 to 3 February 2024. The
                         study was also approved by the Institutional Review Board of the University of Guam, for
                         dissemination among the University of Guam staff and student population for the period
                         from 22 April 2021 to 31 May 2022.

                              The survey was first administered between March and May 2019 to various Australian
                         cohorts using a two-page paper form or its PDF version which could be disseminated and
                         completed electronically. The survey was repeated between March and May 2020, with
                         respondents from then on offered the option of completing an online survey form
                         disseminated via the Survey Monkey platform.

                         2.2. The Survey Instrument
                              The survey instrument, containing the same questions, exists in three versions: a
                         two-page paper survey (Appendix A), a PDF version of the paper survey that could be
                         disseminated and completed electronically (Appendix B), and a multi-page online version
                         disseminated via the Survey Monkey platform.
                              The survey instrument comprises three sections: (1) demographics; (2) general
                         attitudes towards risk and social determinants of risk-taking; and (3) questions related to
                         specific activities. In the paper/PDF-based survey, Section (1) and Section (2) formed
                         the front page, while Section (3) formed the reverse (see Appendix A). The participant
                         information sheet was provided as a separate document.
                              On conversion for delivery via the online platform Survey Monkey, the survey
                         instrument was broken up into a number of individual screens. In the online form, page 1
                         comprised participant information that had to be agreed to in order to progress. Section (1)
                         (demographics) comprised two pages, with a third conditional page collecting ZIP codes if
                         the answer to Q1 (In which country do you live?) was either Australia or the U.S.A. This
                         demographic section asked a total of 11 (12) questions. Conditional on individuals stating a
                         disability, an additional page was inserted, asking the nature of the disability. The subsequent
                         Section (2) and Section (3) comprised eight pages each. Figure 1 shows the arrangement of
                         the discrete pages as delivered by Survey Monkey.
                              To ensure that the survey was as similar as possible across the three modes of
                         application (paper, PDF, Survey Monkey), responses in the online survey were not forced,
                         i.e., a respondent could choose not to answer a question but still progress to the
                         next page.
                              When filling out the paper-based survey, participants could estimate the required
                         survey effort at any given time. The presence of a progress bar notwithstanding, the
                         screen-by-screen online delivery caused fatigue among some respondents, leading to the
                         abandonment of the survey partway through. As the progression from one question screen
                         to another entailed the saving of the information that had been entered on that screen,
                         partial responses could be captured even when a respondent abandoned the survey.
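
                              As an illustration of how page-wise saving produces different response counts per
                         page, the following minimal Python sketch tallies how many respondents saved each page and
                         derives an overall attrition rate. The data, the function names and the simplified linear
                         page sequence are hypothetical. The attrition rate is defined here, as an assumption, as the
                         share of respondents who saved at least one page but did not save the final page; this is not
                         necessarily the definition applied in this study.

```python
def retention_by_page(last_page_saved, n_pages):
    """Count, for each page 1..n_pages, the respondents who saved that page.

    last_page_saved holds one integer per respondent: the number of the last
    page saved before the survey was completed or abandoned (hypothetical input).
    A respondent whose last saved page is k is counted for pages 1..k.
    """
    return [
        sum(1 for last in last_page_saved if last >= page)
        for page in range(1, n_pages + 1)
    ]


def attrition_rate(last_page_saved, n_pages):
    """Share of starters who did not save the final page (assumed definition)."""
    started = sum(1 for last in last_page_saved if last >= 1)
    if started == 0:
        return 0.0
    completed = sum(1 for last in last_page_saved if last >= n_pages)
    return 1.0 - completed / started


# Hypothetical example: eight respondents, five-page survey.
last_page = [5, 5, 3, 1, 5, 2, 4, 5]
print(retention_by_page(last_page, 5))  # [8, 7, 6, 5, 4]
print(attrition_rate(last_page, 5))     # 0.5
```

                         The sketch ignores conditional pages and unanswered (non-forced) questions, both of which
                         would need to be handled separately when working with real collector exports.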
                              To ensure that all activity sets had an even chance of being assessed even if the survey
                         was abandoned partway through, the pages of Section (3) were presented in a random
                         sequence until all pages were exhausted. As Survey Monkey records the page order, it is
                         possible to verify the randomization. The maximum deviation from the average number of
                         responses for the randomly delivered P11 to P19 was 6.5% ± 3.5%.
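
                              One way such a check could be carried out is sketched below in Python. The page
                         labels and response counts are invented for illustration, and the calculation (relative
                         deviation of each page's response count from the mean across the randomized pages) is an
                         assumed reading of the verification described above, not the author's documented procedure.

```python
from statistics import mean


def relative_deviations(responses_per_page):
    """Relative deviation of each page's response count from the mean count."""
    avg = mean(responses_per_page.values())
    return {page: (n - avg) / avg for page, n in responses_per_page.items()}


def max_abs_deviation(responses_per_page):
    """Largest absolute relative deviation across all randomized pages."""
    return max(abs(d) for d in relative_deviations(responses_per_page).values())


# Hypothetical response counts for four randomly ordered pages.
counts = {"P11": 100, "P12": 110, "P13": 90, "P14": 100}
print(f"{max_abs_deviation(counts):.1%}")  # 10.0%
```

                         If the randomization works as intended, the deviations should remain small and show no
                         systematic trend by page label.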

                         2.3. Sampling Frames
                              The project used two different sampling frames: a semi-random sample of the general
                         population (2019–2021) and a sample drawn from Reddit users (2021). None of the
                         respondents were offered incentives or rewards. The participant population was restricted
                         to persons aged 18 years or older. This was clearly spelled out in the information
                         provided to prospective participants.

                         [Figure 1: flow chart of the online survey pages]
                           P1    participant information
                           P2    country
                           P2a   PostCode / ZIP (shown if the country is Australia or USA; skipped for other countries)
                           P3    other demographic
                           P4    perceptions of self
                           P5    fears
                           P6    motivations
                           P7    motivations (ct’d)
                           P8    consequences
                           P9    peer pressure
                           P10   activity set 1 (6 options)
                           Activity sets 2–8 delivered in randomised sequence:
                           P11   activity set 2 (9 options)
                           P12   activity set 3 (6 options)
                           P13   activity set 4 (7 options)
                           P14   activity set 5 (6 options)
                           P15   activity set 6 (3 options)
                           P16   activity set 7 (6 options)
                           P17   activity set 8 (5 options)
                           P18   Thank you screen

                         Figure 1. Flow chart showing the arrangement of pages as delivered by Survey Monkey.

                         2.3.1. General Population
                               The survey was administered by students enrolled in the subject Social Psychology
                         of Risk taught by the author. The subject forms part of the Bachelor of Applied Science
                         (Outdoor Recreation and Ecotourism) offered by Charles Sturt University (Australia). This
                         is a specialized degree offered in face-to-face and distance (online) mode of instruction,
                         drawing participants from across Australia with a higher representation of southeast
                         Australian states (Queensland, New South Wales, Victoria). The students were required
                         to administer 15 copies through direct contact (digital or in-person) to their social circle
                         of friends and family, for which they received course credit. While the course credit was
                         awarded for attracting 15 completed questionnaires, the students had no control over whether
                         these surveys were fully completed or abandoned partway through. While each student
                         carried out a purposive sample selection, the aggregate sample across all students enrolled
                         in the subject generates a random sample of the population.
                               The survey was also administered by the author to the general public. This occurred
                         through direct contact (digital or in-person) with his personal networks and his national
                         and international professional networks, with separate response collector URLs for
                         Australian and overseas cohorts. Participants were also recruited through snowballing, i.e.,
                         inviting contacts to send out invitations through their own networks. In addition, purposive
                         sampling occurred by targeting underrepresented participant classes, such as people over 65,
                         who were sampled at events through Rotary Clubs and retirement villages, or people with
                         below-school-age children, who were sampled through placement of surveys in waiting
                         areas of childcare centers and preschools. Finally, to obtain a larger cohort of a different
                         cultural group, students of the University of Guam were also sampled using invitations
                         disseminated through the university’s centralized mail system.
                               Several Reddit users engaged with the author in offline discussions about the project
                         after the call for participation had been posted on the sub-Reddits (see below). These
                         users were sent links to the overseas participant URL and invited to distribute this to their
                         non-Reddit social networks.

                         2.3.2. Reddit
                               Reddit users were sampled to obtain cohorts of the general public, but also cohorts of
                         participants who self-identified as having an interest in the general outdoors or in specific
                         adventure recreation activities.
                               To adequately assess differences in the perception of risk in outdoor recreation
                         activities, five conceptual sampling frames were chosen on Reddit. Two frames were specific
                         to outdoor recreationists, i.e., adventure activity-specific sub-Reddits (e.g., canyoning,
                         diving) and general outdoor activity sub-Reddits (e.g., outdoors, hiking). In addition, two
                         general sampling frames were chosen to circumscribe the general population: general
                         and research-related sub-Reddits (e.g., sample size, psychology) and country-specific
                         sub-Reddits (e.g., Brazil, Pakistan). The latter was chosen as an attempt to address the
                         heavy North American-centered imbalance in Reddit responses. The fifth sampling frame
                         comprised health- and phobia-related sub-Reddits (e.g., depression, acrophobia). There is an
                         increasing realization that participation in adventure has mental health benefits. Thus, it
                         was desirable to understand the perception of risk held by that cohort.
                               The online survey form was ‘cloned,’ with a series of cohort-specific URLs feeding into
                         the same dataset as discrete collectors (Appendix C). The calls for participation were posted
                         directly in the Reddit discussion forums (Figure 2), except where sub-Reddit rules required
                         that pre-approval for surveys be sought from moderators. For the sub-Reddits
                         related to mental health, phobias and disabilities, prior moderator approval was sought
                         as a matter of principle, irrespective of stated rules. While this approval was not always
                         granted, some moderators promoted the survey on their sub-Reddits by ‘pinning’ the
                         thread to the top of the page for a set time.
                               The survey was progressively posted between 11 March and 19 April 2021. Attempts
                         were made to post early Saturday morning U.S. east coast time to ensure that the posts
                         would be read over the weekend. A reminder was sent out (as a repost) one week after
                         original posting, except in instances where the overall volume of posts was low, and the
                         original post was still within the ten most recent posts. A second reminder was sent one
                         week after the first reminder for all but the forums that were posted after 5 April. The data
                         collection was concluded on 28 April 2021.
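
The reminder schedule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `reminder_dates` is hypothetical, the cut-off of 5 April 2021 is taken from the text, and the exception for low-volume forums (where the original post was still among the ten most recent) is not modeled.

```python
from datetime import date, timedelta

# Cut-off taken from the text: forums first posted after 5 April 2021
# did not receive a second reminder.
SECOND_REMINDER_CUTOFF = date(2021, 4, 5)


def reminder_dates(posted: date) -> list[date]:
    """Return the reminder dates for a forum first posted on `posted`.

    Hypothetical helper: first reminder one week after the original post,
    second reminder one week after the first, unless the forum was posted
    after the cut-off date.
    """
    first = posted + timedelta(weeks=1)
    reminders = [first]
    if posted <= SECOND_REMINDER_CUTOFF:
        reminders.append(first + timedelta(weeks=1))
    return reminders


# Illustrative posting dates (both Saturdays within the posting window).
early_post = reminder_dates(date(2021, 3, 13))
late_post = reminder_dates(date(2021, 4, 10))
```

A forum posted on 13 March would receive reminders on 20 and 27 March, whereas a forum posted on 10 April would receive only the reminder of 17 April.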

                                   Figure 2. Example of a call for participation on a sub-Reddit (canyoneering in this instance).

                               Critical for a comprehensive design are regular reminders (Fan and Yan 2010). Dillman
                         et al. advocate two, if not three, reminders (Dillman et al. 2009). Posting reminders, as
                         reposts of the survey on Reddit, tended to incur the wrath of forum participants who
                         considered any repeat posting as spamming, even though formal reference was made
                         that this was standard survey methodology. Resistance by Reddit forum members (and
                         some moderators) emerged after the first reminder and occasionally became vocal after
                         the second reminder. If a pre-notification had been posted a few days prior to the
                         survey, that too would have attracted the ire of vocal participants, which in turn would have
                         affected the willingness to participate.
                               To increase the perception of credibility of the survey, the researcher did not merely
                         post the survey on the forum, but engaged regularly and in a timely manner, responding to any
                         discussion comments that were posted in the discussion thread, as well as, where required
                         or appropriate, offline with specific users who had commented. As noted, several Reddit
                         users engaged with the author in offline discussions about the project after the call for
                         participation had been posted on the sub-Reddits. As many users are participants in more
                         than one sub-Reddit, they were invited to distribute a generic Reddit participation URL
                         to their Reddit social networks beyond the specific sub-Reddit that generated the offline
                         discussions.

                         2.4. Data Cleaning and Statistical Analysis
                            The data used for this paper are a subset of the full data set provided by the
                         SurveyMonkey data collectors.

                         2.4.1. Data Cleaning
                              For the purposes of this paper, the full dataset was imported to MS Excel and reduced
                         from a total of 294 columns to 32 columns by retaining the survey administrative (collector,
                         timestamps) and demographic data and by replacing the answer columns with a set of
                         columns indicating whether the respondents had progressed to a given page.
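
The reduction to page-progress columns can be illustrated with a minimal sketch. It assumes, purely for illustration, that the last page reached by each respondent is already known; the function names and the example values are hypothetical and do not reproduce the actual spreadsheet steps.

```python
def progress_flags(last_page: int, n_pages: int) -> list[int]:
    """Return 1 for every page the respondent reached, 0 otherwise."""
    return [1 if page <= last_page else 0 for page in range(1, n_pages + 1)]


def reached_per_page(last_pages: list[int], n_pages: int) -> list[int]:
    """Count the respondents who progressed to each page of the survey."""
    rows = [progress_flags(last_page, n_pages) for last_page in last_pages]
    return [sum(column) for column in zip(*rows)]


# Hypothetical example: five respondents, four-page survey.
last_pages = [4, 1, 2, 4, 3]
counts = reached_per_page(last_pages, 4)
print(counts)  # [5, 4, 3, 2]
```

The per-page counts form the raw material for an attrition curve: each entry can only be equal to or smaller than the one before it.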
                              The SurveyMonkey platform provides timestamp data that give the time of the
                         submission of the first page (in this case, the agreement with the participation information)
                         and a timestamp for the last page submitted, which can be the final survey page or any
                         page in between. This makes it possible to compute the time spent on the survey,
                         which ranged from 8 s (respondent did not progress past the country demographic) to a
                         maximum of two days, 13 h and 6 min (incomplete), which is clearly an unrealistic time.
                         In total, 2.6% of the respondents took longer than 2 h to complete (or abandon) the survey,
                         suggesting that they were interrupted or chose to set the survey aside and return to it later.
                         In each case, the final timestamp represents an active submission of that page, irrespective
                         of whether any questions were answered on that page, and not a mere closing of the
                         browser window (confirmed by testing). As these extreme data points would distort the
                         findings, all times longer than 2 h were excluded from analyses that included completion
                         times.
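
The timing rule described above can be sketched as follows. The two-hour threshold is taken from the text; the function names and the example timestamps are hypothetical and only mirror the extremes mentioned (8 s and two days, 13 h and 6 min).

```python
from datetime import datetime, timedelta

# Exclusion threshold stated in the text: times longer than 2 h are dropped.
MAX_DURATION = timedelta(hours=2)


def time_on_survey(first_page: datetime, last_page: datetime) -> timedelta:
    """Time between submitting the first page and the last submitted page."""
    return last_page - first_page


def usable_durations(records: list[tuple[datetime, datetime]]) -> list[timedelta]:
    """Keep only the durations that do not exceed the two-hour threshold."""
    durations = [time_on_survey(start, end) for start, end in records]
    return [d for d in durations if d <= MAX_DURATION]


# Hypothetical (first page, last page) timestamp pairs.
records = [
    (datetime(2021, 3, 13, 9, 0, 0), datetime(2021, 3, 13, 9, 0, 8)),    # 8 s
    (datetime(2021, 3, 13, 9, 0, 0), datetime(2021, 3, 13, 9, 21, 30)),  # 21 min 30 s
    (datetime(2021, 3, 13, 9, 0, 0), datetime(2021, 3, 15, 22, 6, 0)),   # 2 d 13 h 6 min
]
kept = usable_durations(records)
```

In this example the third record is excluded, while the very short 8 s record is retained because only the upper bound is applied.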
                              Careless and/or mischievous responders are a known problem that is more prevalent
                         online than in paper-based surveys (Robinson-Cimpian 2014; Ward et al. 2017).
                         Cross-checking of responses, taking into account age and free-form responses to country of origin,
                         profession and cultural background identified some mischievous responses, which were
                         removed from the data set.

                         2.4.2. Statistical Analysis
                               The correlation between the various participation attrition curves of different cohorts
                         or survey methods was determined using the CORREL function in MS Excel. Given that
                         the PAR continually declines as the users progress from page to page, it is inevitable that
                         the curves will always show some level of positive correlation. Thus, for the purposes of
                         this paper, a very high level (***) of correlation was arbitrarily attributed to r ≥ 0.995, a high
                         level (**) to r ≥ 0.985, and a moderate level (*) to r ≥ 0.95. A paired-sample t-test was
                         used to compare the PAR between different cohorts or survey methods.
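
The classification and the comparison described above can be sketched as follows. This is an illustrative re-implementation, not the spreadsheet workflow used for the paper: the function names are hypothetical, `pearson_r` computes the Pearson correlation coefficient, and `paired_t` returns only the test statistic and the degrees of freedom (a p-value would additionally require the t distribution, for example from a statistics package).

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_level(r: float) -> str:
    """Map r to the star levels defined in the text (thresholds are arbitrary)."""
    if r >= 0.995:
        return "***"  # very high
    if r >= 0.985:
        return "**"   # high
    if r >= 0.95:
        return "*"    # moderate
    return ""


def paired_t(x: list[float], y: list[float]) -> tuple[float, int]:
    """Paired-sample t statistic and degrees of freedom for two series."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical percentages of respondents remaining at five pages, two cohorts.
cohort_a = [100.0, 80.0, 65.0, 55.0, 50.0]
cohort_b = [100.0, 75.0, 60.0, 52.0, 45.0]
level = correlation_level(pearson_r(cohort_a, cohort_b))
t_stat, dof = paired_t(cohort_a, cohort_b)
```

Because both series decline from page to page, r will be high almost by construction, which is why the star levels start only at r ≥ 0.95.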

                         2.5. Limitations
                             There are a number of limitations to the survey, both of a general and a Reddit-specific
                         nature, which are placed on record here.

                         2.5.1. Data Quality
                               Since all data were self-reported, they are subject to recall bias. While this does not
                         affect the data collected in Section (1) (demographic) and Section (2) (general attitudes
                         towards risk and social determinants of risk-taking), recall bias may affect responses to
                         the activity-specific questions (Section (3)), in particular the rating of the risk posed by,
                         and a participant’s apprehension of, activities in which they had participated in the past. The
                         granularity of options (participated in the past year, prior, never) is (by necessity) coarse,
                         which allows for recall bias to creep in among those who answered ‘prior.’ In addition,
                         all responses from the 2021 cohort may be affected by the prolonged period of enforced
                         inactivity due to COVID-19.

                         2.5.2. Participation and Response Rate
                               The literature notes low response rates for online surveys in general (Monroe and
                         Adams 2012). To boost response rates, Dillman et al., as well as other authors drawing on their
                         work, advocated the approach of a personalized and repeated contact (Cook et al. 2000;
                         Dillman et al. 2009; Fan and Yan 2010; Koitsalu et al. 2018). While this is possible with fixed,
                         well-circumscribed cohorts of known potential participants (Monroe and Adams 2012), it
                         was not possible in the general public cohorts or the Reddit cohorts. Other means to boost
                         response rates are perceptions of scarcity (i.e., those surveyed are a group of a select few)
                         (Fan and Yan 2010), pre-notification (Fan and Yan 2010; Koitsalu et al. 2018) and reminders
                         (Koitsalu et al. 2018).
                               Although there are dissenting opinions (Brown and Knowles 2019), remunerative
                         incentives are frequently commented upon favorably (Fan and Yan 2010; Monroe and
                         Adams 2012), in particular for longitudinal studies (Choga 2019; Khadjesari et al. 2011), with
                         better response rates resulting from uniform monetary incentives rather than prize draws
                         (Brown and Knowles 2019; Robb et al. 2017) and higher incentive values for longitudinal
                         studies (Khadjesari et al. 2011).
                                The main limitation to assessing participation is that the mode of survey administration
                          does not give the opportunity to adequately assess the response rate. Participation and
                          uptake on survey invitations were voluntary, and the selection of the cohorts for the general
                         population (see Section 2.3.1) was opportunistic. Thus, it can be surmised that the fact
                          of participation entails a bias of general interest in either outdoor activities (signaled via
                          the title of the survey), interest in the general issue of risk behavior, or social desirability
                          bias with participants feeling compelled to support research in general or the individual
                          disseminators of the survey.
                                Among Reddit users, the number of actual participants in a survey is subject to a range
                          of filters and represents a small fraction of the overall population registered for a specific
                          sub-Reddit (Figure 3). While the total number of registered users in each sub-Reddit is
                         publicly posted, and while the number of participants reading the same sub-Reddit as a
                          user is also visible, the number of people actively engaging with a sub-Reddit by posting
                          is not readily discernible. It can be posited that the total number of persons consuming
                          the content of a specific sub-Reddit will be greater than the number who registered for the
                          sub-Reddit. The universe of readers looking at a sub-Reddit at any given time is subject to
                          time richness of the participant population due to employment and social/family factors,
                          the time of day at the user’s location (day, night) and the geographic mix of the sub-Reddit’s
                          users, i.e., whether it is primarily a single nation (with associated time zone implications)
                          or truly global. The readers looking at a sub-Reddit need to be sufficiently interested to
                          click on the headline of the specific post and then remain engaged to read that post. Only a
                          fraction of readers will be further motivated to click on the link that takes them to the survey
                          form hosted on SurveyMonkey. For reasons of survey ethics, all relevant participation
                          information needs to be posted on the first page of an online survey. A downside of the
                          lengthy required text is that it may further discourage participation. On the other hand,
                          that step may have filtered out some users who would have commenced, but then quickly
                          abandoned the survey after a handful of questions.
                                It can be assumed that reading the initial post and subsequent participation entails a
                          bias of general interest in the outdoor activities of the targeted sub-Reddits and/or a social
                          desirability bias with participants feeling compelled to support research in general or the
                          survey in particular. The commercial version of SurveyMonkey subscribed to by Charles
                          Sturt University records the answers but does not record the number of times the survey
                          form was called up but was not progressed beyond reading the participation information
                          section.

                                   [Figure 3 labels, in order: registered users; users reading the sub-Reddit; users reading
                                   the post; users clicking on the link; users starting the survey]

                                   Figure 3. Sampling vs. participant universe of a sub-Reddit.

                               Some approximations of participation at that stage, however, can be made. The
                         literature suggests that 90% of an internet community, such as Reddit, are pure consumers
                         (readers), 9% contribute in general, and only 1% contribute and engage heavily
                         (Carron-Arthur et al. 2014; Gasparini et al. 2020; Glenski et al. 2017; Van Mierlo 2014). In December
                         2020, Reddit claimed to have 52 million daily users out of a total population of 430 million
                         (Patel 2020), suggesting that 12% of the users visit daily. As this usage cuts across the entire
                         site, the participation percentage will vary between sub-Reddits. Using the 1% rule, we can
                         estimate that 0.12% of registered users will be active participants.
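
The arithmetic behind the 12% and 0.12% figures can be retraced in a short sketch. The user counts and the 1% share are those quoted in the text; the helper `expected_active` and the sub-Reddit size of 100,000 registered users are hypothetical and serve only as an illustration.

```python
# Figures quoted in the text (Reddit, December 2020).
daily_users = 52_000_000
total_users = 430_000_000

# Share of heavy contributors under the 90-9-1 rule.
heavy_share = 0.01

# Share of the total population visiting on a given day (about 12%).
daily_share = daily_users / total_users

# Estimated share of registered users who are active participants (about 0.12%).
active_share = daily_share * heavy_share


def expected_active(registered: int) -> float:
    """Hypothetical helper: expected active participants among registered users."""
    return registered * active_share


# Illustration for a hypothetical sub-Reddit with 100,000 registered users.
estimate = expected_active(100_000)
```

For the hypothetical sub-Reddit of 100,000 registered users, this yields roughly 120 active participants, bearing in mind that the site-wide percentage will not hold equally for every sub-Reddit.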
                               System-wide data suggest that the average user will visit the site for a 10-minute
                         duration (SimilarWeb 2021), but it can be assumed that the duration is longer for specific-interest
                         sub-Reddits which have developed into their own ecosystems. Moreover, the duration of
                         active users will far exceed the average.
                                                                               million daily users out of a total population of 430 million,
                                    (PatelA glimpse     of the user-reader-participant
                                             2020) suggesting      that 12% of the usersrelationship
                                                                                              visit daily. As  of this
                                                                                                                   sub-Reddits
                                                                                                                         usage cuts used   in the
                                                                                                                                        across   thesurvey
                                                                                                                                                       entire
                                   can
                                    site,bethe
                                             gleaned    from Table
                                                 participation          1. Most will
                                                                  percentage       sub-Reddit     forumssub-Reddits.
                                                                                        vary between         allow creating       poststhe
                                                                                                                               Using      with
                                                                                                                                             1%arule,
                                                                                                                                                    single-
                                                                                                                                                          we
                                   question,
                                    can estimate fixed   choice
                                                      that   0.12%polls   with a maximum
                                                                     of registered     users willof besix   short
                                                                                                       active       answer options, running for
                                                                                                                 participants.
                                   a duration      of between
                                           System-wide        data one    andthat
                                                                    suggest     seventhedays.
                                                                                         average A user
                                                                                                     simplewillpoll
                                                                                                                  visitofthe
                                                                                                                           three-day
                                                                                                                              site for a duration
                                                                                                                                          10-minutewas   du-
                                   administered        on  various    adventure      bicycling-related      sub-Reddits
                                    ration (SimilarWeb 2021), but it can be assumed that the duration is longer for specific- asking    respondents       to
                                   choose
                                    interesta sub-Reddits
                                                primary motivation
                                                                 which havefor their  participation
                                                                                developed              in the
                                                                                              into their     ownbicycling
                                                                                                                   ecosystems.activities.   The number
                                                                                                                                     Moreover,       the du-
                                   ofration
                                       usersofreading     the sub-Reddit
                                                 active users                  was recorded
                                                                 will far exceed                at six instances between 20:00 h and 8:00 h
                                                                                      the average.
                                   GMT A   (7 glimpse
                                               a.m. andof9the  p.m.   ADST) for three days,
                                                                  user-reader-participant           which allows
                                                                                                 relationship           us to calculate
                                                                                                                   of sub-Reddits        usedaninaverage
                                                                                                                                                    the sur-
                                   percentage      of  registered    users   reading    the discussions       at any
                                    vey can be gleaned from Table 1. Most sub-Reddit forums allow creating posts with aone    time.   The   percentages
                                   range    from 0.1% to
                                    single-question,          0.7%
                                                           fixed     (Tablepolls
                                                                  choice      1). While
                                                                                   with athis  is the average
                                                                                           maximum                  at anyanswer
                                                                                                           of six short       given time,
                                                                                                                                      options,it does   not
                                                                                                                                                   running
Soc. Sci. 2022, 11, 31                                                                                                      11 of 35

                                 allow us to estimate the cumulative total over a single day or over the total three-day
                                 exposure period of the poll. Neither does it indicate the duration of participation.
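
                                 The 1% rule estimate given above can be retraced with a few lines of arithmetic. The
                                 following is a minimal sketch, not the author's calculation: the user counts are the
                                 figures quoted from Patel (2020), the 1% share is the heavy-contributor share of the
                                 90-9-1 rule, and the function and constant names are purely illustrative.

```python
# Figures quoted in the text (Patel 2020), in millions of users.
DAILY_USERS_MILLION = 52
TOTAL_USERS_MILLION = 430

# Share of a community that contributes and engages heavily under the 90-9-1 rule.
HEAVY_CONTRIBUTOR_SHARE = 0.01


def daily_visitor_pct(daily, total):
    """Percentage of the total user population that visits on a given day."""
    return 100.0 * daily / total


def active_participant_pct(daily, total, share=HEAVY_CONTRIBUTOR_SHARE):
    """Percentage of the total user population expected to participate actively."""
    return daily_visitor_pct(daily, total) * share


daily_pct = daily_visitor_pct(DAILY_USERS_MILLION, TOTAL_USERS_MILLION)
active_pct = active_participant_pct(DAILY_USERS_MILLION, TOTAL_USERS_MILLION)
print(f"daily visitors: {daily_pct:.1f}% of users; active participants: {active_pct:.2f}%")
```

                                 As the text notes, these are site-wide figures, so the share for any individual
                                 sub-Reddit may differ.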

                                 Table 1. Participation statistics of a Reddit poll.

   Sub-Reddit        Registered Users     Average % of        Total of Poll    Poll Participants     Poll Participants in % of
                     in the Sub-Reddit    Registered Users    Participants     in % of Registered    Average Number of
                                          Online                               Users                 Registered Users Online
   bmx                     38,200              0.585                64               0.168                   28.6
   cyclocross              18,000              0.266                54               0.300                  112.9
   dirtjumping               5900              0.538                36               0.610                  113.4
   fatbike                   8900              0.465                49               0.551                  118.4
   fixed gear              69,100              0.539               140               0.203                   37.6
   gravelcycling           45,000              0.628                97               0.216                   34.3
   mountainbiking          86,200              0.283                71               0.082                   29.1
   MTB                    223,000              0.707               170               0.076                   10.8
   single speed              6600              0.105               100               1.515                 1440.0
   xbiking                 35,800              0.499                88               0.246                   49.2

                                      When considering the participation in the poll, the average percentage of registered
                                 users doing so ranges from 0.08% to 1.51% (Table 1).
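
                                 The two derived columns of Table 1 follow from the first three data columns. The sketch
                                 below recomputes them for three rows copied from the table; the function name and data
                                 layout are illustrative assumptions, and because the published "average % online" values
                                 are themselves rounded, the results agree with the table only to within rounding.

```python
# Rows copied from Table 1:
# (sub-Reddit, registered users, average % of registered users online, poll participants)
ROWS = [
    ("bmx", 38_200, 0.585, 64),
    ("MTB", 223_000, 0.707, 170),
    ("single speed", 6_600, 0.105, 100),
]


def derived_columns(registered, avg_pct_online, participants):
    """Return poll participants as % of registered users and as % of average users online."""
    avg_online = registered * avg_pct_online / 100.0  # average head count online
    pct_of_registered = 100.0 * participants / registered
    pct_of_online = 100.0 * participants / avg_online
    return pct_of_registered, pct_of_online


results = {name: derived_columns(reg, pct, n) for name, reg, pct, n in ROWS}
for name, (pct_registered, pct_online) in results.items():
    print(f"{name:<14}{pct_registered:8.3f}{pct_online:10.1f}")
```

                                 A value above 100% in the last column, as for single speed, means that more people voted
                                 over the three days than were, on average, online at any one time.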
                                      A formal response rate can be calculated for the student cohort recruited through the
                                 University of Guam mail system. The total e-mail list contains 3082 addresses. In total
                                 210 responses were received, resulting in a response rate of 6.8%.

                                 3. Results
                                 3.1. Demographics
                                      In total 4198 surveys were commenced, 422 by general online (not Reddit) users and
                                 3776 by Reddit users. The two online cohorts show a gender bias with male respondents
                                 significantly overrepresented both among the non-Reddit (χ2 = 3.92, df = 1, p = 0.0476)
                                 and the Reddit population (χ2 = 1758.59, df = 1, p < 0.0001). When examining the gender
                                 differential among the major Reddit cohorts, women respondents are significantly better
                                 represented among the mental health Reddits (53.3%, n = 210; χ2 = 39.65, df = 1, p < 0.0001)
                                 than among the general population (30.0%, n = 793) and the outdoor activities related
                                 Reddits (31.0%, n = 786; χ2 = 35.87, df = 1, p < 0.0001). The representation of female
                                 respondents among adventure activities related Reddits is a sixth of that of the male
                                 respondents (14.3%, n = 1613; χ2 = 723.74, df = 1, p < 0.0001).
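
                                 The reported p-values can be checked against the reported χ2 statistics. The sketch below
                                 does only that conversion; it does not reproduce the tests themselves, since the
                                 underlying counts and expected frequencies are not given here. It relies on the fact that
                                 a χ2 variable with one degree of freedom is the square of a standard normal variable; the
                                 function name is illustrative.

```python
import math


def chi2_p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom.

    For df = 1, P(X > x) equals erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# Statistics as reported in the text (all with df = 1).
reported = {
    "non-Reddit, men vs. women": 3.92,
    "Reddit, men vs. women": 1758.59,
    "mental health vs. general": 39.65,
    "mental health vs. outdoor": 35.87,
    "adventure, men vs. women": 723.74,
}
p_values = {label: chi2_p_value_df1(stat) for label, stat in reported.items()}
for label, p in p_values.items():
    print(f"{label:<28} p = {p:.4g}")
```

                                 For the non-Reddit cohort, the rounded statistic of 3.92 gives a p-value of roughly 0.048,
                                 in line with the reported 0.0476. The other statistics all give values far below 0.0001.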
                                      The gender representation varies between five-year age cohorts, with female repre-
                                 sentation among the non-Reddit respondents rising from 31.3% among the 16–19 years
                                 old age cohort to 70% among the 65–69 years old age cohort. No such trend is observable
                                 among Reddit respondents (Table 2). When looking at the age structure of the Reddit
                                 respondent population by gender, differences emerge (Figure 4). While the age curves
                                 generally track in a similar fashion, the general respondent cohorts tend to be younger
                                 than those in the adventure cohorts and those in the general outdoor cohort. Among both
                                 genders, adventure cohort respondents show a peak in the 25–29 year age bracket, while
                                 the outdoor cohort respondents peak in the 30–34 year age bracket. Among the general
                                 Reddit population, the age structure of female respondents shows a distinct peak in the
                                 16–19 year age bracket, while among men, it is more diffuse, spanning the 16–34 age range
                                 (Figure 4).

                           Table 2. Gender and age breakdown of the Reddit and non-Reddit respondent population.

                                              Reddit                                  Non-Reddit
                           Age        Men (%)   Women (%)      n           Men (%)   Women (%)      n
                           16–19       77.3       22.7        242           68.8       31.3         16
                           20–24       74.4       25.6        687           67.3       32.7         55
                           25–29       73.4       26.6        804           64.1       35.9         64
                           30–34       76.0       24.0        663           54.2       45.8         48
                           35–39       77.5       22.5        364           40.8       59.2         49
                           40–44       81.8       18.2        242           46.4       53.6         28
                           45–49       81.5       18.5        130           52.0       48.0         25
                           50–54       78.6       21.4        117           48.4       51.6         31
                           55–59       69.8       30.2         63           46.2       53.8         26
                           60–64       64.6       35.4         48           47.4       52.6         19
                           65–69       88.2       11.8         17           30.0       70.0         10
                           70+        100.0        0.0          4           37.5       62.5          8
                           All         75.8       24.2       3381           53.6       46.4        379

                           Figure 4. Age structure of the Reddit respondent population by gender. (a) men; (b) women.
                           (The general category includes all sub-Reddits not classified as adventure, outdoor or
                           mental health.)

                                The non-Reddit user respondents came from 25 countries, primarily the U.S.A. (48.65),
                           Australia (35.1%) and Canada (4%). The Reddit respondents came from 68 different
                           countries, primarily the U.S.A. (66.6%), Canada (8%), the UK (5.8%) and Australia (4.2%).
                           On a major geographic scale, the Reddit population is dominated by participants from
                           North America (64.6%), followed by Europe (14.5%) and Australia/New Zealand (5.2%).
                           Least represented are the Middle East, Latin America and South East Asia (0.2% each), as
                           well as East Asia and Africa (0.4% each).

                           3.2. Participant Attrition
                                Participant attrition rates (PAR) were assessed by establishing how many pages of the
                           multi-page online survey a given participant completed before they abandoned the survey.
                           In the online form delivered by SurveyMonkey, Page 1 was the participant information
                           documentation and the invitation of the survey. The count started when participants
                           progressed from page 1 to page 2, thus equating the start of page 2 as 100% participation.
                           Pages 2 to 9 covered demographics and questions related to general attitudes towards risk
                           and social determinants of risk taking (coded as P2–P9 in the graphs). P10 was the first
                           page with questions related to specific activities, which also explained what was asked.
                           The following seven pages are related to specific activities. These were presented to the
                           participant in a randomized fashion to ensure that each had an equal chance of being
                           answered in those cases where participants did not complete the survey (coded as R1–R7
                           in the graphs). In survey forms using the standard page layout (in paper or PDF), the first
                           page equates to P2 to P9 and the obverse page to P10 and R1–R7.
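
                           The page-based counting described above can be illustrated with a short sketch. It is
                           an assumption-laden illustration rather than the procedure actually used: the
                           participant data are invented, R1–R7 are treated as presentation positions rather than
                           specific activity pages, and retention is expressed relative to everyone who reached
                           page 2 (the 100% baseline named in the text).

```python
# Page codes in presentation order, as used in the text: P2-P10, then R1-R7.
PAGES = [f"P{i}" for i in range(2, 11)] + [f"R{i}" for i in range(1, 8)]


def retention_by_page(last_page_reached):
    """Return {page code: % of starters who reached at least that page}.

    last_page_reached lists, for each participant who started page 2,
    the code of the last page they reached before abandoning or finishing.
    """
    if not last_page_reached:
        raise ValueError("at least one participant is required")
    positions = [PAGES.index(code) for code in last_page_reached]
    total = len(positions)
    return {
        page: 100.0 * sum(1 for pos in positions if pos >= k) / total
        for k, page in enumerate(PAGES)
    }


def attrition_by_page(last_page_reached):
    """Cumulative attrition: % of starters lost before reaching each page."""
    retained = retention_by_page(last_page_reached)
    return {page: 100.0 - pct for page, pct in retained.items()}


# Hypothetical data for ten participants (not taken from the survey).
sample = ["P2", "P5", "P10", "R3", "R7", "R7", "R7", "P9", "R1", "R7"]
retention = retention_by_page(sample)
attrition = attrition_by_page(sample)
for page in PAGES:
    print(f"{page:<4}{retention[page]:6.1f}% retained{attrition[page]:7.1f}% lost")
```

                           By construction, P2 always shows 100% retention, matching the convention that the start
                           of page 2 equals full participation.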
                         3.2.1. Effects of the Mode of Submission
                              The different modes of submission resulted in different PARs. In the case of the
                         physical survey, which followed a standard page layout (in paper or PDF), the respondents
                         tended to fully or almost fully complete the first page (P2 to P9 equivalent), but the
                         PAR dropped after the first set of questions related to specific activities (P10 equivalent)
                         (Figure 5). Thereafter, the PAR remained stable among paper surveys, whereas it continued
                         to drop, albeit gradually (final PAR 16.3%), among respondents filling out the PDF versions
                         (final PAR 12.8%). The difference between the two PAR trajectories is very significant
                         (paired t-test, p = 0.0017). By comparison, the PAR of respondents using online forms
                         dropped following the first set of demographic questions (P3), remained stable until the
                         end of the section dealing with questions related to general attitudes towards risk and social
                         determinants of risk-taking (P4–P9) but then dropped off steeply for the questions related
                         to specific activities (Figure 5). The same trajectory was observed among Reddit users,
                         except that the PAR already dropped continually among the section dealing with questions
                         related to general attitudes towards risk and social determinants of risk-taking (final PAR
                         42.1%). The decline in PAR for the questions related to specific activities was steeper than
                         that of non-Reddit participants, dropping to a final PAR value of 61.1%. While the PAR
                         decay curves (Figure 5) show a very high level of correlation (r = 0.998), the difference
                         between the two PAR trajectories is highly significant (paired t-test, p < 0.0001).
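A very high correlation and a highly significant paired difference are not contradictory: the correlation measures whether two curves have the same shape, while the paired t-test measures whether one curve sits consistently above or below the other. The sketch below illustrates this with two hypothetical trajectories. The numbers are invented for illustration and are not the survey data, and the helper names are assumptions. Only the t statistic and its degrees of freedom are computed here. Obtaining a p-value requires the t distribution, for example via `scipy.stats.ttest_rel`.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for per-page differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


# Hypothetical participation percentages per page (illustration only).
cohort_a = [100.0, 95.0, 94.0, 93.0, 80.0, 75.0, 72.0]
cohort_b = [100.0, 90.0, 86.0, 82.0, 62.0, 55.0, 50.0]

r = pearson_r(cohort_a, cohort_b)
t_stat, dof = paired_t(cohort_a, cohort_b)
print(f"r = {r:.3f}, t = {t_stat:.2f}, df = {dof}")
```

In this example both curves decline at the same pages, so the correlation is close to 1, yet the second curve falls further at every page, which produces a large paired t statistic.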

                         Figure 5. Differences in participant attrition between various modes of submission.

                              The gender differences in participant attrition for paper and PDF versions are shown
                         in Figure 6, with the greater final PAR by men filling out paper versions (92.6%) and the
                         least loss among men filling out PDF versions (98.4%).
                                The remainder of the discussion of results focuses solely on participant attrition rates
                           observed using online surveys hosted on Survey Monkey.

                         Figure 6. Differences in participant attrition by mode of submission and gender. (a) Paper and PDF
                         submission; (b) Reddit and non-Reddit online cohorts.

                         3.2.2. Effects of Gender on Participant Attrition among Reddit and Non-Reddit Cohorts
                              Gender differentiation in PAR can be observed both for Reddit and non-Reddit online
                         cohorts (Figure 6b). While the PAR trajectories among men and women show a high level
                         of correlation in the non-Reddit online cohort (r = 0.987) and a very high level of
                         correlation in the Reddit cohort (r = 0.999), women have a very significantly lower PAR
                         than men in both the non-Reddit and the Reddit cohorts (both at p
                                     Figure 7. Differences in participant attrition among online survey respondents between major
                                     sub-Reddit cohort groups. The curve for non-Reddit online surveys is shown for comparison.
                                     (a) Men; (b) Women.
                                     3.2.3. Effects of Age on Participant Attrition among Reddit and Non-Reddit Cohorts
                                          To test whether a participant’s age has an influence on attrition rates, male and female
                                     respondents were grouped into ten-year age cohorts (Figure 8). Among both genders,
                                     increasing age had a positive effect on survey completion rates. For men of the general
                                     online cohorts (Figure 8a), the PAR of the age group 55+ was significantly less than that of all
                                     other age cohorts (paired t-test; range: p = 0.0001 for 35–44 years to p = 0.0096 for 45–54 years).
                                     For men of the Reddit cohorts (Figure 8b), the PAR of the age group 55+ was also very
                                     significantly less than that of all other age cohorts, with p < 0.0001 for all except the 45–54 year
                                     cohort (p = 0.01). Among women, the same trend can be observed among the Reddit cohorts
                                     (Figure 8d), where the PAR of the age group 55+ was also very significantly less than that of all
                                     other age cohorts, with p < 0.0001 for all except the 45–54 year cohort (p = 0.0149), but not for the
                                     general online cohort (Figure 8c). Here the PAR for the age group 55+ was significantly less
                                     than for all other age cohorts (range: p = 0.0002 for 45–54 years to p = 0.0161 for 35–44 years), with
                                     the exception of the 25–34 cohort (p = 0.8723).
                                            Usingthe
                                           Using      the55+
                                                          55+ cohort,
                                                                cohort, which
                                                                           which shows
                                                                                     showsthe thesmallest
                                                                                                    smallestPAR  PARas  asthetheyardstick,
                                                                                                                                   yardstick,the  thePARPARdecaydecay
                                      curves    for  the  age  groups     of  male    (Figure    8b)  and    female    (Figure
                                    curves for the age groups of male (Figure 8b) and female (Figure 8d) Reddit respondents        8d)  Reddit      respondents
                                      eachshow
                                    each     showa averyveryhigh
                                                               highlevel
                                                                       levelofofcorrelation
                                                                                  correlation(r (r = =0.987–0.992
                                                                                                        0.987–0.992for   formenmen    and
                                                                                                                                    and   r =r 0.9787–0.9889
                                                                                                                                               = 0.9787–0.9889      for
                                      for women),       while   the   correlation     for  the  male    and    female
                                    women), while the correlation for the male and female general online cohorts         general     online    cohorts
                                                                                                                                                     (Figure(Figure
                                                                                                                                                                 8a,d)
                                    is8a,d)  is not significant.
                                       not significant.      Looking  Looking
                                                                          at the at   the trajectories
                                                                                   trajectories    of PAR   ofamong
                                                                                                                PAR amongmen, men,the age thegroups
                                                                                                                                               age groups18–2418–  and
                                      24 and    25–34   show    the   greatest   and    most   rapid   decline     PAR    once
                                    25–34 show the greatest and most rapid decline PAR once the activity set questions were       the  activity    set  questions
                                      were asked.
                                    asked.    The PAR  Thecurves
                                                             PAR curves
                                                                      for thefor   the Reddit
                                     Reddit cohorts all follow the same trajectory, with the PAR decreasing with each increase
                                     in age cohort (Figure 8b). The PAR curves show a high to a very high level of correlation
                                     depending on the combination of adjacent age cohorts tested, ranging from r = 0.995
                                     (35–44 vs. 45–54) to r = 0.999 (18–24 vs. 25–34).
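
The comparison of adjacent age cohorts can be illustrated with a minimal sketch. It assumes that the reported r is Pearson's correlation coefficient computed across the page-by-page PAR values of two cohorts, and that the PAR is expressed as the percentage of participants remaining after each page. The curve values, the dictionary `par_curves`, and the helper names below are hypothetical and serve only to show the calculation; they are not the survey data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either curve is constant.
    return cov / sqrt(var_x * var_y)


# Hypothetical PAR curves (percent remaining per survey page), NOT survey data.
par_curves = {
    "18-24": [100.0, 92.0, 85.0, 80.0, 70.0, 62.0],
    "25-34": [100.0, 94.0, 88.0, 84.0, 76.0, 69.0],
    "35-44": [100.0, 95.0, 91.0, 88.0, 82.0, 77.0],
}


def adjacent_correlations(curves):
    """Correlate each age cohort with the next older one, in dictionary order."""
    labels = list(curves)
    return {
        f"{a} vs. {b}": pearson_r(curves[a], curves[b])
        for a, b in zip(labels, labels[1:])
    }


if __name__ == "__main__":
    for pair, r in adjacent_correlations(par_curves).items():
        print(f"{pair}: r = {r:.3f}")
```

Because all attrition curves fall monotonically, r between any two of them tends to be high; the coefficient mainly distinguishes cohorts whose curves change shape at different pages.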
                                           The PAR curves for women Reddit cohorts follow similar trajectories compared to
                                    those of men but exhibit a more pronounced decrease once activity set questions were asked
                                    (Figure 8d). Differing from the men, however, the PAR curves for women respondents do
                                    not show a decrease in PAR with each increase in age cohort as the 35–44 year cohort shows
                                    a lower PAR than the 25–34 year cohort (Figure 8d). The PAR curves show a moderate to
                                    a high level of correlation depending on the combination of adjacent age cohorts tested,
                                    ranging from r = 0.978 (45–54 vs. 55+) to r = 0.992 (25–34 vs. 35–44). Among the general
                                      online cohorts of women respondents, the PAR curves are much more diverse without a
                                      clear pattern (Figure 8c). While the 55+ cohort shows the smallest increase in PAR (final
                                      value 78.6%), the 18–24 year cohort shows the steepest and greatest increase in PAR (final
                                      value 41.1%). Compared to the Reddit cohorts, which showed a gradual increase in PAR
                                      even among the attitude questions (P4–P9), the women respondents of the general online
                                      cohorts exhibited a high level of perseverance, with the younger age groups (18–24 and
                                      25–34) maintaining 100% until P8 and P9 respectively. Two other cohorts (35–44 and 55+)
                                      maintained a PAR of over 95% until P9. From P10 onwards, the PAR increased rapidly in
                                      these four age groups. Only the 45–54 year cohort showed a gradual, almost linear increase
                                      in PAR (Figure 8c). The correlations of the curves are not significant or only moderately
                                      significant.

                                Figure 8. Differences in participant attrition among male respondents by age group for general and
                                Reddit cohorts. (a) Men—General online cohorts, (b) Men—Reddit cohorts, (c) Women—General
                                online cohorts, (d) Women—Reddit cohorts.