Examining Factors Associated with Twitter Account Suspension Following the 2020 U.S. Presidential Election
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Examining Factors Associated with Twitter Account Suspension Following the 2020 U.S. Presidential Election Farhan Asif Chowdhury∗ , Dheeman Saha∗ , Md Rashidul Hasan∗ , Koustuv Saha† , Abdullah Mueen∗ ∗ University of New Mexico,† Georgia Tech {fasifchowdhury, dsaha, mdhasan, mueen}@unm.edu, koustuv.saha@gatech.edu Abstract—Online social media enables mass-level, transparent, In the context of inadequate countermeasures against ma- and democratized discussion on numerous socio-political issues. nipulation during the 2016 U.S. presidential election, the Due to such openness, these platforms often endure manipulation 2020 election was of paramount importance for platform and misinformation - leading to negative impacts. To prevent such harmful activities, platform moderators employ countermeasures operators to provide a safe and democratized public discussion to safeguard against actors violating their rules. However, the sphere [11]. The impact and importance of these safeguard correlation between publicly outlined policies and employed measures are not confined to this particular election; instead, action is less clear to general people. they bear cardinal implications for future online political In this work, we examine violations and subsequent mod- discussions exceeding all geopolitical boundaries. Therefore, it erations related to the 2020 U.S. President Election discussion on Twitter. We focus on quantifying plausible reasons for the is requisite to assess these platforms’ moderation policy — to suspension, drawing on Twitter’s rules and policies by identifying investigate the correlation between their policies and actions, suspended users (Case) and comparing their activities and examine for potential political biases, and make general people properties with (yet) non-suspended (Control ) users. Using a aware of the targeted malice topics. dataset of 240M election-related tweets made by 21M unique In this respect, we focus on analyzing the moderation policy users, we observe that Suspended users violate Twitter’s rules at a higher rate (statistically significant) than Control users of popular micro-blogging site Twitter as a case-study by across all the considered aspects - hate speech, offensiveness, asking the following research questions: spamming, and civic integrity. Moreover, through the lens of Twitter’s suspension mechanism, we qualitatively examine the • RQ1: What factors associate with Twitter account sus- targeted topics for manipulation. pension following the 2020 U.S. President Election? • RQ2: How do political ideologies associate with the I. I NTRODUCTION suspended accounts? Social media platforms such as Facebook, Twitter, and • RQ3: What was the topic of discussion among suspended Reddit have become vastly popular in public discussion related users? What type of content these users shared? to societal, economic, and political issues [1]. However, in This work. To answer these research questions, we collect the recent past, these platforms were heavily targeted for a large-scale dataset of 240M tweets made by 21M unique manipulation and spreading misinformation relating to nu- users over eight weeks centering the 2020 U.S. President merous civic issues across the world [2], which is often Election. Afterward, we identify 355K suspended users who referred to as “Computational Propaganda” [3]. In particular, participated in this election discussion. We draw upon Twitter’s the coordinated misinformation and influence campaigns of rules and policies to examine plausible suspension factors. To foreign state-sponsored actors during the 2016 U.S. presiden- investigate the user activity that might lead to suspension, we tial election were highly scrutinized - which eventually led to adopt the “case-control” study design from epidemiology [12]. the U.S. Congressional hearings and investigation by the U.S. We consider the suspended users as Case group and sample Department of Justice [4], [5]. similar number of non-suspended users as Control group. We In the aftermath of the 2016 U.S. presidential election, devise several classification techniques to quantify suspension social platforms announced strict and improved platform mod- factors among these two groups. We infer these users’ political eration policy [6], [7]. However, these platform’s moderation leaning by utilizing the political bias of the news media and suspension policies have been largely debated and have outlet they share. By employing a language differentiation faced severe criticism from the political leaders and supporters technique, we contrast the conversational topics and shared about bias towards their opposition [8], [9]. Although these content among Case and Control groups. Through the lens platforms publicly outline their moderation policy, there is no of Twitter’s suspension policy, we passively infer the targeted third-party monitoring of their enacted moderation. Moreover, topics by platform violators and identify the online content social platforms like Twitter and Facebook employ extensive platforms utilized to sway the discussion. safeguard mechanisms that consider various aspects of user ac- Summary findings. We find that across all suspension factors, tivities (coordinated activities, impersonation, etc.) to identify the Suspended users have higher (statistically significant) malicious behavior [10]. Therefore, analyzing these suspended violation occurrences. Coherent with prior work, we find that users’ tweets and shared content might shed light on violators’ Suspended users are short-lived, have fewer followers, and targeted topics. show more tweeting activity. We observe a higher presence of
right-leaning users than left-leaning users among Suspended Control Suspended users. We find that Suspended users use more curse and 103 Statistic Value Number of Tweets derogatory words and personally-attacking and propaganda 102 # Tweets 240M related hashtags. We also notice that these users share news # Unique users 21M content from heavily biased right-leaning news-propaganda # Retweets 173M sites. We discuss the implications and limitations of our work 101 # Quote tweets 60M in the Conclusion. # µ tweets per user 11 100 # Suspended users 355K 101 103 105 # Tweets by sus. users 7.2M Number of Users II. R ELATED W ORK Figure 1: Distribution of # Table I: Descriptive statistics of There have been several works related to Twitter suspension, of users by # of tweets. Twitter Dataset. most of which focused primarily on spam-related suspen- sions [13], [14]. More recently, le2019postmortem studied suspended users in the context of the 2016 U.S. presidential election [15], and chowdhury2020twitter examined a large we do not redistribute the data, and (4) we only share ag- group of suspended users related to a large-scale Twitter gregated statistics and derived information to facilitate future purge in 2018 [16]. We refer readers to [15], [16] for a work. more comprehensive understanding of suspension and mod- eration on online platforms. However, none of these works IV. M ETHODS quantify specific factors associated with Twitter suspension. A. RQ1: Inferring Suspension Factors Additionally, political discussions and related manipulation on online platforms have been thoroughly studied previously [17], Twitter rules and policies. To infer the plausible factors [18], mostly related to the 2016 U.S.presidential election [19], that explain suspension, we draw upon Twitter’s rules, and [20]. These works primarily focus on characterizing malicious policies for free and safe public discussion [26]. Twitter users and inferring their motivation and impact. im2020still, outlines three specific categories — (1) safety, (2) privacy, badawy2018analyzing provide an extensive overview of this (3) authenticity, each of which entails finer sub-categories line of work [18], [19]. In contrast, we focus on quantifying on specific violating activities. We specifically focus on five suspension factors and examining malice topics related to the sub-categories that are more likely to be enacted upon on 2020 U.S. presidential election. election discussion: three from safety — (1) hateful conduct, (2) abuse/harassment, (3) terrorism/violent extremism; two III. DATA from authenticity — (4) spamming, (5) civic integrity. The rest are either not largely relevant (i.e., copyright, nudity) or To collect tweets related to the 2020 U.S. President Elec- inferred from our data (i.e., impersonation - as we do not tion discussion, we deployed an uninterrupted data collection have user and tweet information for all the tweet, sensitive framework utilizing Twitter’s streaming API to filter real-time media content - as we do not crawl media content). tweets based on given keywords. Similar to prior work that col- lected specific theme or event-related tweets [21], we initialize Hateful Conduct and Offensive Behavior. Several recent the keyword set with manually curated election-related words works aim to identify hateful and abusive activities in online and hashtags. To cover the continuously evolving election dis- platforms, which produced publicly available datasets and cussions and topics, we update the keyword set daily with new trained models [27], [28]. However, as the distinction between trending hashtags and words from previous days collection, as abusive and violent language is ill-defined, they unified these employed in [22], [23]. We provide a list of all the keywords in categories. Similarly, we combine both abusive and violent the supporting website [24]. Our data collection time-period tweets into one category - offensive. We utilize an automatic spans over around ten weeks, centering the election date - hate speech and offensive language detection technique to from “September 28, 2020” to “December 04, 2020”. Approx- detect hateful and offensive tweet, known as HateSonar [28]. imately one month after our data collection ends, on “January HateSonar is a logistic regression classifier that uses several 01, 2021” we start probing Twitter for each of the participating text features (i.e., TF-IDF of word-grams, sentiment, etc.), users from our dataset to identify the suspended users. Twitter which has been trained on manually labeled tweet corpus. We returns the response code 63 when requested user information use a pre-trained HateSonar model to classify each tweet into for a suspended user. In this process, we identify 355,573 three categories: (1) hateful, (2) offensive, and (3) normal. suspended users from roughly 21M participating users. We provide a summary descriptive statistics of our dataset in Table Civic integrity. Twitter has established strict rules to prevent 1 and plot per user tweet count in Figure 1. We discuss the users from “manipulating or interfering in elections or other limitation of our data collection framework in Section 6. civic processes”, including “posting and sharing misleading Ethical Concerns. Throughout our data collection, experiment content” [29]. To infer such violation, we utilize the posted design, and analysis process, we maintain ethical research hashtags and shared news website URLs. We curate a list of standards [25]. Hence, the accommodating academic institu- hashtags related to misinformation, propaganda, and conspir- tion’s Institutional Review Board exempted this project after a acy theories, borrowing from the work by [30], where they formal review. Following Twitter’s terms of services guideline: curated a list of conspiracy related hashtags. Additionally, (1) we use the Twitter API key only for passive data collection we compile a list of biased and propaganda-spreader news purposes, (2) we do not publish user-specific information, (3) websites based on a publicly available dataset from [31], [32].
Measure Suspended Control d t KS If a tweet contains a hashtag or news article from our curated Suspension rule (Safety) list, we consider it a violation of civic integrity policy. Hateful (%) 0.78 0.43 0.05 77.73*** 0.003*** Offensive (%) 6.45 5.21 0.05 91.62*** 0.01*** Spam. Several previous works have identified spamming on Suspension rule (Authenticity) Twitter, most of which consider both tweet content and user Civic Integrity (%) 0.40 0.30 0.02 31.46*** 0.001*** Spam (%) 0.56 0.38 0.03 46.83*** 0.002*** attributes for user-level classification. Here, we primarily infer Account Properties spamming violations at tweet-level based on tweet content, for Active days 964.0 1972.9 -0.74 -151.77*** 0.24*** Tweets per Day 33.5 20.82 0.25 50.20*** 0.14*** which we utilize a collection of spam keywords [33]. However, Followers Count 1491.5 3337.2 -0.04 -8.54*** 0.11*** to quantify spammers at the user-level, we also examine Friends Count 1112.5 1263.7 -0.03 -6.74*** 0.11*** several account attributes (i.e., account age, tweet rate, etc.), Political Ideology (% Tweets) Left Leaning 3.97 5.65 -0.08 -137.9*** 0.02*** which are most prominent for spammer detection [13], [34]. Right Leaning 7.25 6.00 0.05 87.21*** 0.01*** We note that the above-defined classification techniques are no match to Twitter’s actual countermeasure mechanism. Table II: Summary of differences in quantitative measures Rather, we posit these methods as high-precision approaches across Suspended and Control users. We report average — which utilize language models and keyword matching to occurrences across matched clusters, effect size (Cohen’s d), avoid false-positives. The detected violations can be regarded independent sample t-statistic, and KS-statistic. p-values are as the lower-bound for actual ensued violations, only to reported after Bonferroni correction (* p
Category Words Shared Content. In Table V, we show the top 10 distinct transistion, health care, amy coney barrett, shared domain names per user group. Among the Control flynn, senate, sidney powell, absentee, gra- users, we observe the presence of few moderately neutral Control ham, judge, legal, rigged, mail, ballot, rudy giuliani, fraudulent news outlets (i.e., nytimes.com, npr.org), independent political monitoring organization (i.e., democracydocekt.com, traitor, dumb, communist, biden family, id- citizenforethics.org), and few left-leaning news outlets Suspended iot, seanhannity, fu*k, liar, stupid, breitbart- news, ukraine, treason, terrorist, evil, leftist (theatlantic.com, motherjones.com). However, among Suspended users, we notice several heavily right-leaning non- Table III: Highly used fifteen distinctive words per user-group mainstream news-propaganda sites (i.e., usfuturenews.com, obtained using SAGE. ovalofficeradio.com, trunewshub.com, thefreeliberty.com). Category Hashtags VI. D ISCUSSION AND C ONCLUSION pentagon, bigtech, corruptkelly, quidpro- Implications. Our study bears an implication in shedding Control quo, doj, justicematters, climatechange, light on the transparency about Twitter’s content moderation flipthesenate, michiganhearing, cnntapes policy. Although we cannot ascertain any quantitative esti- stevebannon, bidenfamilycorruption, mation towards how far or through what means Twitter’s warroompandemic, russia, hunter- rules followed, our study makes insightful findings of the bidenemails, hunterbidenlaptop, statistically significant occurrences of hateful, offensive, and Suspended democratsaredestroyingamerica, misinformative content among the users whose accounts were bidencrimesyndicate, chinajoe, suspended after a while. These findings support theoretical, chinabitchbiden empirical, and anecdotal evidence about Twitter’s moderation Table IV: Highly used ten distinctive hashtags per user-group policies [26], which had only gained significant attention since obtained using SAGE. January 2021 when Twitter suspended the U.S. President Donald Trump’s Twitter account owing to inciteful and unrest- provocative content [43]. C. RQ3: Conversational topics and shared content Limitations and Future Work. Our Twitter data collection has potential biases as we initialize our seed keywords manu- Word Usage. In Table III, we present the top 15 most ally. While investigating plausible suspension reasons, we use distinctive words used in per group’s tweet as obtained from simple, interpretable, and high-precision approaches - which the SAGE technique. We observe that Control users distinctly are no match to Twitter’s complex and multi-faceted safeguard used words relating to different event-driven election-related mechanisms. We do not infer the exact reason for suspension topics (i.e., mail, ballot, rigged, fraudulent). However, we for individual users; rather, we quantify violations at tweet observe a large presence of swear words (i.e., idiot, dumb) and level. Future research can use causal inference methods like sensitive words (traitor, treason, terrorist) among Suspended matching [44] to minimize confounds and draw causal claims users’ unique words, which supports the higher hate-speech about why certain accounts were suspended. Moreover, we detection among suspended users. utilize several publicly available datasets that might suffer Hashtag Usage. Table IV presents the top 10 most distinc- from biases. tively used hashtags by Suspended and Control users. Similar We argue to situate our work as an initial step towards to word usage, the unique hashtags among Control users understanding malice, misinformation, and subsequent moder- were related to specific events (i.e., cnntapes, corruptkelly, ation related to the 2020 U.S. presidential election on online etc) and general election issues (i.e., bigtech, climatechange). platforms. Our presented insights and the derived information In contrast, distinct hashtags from Suspended users were can instigate further in-depth examination. For example, the mostly related to the defamation of Democratic presidential shared news articles’ content can be analyzed to understand candidate Joe Biden (i.e., bidencrimesyndicate, chinajoe, chin- the nature of propaganda news. To facilitate such research, we abitchbiden) and issues related to his son Hunter Biden (i.e., make these news URLs publicly available and other summary hunterbidenemails, hunterbidenlaptops). statistics [24]. Similarly, future works can investigate the dynamics of propaganda hashtags and news articles unique to suspended users to understand their impact and influence. Category Domain Additionally, interactions among suspended users can be ex- democracydocket.com, buildbackbet- plored to identify potential coordination. ter.gov, www.infobae.com, latimes.com, Conclusion. In this work, we perform a computational study Control texastribune.com, citizensforethics.org, to analyze Twitter’s suspension policy situated in the context theatlantic.com, motherjones.com, of the 2020 U.S. presidential election. We facilitate our work nytimes.com, npr.org by collecting large-scale tweet dataset during the election usfuturenews.com, trumpsports.org, period and subsequently identifying the suspended users. By techfinguy.com, mostafadahroug.com, designing a Case-Control experimental study and devising Suspended ovalofficeradio.com, wuestdevelopment.de, high-precision classification approaches, we quantify associ- queenofsixteens.com, truenewshub.com, einnews.com, thefreeliberty.com ated factors related to the suspension. Additionally, we explore the political ideology and targeted topics of suspended users. Table V: Highly shared ten distinctive URL-domain per user- We aim to motivate more rigorous and in-depth future works group obtained using SAGE. through our presented insights and shared datasets.
R EFERENCES [28] T. Davidson, D. Warmsley, M. Macy, and I. Weber, “Automated hate speech detection and the problem of offensive language,” in Proceedings of the International AAAI Conference on Web and Social Media, vol. 11, [1] H. Gil de Zúñiga, N. Jung, and S. Valenzuela, “Social media use for no. 1, 2017. news and individuals’ social capital, civic engagement and political [29] Twitter-Integrity, “https://help.twitter.com/en/rules-and-policies/ participation,” Journal of computer-mediated communication, vol. 17, election-integrity-policy,” 2021. no. 3, pp. 319–336, 2012. [30] E. Ferrara, H. Chang, E. Chen, G. Muric, and J. Patel, “Characterizing [2] A. Bessi and E. Ferrara, “Social bots distort the 2016 us presidential social media manipulation in the 2020 us presidential election,” First election online discussion,” First Monday, vol. 21, no. 11-7, 2016. Monday, 2020. [3] S. C. Woolley and P. N. Howard, Computational propaganda: political [31] FactCheck, “https://www.factcheck.org/2017/07/ parties, politicians, and political manipulation on social media. Oxford websites-post-fake-satirical-stories/,” 2021. University Press, 2018. [32] Politifact, “https://www.politifact.com/article/2017/apr/20/ [4] Congress-Hearing, “https://www.govinfo.gov/content/pkg/ politifacts-guide-fake-news-websites-and-what-they/,” 2021. CHRG-115shrg27398/pdf/CHRG-115shrg27398.pdf,” 2017. [33] F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida, “Detecting [5] Mueller-Report, “https://www.justice.gov/storage/report.pdf,” 2019. spammers on twitter,” 2010. [6] Facebook-Update, “https://about.fb.com/news/2017/09/ [34] K.-C. Yang, O. Varol, P.-M. Hui, and F. Menczer, “Scalable and information-operations-update/,” 2017. generalizable social bot detection through data selection,” in Proceedings [7] Twitter-Update, “https://blog.twitter.com/en_us/topics/company/2018/ of the AAAI Conference on Artificial Intelligence, vol. 34, no. 01, 2020, 2016-election-update.html,” 2018. pp. 1096–1103. [8] Bias, “https://www.usatoday.com/story/tech/news/2016/11/18/ [35] A. Badawy, A. Addawood, K. Lerman, and E. Ferrara, “Characterizing conservatives-accuse-twitter-of-liberal-bias/94037802/,” 2016. the 2016 russian ira influence campaign,” Social Network Analysis and [9] ——, “https://www.wired.co.uk/article/twitter-political-account-ban-us- Mining, vol. 9, no. 1, p. 31, 2019. mid-term-elections,” 2020. [36] AllSides, “https://www.allsides.com/media-bias/media-bias-ratings,” [10] Twitter-Safety, “https://blog.twitter.com/en_us/topics/company/2018/ 2021. how-twitter-is-fighting-spam-and-malicious-automation.html,” 2021. [37] MediaBias, “https://mediabiasfactcheck.com/,” 2021. [38] Twitter-Measures, “https://blog.twitter.com/en_us/topics/company/2018/ [11] Twitter-Policy, “https://blog.twitter.com/en_us/topics/company/2020/ how-twitter-is-fighting-spam-and-malicious-automation.html,” 2018. 2020-election-changes.html,” 2021. [39] A. Bruns and J. E. Burgess, “The use of twitter hashtags in the formation [12] K. F. Schulz and D. A. Grimes, “Case-control studies: research in of ad hoc publics,” in Proceedings of the 6th European consortium for reverse,” The Lancet, vol. 359, no. 9304, pp. 431–434, 2002. political research (ECPR) general conference 2011, 2011. [13] K. Thomas, C. Grier, D. Song, and V. Paxson, “Suspended accounts in [40] A. Arif, L. G. Stewart, and K. Starbird, “Acting the part: Examining retrospect: an analysis of twitter spam,” in Proceedings of the 2011 ACM information operations within# blacklivesmatter discourse,” Proceedings SIGCOMM conference on Internet measurement conference, 2011, pp. of the ACM on Human-Computer Interaction, vol. 2, no. CSCW, pp. 1– 243–258. 27, 2018. [14] A. A. Amleshwaram, A. N. Reddy, S. Yadav, G. Gu, and C. Yang, “Cats: [41] J. Eisenstein, A. Ahmed, and E. P. Xing, “Sparse additive generative Characterizing automation of twitter spammers.” 2021. models of text,” 2011. [15] H. Le, G. Boynton, Z. Shafiq, and P. Srinivasan, “A postmortem of [42] K. Saha, Y. Liu, N. Vincent, F. A. Chowdhury, L. Neves, N. Shah, and suspended twitter accounts in the 2016 us presidential election,” in 2019 M. W. Bos, “Advertiming matters: Examining user ad consumption for IEEE/ACM International Conference on Advances in Social Networks effective ad allocations on social media,” in Proc. CHI, 2021. Analysis and Mining (ASONAM). IEEE, 2019, pp. 258–265. [43] Trump-Ban, “https://blog.twitter.com/en_us/topics/company/2020/ [16] F. A. Chowdhury, L. Allen, M. Yousuf, and A. Mueen, “On twitter purge: suspension.html,” 2020. A retrospective analysis of suspended users,” in Companion Proceedings [44] K. Saha and A. Sharma, “Causal factors of effective psychosocial of the Web Conference 2020, 2020, pp. 371–378. outcomes in online mental health communities,” in Proceedings of the [17] E. Ferrara, “Disinformation and social bot operations in the run up to International AAAI Conference on Web and Social Media, vol. 14, 2020, the 2017 french presidential election,” arXiv preprint arXiv:1707.00086, pp. 590–601. 2017. [18] J. Im, E. Chandrasekharan, J. Sargent, P. Lighthammer, T. Denby, A. Bhargava, L. Hemphill, D. Jurgens, and E. Gilbert, “Still out there: Modeling and identifying russian troll accounts on twitter,” in 12th ACM Conference on Web Science, 2020, pp. 1–10. [19] A. Badawy, E. Ferrara, and K. Lerman, “Analyzing the digital traces of political manipulation: The 2016 russian interference twitter campaign,” in 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). IEEE, 2018, pp. 258–265. [20] S. Zannettou, T. Caulfield, E. De Cristofaro, M. Sirivianos, G. Stringh- ini, and J. Blackburn, “Disinformation warfare: Understanding state- sponsored trolls on twitter and their influence on the web,” in Companion proceedings of the 2019 world wide web conference, 2019, pp. 218–226. [21] A. Olteanu, C. Castillo, F. Diaz, and S. Vieweg, “Crisislex: A lexicon for collecting and filtering microblogged communications in crises,” in Proceedings of the International AAAI Conference on Web and Social Media, vol. 8, no. 1, 2014. [22] N. Abu-El-Rub and A. Mueen, “Botcamp: Bot-driven interactions in social campaigns,” in The World Wide Web Conference, 2019, pp. 2529– 2535. [23] A. Olteanu, C. Castillo, N. Diakopoulos, and K. Aberer, “Comparing events coverage in online news and social media: The case of climate change,” in Proceedings of the International AAAI Conference on Web and Social Media, vol. 9, no. 1, 2015. [24] Website, “https://sites.google.com/view/us-election20-twitter-suspend,” 2021. [25] C. M. Rivers and B. L. Lewis, “Ethical research standards in a world of big data,” F1000Research, vol. 3, 2014. [26] TwitterRules, “https://help.twitter.com/en/rules-and-policies/ twitter-rules,” 2021. [27] A. Founta, C. Djouvas, D. Chatzakou, I. Leontiadis, J. Blackburn, G. Stringhini, A. Vakali, M. Sirivianos, and N. Kourtellis, “Large scale crowdsourcing and characterization of twitter abusive behavior,” in Proceedings of the International AAAI Conference on Web and Social Media, vol. 12, no. 1, 2018.
You can also read