Universal Fake News Collection System using Debunking Tweets
Taichi Murayama, Shoko Wakamiya, Eiji Aramaki
Nara Institute of Science and Technology
{murayama, wakamiya, aramaki}@is.naist.jp
arXiv:2007.14083v1 [cs.CY] 28 Jul 2020

Abstract

Large numbers of people use social networking services (SNS) for easy access to news, but this also gives them more opportunities to obtain and share “fake news” carrying false information. Partly to combat fake news, several fact-checking sites such as Snopes and PolitiFact have been founded. Nevertheless, these sites rely on time-consuming and labor-intensive tasks, and they cover only a limited number of languages. To address these difficulties, we propose a new fake news collection system based on rule-based (unsupervised) frameworks that can be extended easily to various languages. The system collects news with a high probability of being fake by using debunking tweets posted by users, and it presents the event clusters that gather the most attention. Our system currently functions in two languages: English and Japanese. It shows event clusters, 65% of which are actually fake. In future studies, it will be applied to other languages and will be published with a large fake news dataset.

Introduction

Social networking services (SNS) such as Facebook and Twitter are used widely throughout the world because people can easily and immediately obtain various news and information free of charge. According to Pew Research Center, 62% of adults in the United States had received news from SNS in 2017 (Shearer and Gottfried 2017). People continue to benefit from convenient access to excellent sources through SNS, but they are increasingly vulnerable to obtaining and sharing news that has not been fact-checked carefully and that includes false or uncertain information, so-called “fake news.” Fake news is “a news article or message published and propagated through media, carrying false information regardless the means and motives behind it (Sharma et al. 2019).” Some organizations and individuals who spread fake news for financial and political gain cause harm to society. For example, during the US 2016 presidential election, various tweets related to fake news were shared more than 37 million times on SNS and had no small effect on the election result (Budak 2019; Bovet and Makse 2019). Fake news does not affect only elections: it appears in relation to various events (Mendoza, Poblete, and Castillo 2010; Takayasu et al. 2015; Starbird 2017).

Recently, domain experts have established various fact-checking websites such as Snopes.com, PolitiFact.com, and Factcheck.org. Online tools for tracking fake news on SNS have also been developed for various studies and datasets (Shao et al. 2016; Shu, Mahudeswaran, and Liu 2019). Existing online tracking tools collect true and fake news that has been manually annotated or reported by such fact-checking websites.

Although these tracking tools play a crucially important role in the gathering of fake news, they present two major difficulties. First, the fact-checking websites that feed these tools are burdened by time-consuming and labor-intensive tasks. Because fake information spreads rapidly and widely on SNS, its spread must be detected at an earlier stage. Second, only some regions (mainly the US and Europe) have reliable fact-checking websites that provide information about fake news that the tools can track. Therefore, it is difficult to apply existing tracking tools in most countries, including Japan, where no fact-checking organization exists and where languages other than English are used, even though many instances of fake news have been detected on SNS in these countries.

To solve these difficulties, we present a new tracking system. It requires neither human annotation nor fact-checking websites to identify spreading fake news quickly in any country. Our system is based on the assumption that SNS user comments such as “This is a fake news.” constitute a cost-free and real-time clue for catching fake news. The major features of the proposed system are the following.

• The system collects news with a high probability of being fake by using debunking tweets posted by Twitter users, not fake news annotations by domain experts and fact-checking websites.
• Our current system works in two languages: English and Japanese. The system uses a rule-based (unsupervised) method, so it can be extended easily to various languages. In the future, the system will publish a big multilingual fake news dataset.
• Whereas existing systems visualize fake news diffusion for researchers, our system presents diffused fake news contents for the public in real time.
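
To make the debunking-tweet assumption concrete, the following minimal sketch checks whether a tweet contains one of the English debunking patterns listed in Table 1. It is an illustration only, not the system's implementation: compiling the patterns as case-insensitive regular expressions, treating “(completely)” as optional, and the function names `find_debunking_span` and `is_debunking` are all assumptions of this sketch.

```python
import re

# English debunking patterns transcribed from Table 1.
# Assumptions of this sketch: the patterns are matched as case-insensitive
# regular expressions, and "(completely)" is treated as an optional word.
DEBUNKING_PATTERNS = [
    r"(isn['’]t|is not) true",
    r"is (completely )?(false|fake)",
    r"don['’]t believe everything",
    r"spreading (false|fake)",
    r"#fakenews",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in DEBUNKING_PATTERNS]


def find_debunking_span(text):
    """Return (start, end) of the first matching debunking pattern, or None.

    Patterns are tried in list order, so the result is the match of the
    first pattern that matches, not necessarily the leftmost match overall.
    """
    for pattern in COMPILED:
        match = pattern.search(text)
        if match:
            return match.span()
    return None


def is_debunking(text):
    """Return True when the text contains any debunking pattern."""
    return find_debunking_span(text) is not None
```

The matched span plays the role of the “fake part” that the later extraction rules start from; a real deployment would pass the patterns to a search API rather than filter locally.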
Related work

Fact-checking websites

In attempts to combat fake news, various fact-checking websites and organizations have been founded. PolitiFact (https://www.politifact.com/) is an independent, non-partisan site for online fact-checking, mainly of U.S. political news and politicians’ statements. Snopes (https://www.snopes.com/), one of the first online fact-checking websites, handles political issues as well as other social and topical issues. Gossipcop (https://www.gossipcop.com/) investigates fake news in U.S. entertainment stories published in magazines and web news. Although these fact-checking sites have high reliability, they require time-consuming processes and have poor scalability.

Fake news tracking tools

Because fake news is diffused on SNS, it is important to track fake news movements, both to confirm immediately whether news is fake and to investigate the nature of fake news. To meet these demands, several tracking tools have been proposed. Hoaxy (Shao et al. 2016) is a framework for collecting and tracking misinformation and the fact-checking information related to it. Users can search for topics in which they are interested and check the diffusion visualization of the respective topics. FakeNewsTracker (Shu, Mahudeswaran, and Liu 2019) is a system for fake news data collection, detection, and visualization on SNS. It first collects fake news sources from fact-checking websites. NewsVerify (Zhou et al. 2015), a real-time news certification system, starts to track news after user input and assesses the credibility of events. The system of Zhao, Resnick, and Mei (2015) is similar to ours in that it uses tweets that include particular phrases. Whereas it collects rumors using enquiry phrases such as “Really?”, we collect fake news using phrases related to debunking or corrections such as “This is fake.” Additionally, our system is based on a systematic framework that can accommodate identification of multilingual fake news.

Fake news datasets produced by tracking tools

The research community has produced various datasets for fake news detection or similar objectives. Some fake news trackers have been used effectively to produce such datasets. The Hoaxy dataset (Hui et al. 2018) has been accumulated using Hoaxy. It consists of retweeted messages with links to either fact-checking or misinformation articles. FakeNewsNet (Shu et al. 2018), constructed using FakeNewsTracker, contains various information such as news contents and spatio-temporal and social contexts.

Universal fake news collection system

We first present an overview of the proposed system. Then, we introduce details of the respective components of our system. This system will be presented publicly on the web in two languages: English and Japanese.

Overview

The proposed system has four steps: crawling, archiving, visualization, and making dataset. Figure 1 presents an overall picture of the system framework. Crawling accumulates tweets that point out that something is “fake,” along with similar tweets. Archiving organizes the collected data and ranks the data for visualization. Visualization shows tweets in order of the degree of attention they receive. The system also provides a voting function that lets users vote on whether a tweet is related to fake news or not. Making dataset recrawls tweets with keywords such as URLs obtained during archiving to produce large multilingual datasets of fake news on SNS.

Figure 1: Framework of the proposed system. The system can be divided into four steps: crawling, archiving, visualization, and making dataset. The crawling step collects “fake”-related tweets. The archiving step organizes the collected tweets. Then, the visualization step shows ranked results. The making dataset step recrawls tweets to create fake news datasets.

Crawling

Debunking patterns to be used as search keywords must be found before debunking tweets can be collected. To find useful patterns, we use crowdsourcing platforms: Amazon Mechanical Turk (https://www.mturk.com/) for English and Yahoo! Crowdsourcing (https://crowdsourcing.yahoo.co.jp/) for Japanese. We ask questions such as “Write what you would write on an SNS such as Twitter for correction when you find false information (for example fake news.)” Then we collect 1,000 answer texts from target language speakers. To acquire useful debunking patterns, we extract uni/bi/tri/4-grams from the answer texts and select high-frequency patterns. From these high-frequency patterns, human experts further selected those which are independent of any particular fake news. The patterns we selected are presented in Table 1. We use the Twitter Search API to crawl tweets including those patterns. The crawling is executed continuously and the collected tweets are saved in our database.

Table 1: Selected debunking patterns for crawling tweets in English and Japanese

Language   Pattern
English    (isn’t|is not) true
           is (completely) (false|fake)
           Don’t believe everything
           spreading (false|fake)
           #fakenews
Japanese   は(デマ|フェイク)
           (デマ|フェイク|フェイクニュース)です
           (フェイク|間違い|デマ)である
           というデマ
           (信じ|拡散し)ない

Archiving

We organize and rank the crawled data for ease of checking. This step in turn has three steps: extracting event phrase, tweet grouping, and ranking. Processing all collected tweets is time-consuming. Therefore, we use only tweets that have been shared more than three times. These steps are applied every day to one day of tweets.

Extracting event phrase. This step extracts suspicious event phrases pointed out in debunking tweets. The extracted event phrases are used for the next step, tweet grouping, and are also used as headlines for visualization. For example, “Michael talking” is extracted as a suspicious event phrase from the tweet “Michael talking is fake!” We do not use machine-learning-based extraction but rule-based extraction, which can be extended easily to multiple languages.

To extract suspicious event phrases in multiple languages, we use Universal Dependencies (UD) (McDonald et al. 2013), which was developed as a collection of treebanks with homogeneous syntactic dependency annotation for various languages. UD enables application of the same extraction rules to multiple languages with only a little adjustment. We obtained treebanks from universaldependencies.org (https://universaldependencies.org/) and applied a UD parser to each tweet. In this system, a human expert sets the extraction rules, which are shown below:

1. Parse a sentence including a debunking pattern in Table 1 based on universal dependencies. We designate the debunking pattern as the “fake part” for these processes.
2. Extract an event phrase from the sentence based on the following two rules. The first is that the event phrase has a dependency arc to the fake part labeled “nsubj,” “nsubjpass,” “dobj,” “iobj,” “csubj,” or “appos.” These dependency relations indicate that the phrase is used grammatically as, for example, a nominative subject, an object, or a clausal subject. The second is that the event phrase is located before the fake part. However, we do not extract the phrase when it is a demonstrative pronoun such as “this” or “it.”
3. When no phrase satisfies the rules presented in 2, we set the word on which the fake part depends in the dependency structure as the new “fake part” and repeat the process with it.
4. When the fake part is ROOT in 3, we switch to a different sentence: the sentence following the one that includes the fake part in English, and the preceding sentence in Japanese. We then set the ROOT of that sentence as the fake part and perform process 2.

Tweet grouping. This step is designed to gather tweets referring to the same event cluster into the same group. It is difficult to apply machine-learning-based methods for grouping because the kinds of tweets vary every day. We therefore execute a simple and robust rule-based grouping method using the extracted suspicious event phrases and other features such as URLs. The rules of grouping are presented below:

1. Put tweets with the same URL into the same group.
2. Put tweets replying to the same tweet into the same group.
3. Calculate the distance between the event phrases extracted from each tweet in the above step, using the word mover’s distance (WMD) (Kusner et al. 2015). Put tweets whose distance is less than a threshold τ into the same group.

To calculate WMD, we use word vectors from (Grave et al. 2018). The threshold τ was set to 0.25.

Ranking. This step ranks the groups generated in the above step in descending order of attention. Our ranking method is inspired by an unsupervised method of (Glavaš and Štajner 2015). The method ranks each group according to several features, which are considered to express attention. We then calculate the average of the ranks obtained for the individual features and use it as the final ranking. Our system uses three features: number of Likes, number of Retweets, and public score. The public score is the percentage of followers among the users who retweeted the tweet. The larger the first two features are, the higher the degree of attention becomes. The smaller the public score becomes, the higher the degree of attention becomes, because a tweet that also spreads to users other than followers is more important. The tweet with the highest attention rank is selected from each group for ranking.

Visualization

The proposed system presents the event clusters in the order given by our ranking method to meet general demand, not only that of researchers. An example is presented in Figure 2.

The top 10 event clusters are exhibited in the proposed system. Each has three parts: “Headline,” “Debunking Tweet,” and “Part pointed out.” “Headline” shows the suspicious event phrase obtained by “extracting event phrase.” “Debunking Tweet” shows the tweet which the system has obtained from crawling. “Part pointed out” shows the URL, quoted tweet, or replied-to tweet included in the Debunking Tweet. The system collects event clusters with a strong probability of being fake, without a fact-checking site.

Additionally, we introduce a “Voting system,” which enables users to vote on whether each event cluster is fake or not. With this structure, the system clearly shows each event cluster that has a strong possibility of being fake.

Figure 2: Example of visualization of English fake news.

Making dataset

We will publish large multilingual fake news datasets in the future. To produce an exhaustive dataset, we re-crawl the event clusters visualized in the system using the URLs included in the related tweets and the keywords obtained by extracting event phrases. A dataset composed of re-crawled tweets labeled by the voting system will be published.

Discussion

Representative values of our system

The system collects numerous tweets daily by continuous crawling. Table 2 shows representative values of tweets and event clusters collected in two languages, English (EN) and Japanese (JA), from November 14, 2019 through December 13, 2019.

Table 2: Representative values of collected tweets and event clusters. Values are daily averages of the respective items.

                                                EN      JA
Avg. no. of tweets                              9039    7901
Avg. no. of event clusters                      455     143
Avg. no. of RT (top10 events)                   1549    810
Avg. no. of Like (top10 events)                 5027    1899
Avg. no. of verified account (top10 events)     2.11    0.20

No great difference exists between English and Japanese in the numbers of collected tweets. By contrast, event clusters grouped in English are more than three times as numerous as those in Japanese. This is attributable to the fact that the greater part of the collected tweets in Japanese are retweets. The numbers of RT and Like of the top event clusters in English are also larger than those in Japanese. This result derives from the situation in which debunking or correcting statements by verified accounts, which have many followers, frequently occur in the top event clusters in English. Such statements by verified accounts were rarely found in Japanese.

Effectiveness of our system

Confirming whether a collected event cluster is fake or not is important to validate the effectiveness of the proposed system. We annotated 124 Debunking Tweets (62 tweets in English and 62 tweets in Japanese) visualized from December 7, 2019 to December 13, 2019, from the following viewpoints.

(a). Do sentences in collected tweets indicate debunking?
(b). Are the subjects of collected tweets truly fake?

We recruited two human annotators to label the collected tweets manually. We developed a codebook according to the definition of fake news discussed in the Introduction. The results confirmed a substantial level of agreement: Cohen’s Kappa score was 0.73. For the tweets on which the two annotators did not agree, a third annotator (one of the authors) labeled the tweet.

The results of (a) indicate that more than 65% of collected tweets show debunking in each language: 66% in English and 69% in Japanese. This result suggests that the selected patterns in Table 1 are appropriate for the system. The results of (b) indicate that at least 65% of the subjects of collected tweets are truly fake in each language: 68% in English and 65% in Japanese. The same architecture, irrespective of language, collects event clusters with a high probability of being fake. From these results, we infer that the proposed system achieves a sufficient level of usefulness for practical use.

Conclusions

Our paper proposes a fake news collection system that specifically examines debunking tweets. The system works in two languages: English and Japanese. Because the proposed system can be extended easily to other languages, future studies will apply it to languages other than English and Japanese and will publish a large fake news dataset. Using the system to gather various fake news items is also expected to make it easier to compare fake news among languages and among countries.
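
As an illustration of the archiving step, the three grouping rules and the rank averaging can be sketched as follows. This is a minimal sketch under stated assumptions, not the system's implementation: the `Tweet` fields and all function names are hypothetical, a token-level Jaccard distance stands in for the word mover’s distance (which would need pretrained word vectors), and reading “the tweet with the highest attention rank is selected from each group” as choosing one representative tweet per group is an interpretation of the text. Only the threshold 0.25 and the three ranking features come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

TAU = 0.25  # distance threshold reported in the paper


@dataclass
class Tweet:
    # Hypothetical record layout used only for this sketch.
    tweet_id: str
    event_phrase: str
    url: Optional[str] = None
    reply_to: Optional[str] = None
    likes: int = 0
    retweets: int = 0
    public_score: float = 0.0  # share of followers among retweeting users


def jaccard_distance(a: str, b: str) -> float:
    """Stand-in for WMD (NOT the paper's measure): 1 - token overlap."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def group_tweets(tweets: List[Tweet],
                 distance: Callable[[str, str], float] = jaccard_distance,
                 tau: float = TAU) -> List[List[Tweet]]:
    """Merge tweets linked by any of the three rules, using union-find."""
    parent = list(range(len(tweets)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)  # smallest index stays root

    for i in range(len(tweets)):
        for j in range(i + 1, len(tweets)):
            a, b = tweets[i], tweets[j]
            same_url = a.url is not None and a.url == b.url              # rule 1
            same_reply = a.reply_to is not None and a.reply_to == b.reply_to  # rule 2
            close = distance(a.event_phrase, b.event_phrase) < tau       # rule 3
            if same_url or same_reply or close:
                union(i, j)

    groups: Dict[int, List[Tweet]] = {}
    for i, tweet in enumerate(tweets):
        groups.setdefault(find(i), []).append(tweet)
    return list(groups.values())


def average_ranks(tweets: List[Tweet]) -> Dict[str, float]:
    """Average rank over the three features; lower means more attention."""
    keys = [
        lambda t: -t.likes,        # more Likes -> better rank
        lambda t: -t.retweets,     # more Retweets -> better rank
        lambda t: t.public_score,  # smaller public score -> better rank
    ]
    totals = {t.tweet_id: 0.0 for t in tweets}
    for key in keys:
        # Ties are broken by input order, a simplification of this sketch.
        for rank, t in enumerate(sorted(tweets, key=key), start=1):
            totals[t.tweet_id] += rank
    return {tid: total / len(keys) for tid, total in totals.items()}


def rank_groups(groups: List[List[Tweet]]) -> List[Tweet]:
    """Return one representative tweet per group, most attention first."""
    avg = average_ranks([t for g in groups for t in g])
    representatives = [min(g, key=lambda t: avg[t.tweet_id]) for g in groups]
    return sorted(representatives, key=lambda t: avg[t.tweet_id])
```

Because `group_tweets` takes the distance function as a parameter, a WMD implementation backed by word vectors could be passed in place of `jaccard_distance` without changing the grouping logic. The pairwise comparison is quadratic in the number of tweets, which is acceptable for the daily batch sizes discussed above but would need indexing for much larger collections.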

Acknowledgments
This research was partly supported by Health and Labor Sciences Research Grant Number H30-shinkougyousei-shitei-004 and JSPS KAKENHI Grant Numbers JP19K20279 and JP19H04221.
