Context-Sensitive Malicious Spelling Error Correction
Hongyu Gong, Yuchen Li, Suma Bhat, Pramod Viswanath
University of Illinois at Urbana-Champaign
{hgong6, li215, spbhat2, pramodv}@illinois.edu

arXiv:1901.07688v1 [cs.CL] 23 Jan 2019

Abstract

Misspelled words of the malicious kind work by changing specific keywords and are intended to thwart existing automated applications for cyber-environment control, such as harassing content detection on the Internet and email spam detection. In this paper, we focus on malicious spelling correction, which requires an approach that relies on the context and the surface forms of targeted keywords. In the context of two applications, profanity detection and email spam detection, we show that malicious misspellings seriously degrade their performance. We then propose a context-sensitive approach for malicious spelling correction using word embeddings and demonstrate its superior performance compared to state-of-the-art spell checkers.

1 Introduction

Automatic spelling correction has been an important natural language processing component of any input text interface of today (Jurafsky and Martin, 2014). While spelling errors in common writing could be regarded as innocuous omissions, affecting opinions about the writer's competence at worst, those of the malicious kind can potentially be highly offensive, since they are intended to deceive an automated mechanism, be it a spam filter for emails or a profanity detector for social media. For those applications that rely on automatic detection of specific keywords to enable real-time control of the cyber-environment, detecting and correcting spelling errors is of primary importance.

Perspective (Google, 2017) is one such application that helps reduce abusive content online by detecting profane language. It has been pointed out that Perspective's otherwise good performance in detecting toxic comments is not robust to spelling errors (Hosseini et al., 2017). Consider the sentence "My thoughts are that people should stop being stupid.", which was assigned a perceived toxicity score of 86% by Perspective. When the word "stupid" was misspelled as "stup*d" in the same sentence, its toxicity score reduced to 8%. This marked reduction induced by the deceptive spelling error on the keyword "stupid" reflects the importance of accurate spelling correction for optimal toxicity detection. Had the spelling error been detected and corrected before scoring for toxicity, the score would not have been lowered.

Likewise, deliberate typographic errors are common in phishing scams and spam emails. Consider, for instance, "Please pya any fees you may owe to our company", where pya is clearly a spelling error included to deceive spam filters that are sensitive to keywords. In both these instances, spelling correction as a preprocessing step is critical for a robust performance of the target system. Spelling correction in the preprocessing stage in malicious settings constitutes the focus of this study.

In this paper, we empirically demonstrate the effect of spelling errors in a malicious setting by adding synthetic misspellings to sensitive words in the context of two applications: profanity detection and spam detection. We propose an unsupervised embedding-based algorithm to correct the targeted misspelled words. Earlier approaches to spelling correction primarily depend on the edit distance to find words morphologically similar to corrections. More recently, spell checkers have been improved with the addition of contextual information (e.g., n-gram modeling in (Wint et al., 2018)), often with intensive computation and memory requirements (to obtain and store n-gram statistics). A recently studied neural network-based spell checker learns the misspelling pattern from annotated training data in a
supervised way, and was found to be sensitive to data domains and dependent on human annotations (Ghosh and Kristensson, 2017). Our approach to making the correction procedure context-aware involves harnessing the geometric properties of word embeddings. It is light-weight in comparison, unsupervised, and can be adapted to a variety of data domains including Twitter data and spam data, since domain information can be well captured by tuning embeddings on domain-specific data (Taghipour and Ng, 2015).

We first demonstrate the effect of spelling errors in a malicious setting in the context of two applications: profanity detection and spam detection. Then we propose an unsupervised algorithm to correct the targeted spelling errors. Besides performing well on the correction of synthetic spelling errors, the algorithm also performs well on spell checking of real data, and we show that it can improve hate speech detection using Twitter data.

2 Related Work

Spelling correction. Automatic spelling correction in non-malicious settings has had a long research track as an important component of text processing and normalization. Early works have used edit distance to find morphologically similar corrections (Ristad and Yianilos, 1998), a noisy channel model for misspellings (Jurafsky and Martin, 2014), and iterative search to improve corrections of distant spelling errors (Gubanov et al., 2014). Word contexts have been shown to improve the robustness of spell checkers, with n-gram language models as one approach to incorporating contextual information (Hassan and Menezes, 2013; Farra et al., 2014). Other ways of incorporating contextual information include n-gram statistics capturing the cohesiveness of a candidate word with the given context (Wint et al., 2018).

Effect of malicious misspellings. Misspellings are commonly seen in online social platforms such as Twitter and Facebook, and recent studies have drawn attention to the fact that online users deliberately introduce spelling errors to thwart bullying detection (Power et al., 2018) or to challenge moderators of online communities (Papegnies et al., 2017). This is because simple ways of filtering terms of profanity are rendered inadequate in the presence of spelling errors. Likewise, email spam filters are often obfuscated by misspelled sensitive words (Zhong, 2014), because word-based spam filters tend to make false decisions in the presence of misspellings (Saxena and Khan, 2015), and so do word-based cyberbullying detection systems (Agrawal and Awekar, 2018). Misspelling is also seen in web attacks, where a phishing site has a domain name that is a misspelling of a legitimate website (Howard, 2008).

Approaches to correct malicious misspellings. Given the malicious intent of misspellings in the context of the Internet, recent studies have proposed correction strategies specifically for malicious misspellings. Levenshtein edit distance is commonly used in the spell checking of detection systems. Spam filters (Zhong, 2014) and cyberbullying detection systems (Wint et al., 2017) rely on the idea of edit distance to recognize camouflaged words by capturing the pattern of misspellings. For example, Rojas-Galeano (2013) studied an edit distance function designed to reveal instances where spammers had interspersed black-listed words with non-alphabetic symbols. Lexical and context statistics, such as word frequency, have also been used to make corrections in social texts (Baziotis et al., 2017; Jurafsky and Martin, 2017). Existing approaches have the problem of domain specificity, since their lexical statistics are obtained from a certain domain which might differ from the domain of target applications.

Our study considers spelling correction in a malicious setting where errors are not random, but are carefully introduced. Our context-aware approach to spelling correction relies on the geometry of word embeddings, which has the advantage of efficient domain adaptation. As we will show in Section 4.2, the embedding can be easily tuned on a small corpus from the target domain to capture domain knowledge.

3 Methods

We study malicious spelling correction for three target applications: toxicity detection using the Perspective API, email spam detection, and hate speech detection on Twitter data. We work with synthetic spelling errors in the first two applications, and study real misspellings in the third.

For the toxicity detection task, we use the Perspective dataset (Google, 2017), which provides a set of comments collected from the Internet with human-annotated scores of toxicity.
Using the Perspective API, we obtained the toxicity score for 2767 comments in the dataset.

The spam dataset consists of 960 emails from the Ling-Spam dataset (available at http://csmining.org/index.php/ling-spam-datasets.html). We randomly split the data into 700 train emails and 260 test emails. Both the train and test sets were balanced with respect to the positive and negative classes. The most frequent 2500 words in the training data were selected, and we counted the occurrence of each word in the emails. The numbers of occurrences of these frequent words were used as a 2500-dimension feature vector for each email. We first used these features to train a Naive Bayes classifier. It achieved an accuracy of 98% in spam detection.
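The count-based classifier above can be sketched in a few lines with scikit-learn. This is a minimal illustration, not the authors' code: the toy emails, the whitespace tokenization, and all variable names are assumptions made here for demonstration.

```python
from collections import Counter

import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy stand-ins for the Ling-Spam training emails and labels (1 = spam).
train_emails = ["congratulations you won easy money today",
                "the seminar on syntax is rescheduled to friday"]
train_labels = [1, 0]

# Vocabulary: the most frequent words in the training data
# (2500 in the paper; here just whatever the toy corpus contains).
counts = Counter(tok for email in train_emails for tok in email.split())
vocab = [word for word, _ in counts.most_common(2500)]
index = {word: j for j, word in enumerate(vocab)}

def count_features(emails):
    """Occurrence counts of the frequent words, one row per email."""
    features = np.zeros((len(emails), len(vocab)))
    for i, email in enumerate(emails):
        for token in email.split():
            if token in index:
                features[i, index[token]] += 1
    return features

clf = MultinomialNB().fit(count_features(train_emails), train_labels)
print(clf.predict(count_features(["claim your easy money now"])))
```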
As for the Twitter data, a total of 16K user posts were collected from Twitter, a social communication platform, and used in (Waseem and Hovy, 2016). Of these tweets, 1937 were labeled as racist, 3117 were labeled as sexist, and the rest were neither racist nor sexist. The task of hate speech detection is to classify these tweets into one of the three categories: racist, sexist, or neither. We randomly split the tweets into train and test data, and trained a neural-network based hate speech detection system on the training data (described later in Section 5.3).

3.1 Characterization

Here we summarize our assumptions on the characteristics of malicious spelling errors.

(1) Malicious errors are usually made on the sensitive keywords in order to obfuscate the true intentions and deceive detection systems that rely on keywords (Weinberger et al., 2009).

(2) The error yields a word that is similar to the original word in surface form. The misspelled words would be words that humans can easily understand (thus the erroneous words still convey the intended meaning, while being in disguise). Their similarity is reflected in the small edit distance between the erroneous and the correct word (Morgan, 1970). The errors often involve character-level operations, as shown in Table 1 (Liu et al., 2011).

(3) With edit distance as the similarity criterion, it is often the case that the word with an error is similar to multiple valid words, making correction a challenge in real applications. For example, both stud and stupid can be thought of as the correct form of the erroneous word stupd. In such cases, spelling correction relies on the context in which the word occurs.

3.2 Effect of malicious spelling errors

We quantify the effect of spelling errors by showing that a simple mechanism of injecting malicious spelling errors greatly degrades the performance of toxicity detection and spam detection. Toward this, we first describe the general mechanism to generate synthetic errors and then study their impact on toxicity and spam detection. We point out that owing to the absence of a dataset with intended malicious errors, we had to generate them synthetically.

Spelling error generation. We first choose the "sensitive" words that the detection algorithm relies on (the mechanism will be discussed separately for each application we consider). We then replace these words with their erroneous forms (which may not be real words). Given the characteristics of malicious errors mentioned in Section 3.1, our assumption is that as a result of these errors the altered words will appear similar to the original ones. As illustrated in Table 1, we consider four basic character-level operations to change the sensitive words: insertion, permutation, replacement, and removal. In our experiments we perform these operations on randomly picked characters of the sensitive word. For permutation, we exchange this character with the next one. Each operation in Table 1 increases the edit distance by 1, and we generate malicious misspellings of the sensitive word that are at most edit distance 2 from the correct word.

Table 1: Common spelling errors

Type:       insertion | permutation | replacement | removal
Error:      idio.t    | moeny       | chanse      | stupd
Correction: idiot     | money       | chance      | stupid
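The four operations of Table 1 translate directly into code. The sketch below is a minimal reconstruction of this generation step under stated assumptions; the symbol set used for insertions and replacements and the seeding are our choices, not the paper's.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz*."  # assumed symbol set

def corrupt(word, rng):
    """Apply one random character-level operation from Table 1
    to a randomly picked character (each adds 1 to the edit distance)."""
    i = rng.randrange(len(word))
    op = rng.choice(["insertion", "permutation", "replacement", "removal"])
    if op == "insertion":
        return word[:i] + rng.choice(ALPHABET) + word[i:]
    if op == "permutation" and i < len(word) - 1:
        # Exchange the picked character with the next one.
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "replacement":
        return word[:i] + rng.choice(ALPHABET) + word[i + 1:]
    return word[:i] + word[i + 1:]  # removal (also the permutation fallback)

def misspell(word, max_edits=2, seed=0):
    """Malicious misspelling at most max_edits operations from the word."""
    rng = random.Random(seed)
    out = word
    for _ in range(rng.randint(1, max_edits)):
        out = corrupt(out, rng)
    return out

print(misspell("stupid"), misspell("money", seed=3))
```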
Finally, we obtain revised sentences by replacing the sensitive words in the original sentences with their erroneous counterparts. Next, we introduce our mechanism for selecting the "sensitive" words for each application.

Perspective toxicity detection. We select 2767 clean comments from a total of 2969 comments in the Perspective data, with the selection criteria given below.

• The most toxic word in a comment should contain more than two characters. A word that is too short is not likely to be a meaningful toxic word, and can have too many candidate corrections in the dictionary.

• The most toxic word in a comment should appear at least 100 times in the dataset, since rare words tend to be misspellings in online texts.

• The most toxic word in a comment should be a content word, i.e., it should not belong in the list of function words such as "must", "ought", "shall" and "should". Function words are not toxic and so were excluded from our experiments.

For each comment, its predicted toxicity score lies in the range [0, 1]. The higher the score, the more toxic the sentence is. For each sentence, we chose the most toxic word to be that word without which the toxicity score reduced the most. We then added malicious errors, as described above, to this word. We note that for the 2380 toxic comments with toxicity scores higher than 0.5, spelling errors brought down their predicted toxicity by 32%. Section 5 provides the details of the degradation in toxicity detection.

Spam detection. A Naive Bayes spam detection model provides the likelihood of each word given the spam and non-spam classes. For a given word, the difference between these probabilities reflects how important that word is in spam detection. We sorted all the words in decreasing order of this difference in probabilities, and picked the most spam-indicative words. We then added malicious errors to these words in test spam emails. For a spam filtering system which relies heavily on the counts of spam-indicative words, the errors can easily disguise spams as non-spams. This is seen in the test accuracy dropping from 98% on the original test data to 72% on the revised test data.

We note that for the Perspective data, we added errors to only one word per sentence, but did not have such limits for the spam data. Also, we do not limit the number of corrections during the spell check process.

4 Spelling Correction

We have shown that malicious errors degrade the performance of toxicity detection and spam detection. In this section, we introduce our unsupervised algorithm for non-word error correction based on the relevant context.

Our correction method proceeds in two stages. In the first stage, candidate enumeration, the algorithm detects the spelling error and proposes candidates that are valid words and similar in surface form to the misspelled word. In the next stage, correction via context, the best candidate is chosen based on its context.

4.1 Candidate Enumeration

As in the case of non-word spelling correction (Jurafsky and Martin, 2014), we check whether an erroneous word is valid or not using a vocabulary of valid words. For toxicity detection and spam detection, the vocabulary consists of words that occur more than 100 times in the English Wikipedia corpus (available at http://www.cs.upc.edu/~nlp/wikicorpus/). For hate speech detection, the vocabulary consists of standard words from Wikipedia and the list of Internet slang words as detailed in Section 5.3, since slang words frequently occur in tweets. For an invalid word, we find candidate words with the smallest Damerau-Levenshtein edit distance (Damerau, 1964; Levenshtein, 1966). This distance between two strings is defined as the minimum number of unit operations required to transform one string into another, where the unit operations include character addition, deletion, transposition and replacement. We note that because there are potentially multiple candidates having the smallest distance to the misspelled word, the output of this stage is a set of candidate corrected words.
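A sketch of the enumeration stage follows. The dynamic program below computes the restricted Damerau-Levenshtein distance in the standard way; the four-word vocabulary is a toy stand-in for the Wikipedia-derived one.

```python
def damerau_levenshtein(a, b):
    """Minimum number of character additions, deletions, replacements,
    and adjacent transpositions turning string a into string b."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # addition
                          d[i - 1][j - 1] + cost)  # replacement
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def enumerate_candidates(word, vocabulary):
    """All valid words at the smallest edit distance from a non-word."""
    if word in vocabulary:
        return {word}
    distances = {v: damerau_levenshtein(word, v) for v in vocabulary}
    best = min(distances.values())
    return {v for v, dist in distances.items() if dist == best}

vocabulary = {"stud", "stupid", "study", "sturdy"}
print(enumerate_candidates("stupd", vocabulary))  # {'stud', 'stupid'}
```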
4.2 Correction via Context

After enumerating all possible candidates, we choose the one that fits best in the context. We propose a word-embedding-based method to match the context with the candidate.

Pretrained embedding. We use the word2vec CBOW model to train embeddings (Mikolov et al., 2013). The Perspective and Twitter data have an informal style, while email data consists of relatively formal expressions. Word embeddings are known to be domain-specific (Nguyen and Grishman, 2014), and naturally domain-specific corpora are used to train word embeddings for use in a given domain. We note that in the absence of a large enough representative corpus to train domain-specific high-quality embeddings for this study, we reconcile with word embeddings trained on a large WikiCorpus (Al-Rfou et al., 2013) to capture the general lexical semantics, further tuned on a domain-specific corpus, such as that of the Perspective dataset, the spam emails data and the Twitter data. This step allows us to combine domain information in trained embeddings.
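One way to realize this two-step training with gensim is sketched below. The paper does not give its training configuration, so the corpora, dimensions, window, and epoch counts here are illustrative assumptions.

```python
from gensim.models import Word2Vec

# Toy stand-ins: wiki_sentences for the large WikiCorpus,
# domain_sentences for the Perspective / spam / Twitter corpus.
wiki_sentences = [["the", "stupid", "and", "stubborn", "administrators"],
                  ["please", "pay", "any", "fees", "you", "may", "owe"]] * 50
domain_sentences = [["you", "are", "a", "biased", "fool"]] * 50

# Step 1: CBOW pretraining (sg=0) on the general corpus.
model = Word2Vec(wiki_sentences, vector_size=100, window=5,
                 min_count=1, sg=0, epochs=5)

# Step 2: continue training on the domain corpus so that
# domain-specific usage is reflected in the vectors.
model.build_vocab(domain_sentences, update=True)
model.train(domain_sentences, total_examples=len(domain_sentences), epochs=5)

print(model.wv["stupid"][:5])  # first few dimensions of a tuned vector
```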
Error Correction. Word vectors permit us to decide the fit of a word in a given context by considering the geometry of the context words in relation to the given word. Firstly, the embedding of the compositional semantics of phrases or sentences can be approximated by a linear combination of the embeddings of their constituent words (Salehi et al., 2015). Let v_w be the embedding of the token w. Taking the phrase "hate group" as an example, it holds that

$$v_{\text{hate group}} \approx \alpha v_{\text{hate}} + \beta v_{\text{group}},$$

where $\alpha$ and $\beta$ are real-valued coefficients.

This enables us to represent the context as a linear space of the component words. Another property is that semantically relevant words lie close in the embedding space. If a word fits the context, it should be close to the context space, i.e., the normalized Euclidean distance from the word embedding to the context space should be small (Gong et al., 2017). We quantify the inconsistency between the word and its context by the distance between the word embedding and the span of the context words.

For a misspelled word in a sentence, all words occurring within a window of size p are considered to be its context words. Let the context T_p be the set of words within distance p from w_0:

$$T_p = \{w_{-p}, w_{-p+1}, \ldots, w_0, \ldots, w_{p-1}, w_p\},$$

where w_0 is the misspelled word. Let C be the set of candidate replacements of w_0, and let v_i be the word embedding of context word w_i. We denote by dist(c, T_p) the distance between a candidate c and the linear span of the words in its context T_p, termed the candidate-context distance of a candidate, defined as the normalized distance between the word embedding v_c of the candidate and its linear approximation obtained from the context vectors:

$$\mathrm{dist}(c, T_p) = \min_{\{a_i\}} \frac{1}{\|v_c\|_2} \Big\| \sum_{i=-p,\, i \neq 0}^{p} a_i v_i - v_c \Big\|_2^2. \qquad (1)$$

This is a quadratic minimization problem for which we can find a closed-form solution.

Instead of fixing one context window size, we consider multiple window sizes and weigh the distances obtained using different window sizes. The context words that are closer to the misspelled word are more informative in candidate selection. In the sentence "the stu*pid and stubborn administrators", the word stubborn suggests that the misspelled word should be a negative personality adjective, and the word administrator that it should be an adjective for people. Thus, the closest context word stubborn provides more relevant information than the distant word administrator.

We thus weigh the distance by the inverse of the context window size, i.e., the weight 1/p for the window size p. Suppose that T is the full context; the candidate-context distance is then defined as

$$\mathrm{dist}(c, T) = \sum_{p=1}^{P} \frac{1}{p} \cdot \mathrm{dist}(c, T_p), \qquad (2)$$

where P = 4 in our experiments.

Given the context T, the suggested correction c* is the candidate with the smallest distance to the linear space of context words, i.e.,

$$c^* = \arg\min_{c \in C} \mathrm{dist}(c, T). \qquad (3)$$
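Since Eq. (1) is an ordinary least-squares problem, its closed-form solution is available directly from numpy; Eqs. (2) and (3) are then a weighted sum and an argmin. The sketch below is our reading of the method, with toy embeddings and contexts standing in for the domain-tuned word2vec vectors.

```python
import numpy as np

def context_distance(candidate, contexts, embeddings):
    """Weighted candidate-context distance of Eqs. (1)-(2);
    contexts[p - 1] holds the context words of window size p."""
    v_c = embeddings[candidate]
    total = 0.0
    for p, context in enumerate(contexts, start=1):
        words = [w for w in context if w in embeddings]
        if not words:
            continue
        A = np.stack([embeddings[w] for w in words], axis=1)
        # Closed-form least-squares solution of Eq. (1).
        a, *_ = np.linalg.lstsq(A, v_c, rcond=None)
        residual = np.sum((A @ a - v_c) ** 2) / np.linalg.norm(v_c)
        total += residual / p  # the 1/p weighting of Eq. (2)
    return total

def correct(candidates, contexts, embeddings):
    """Eq. (3): the candidate closest to the span of its context."""
    return min(candidates,
               key=lambda c: context_distance(c, contexts, embeddings))

# Toy embeddings; with domain-tuned vectors the context would favor "stupid".
rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=50)
              for w in ["stud", "stupid", "the", "and", "stubborn",
                        "administrators"]}
contexts = [["the", "and"],                                # window size 1
            ["the", "and", "stubborn"],                    # window size 2
            ["the", "and", "stubborn", "administrators"],  # window size 3
            ["the", "and", "stubborn", "administrators"]]  # window size 4
print(correct({"stud", "stupid"}, contexts, embeddings))
```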
Table 2: Correction accuracy on the Perspective and spam datasets

             Our system | PyEnchant | Ekphrasis | Google
Perspective  0.840      | 0.476     | 0.713     | 0.323
Spam         0.786      | 0.618     | 0.639     | 0.526

5 Experiments

We evaluated our spelling correction approach in three settings: toxicity and spam detection with synthetic misspellings, and hate speech detection with real spelling errors. Even though some recent works using neural networks (Ghosh and Kristensson, 2017) are available, they require ground-truth corrections for supervised training. For our unsupervised approach, we compare its performance with that of the three strong unsupervised baselines below, which are generic off-the-shelf tools used in many NLP systems such as AbiWord (abi, 2018) and the Google search engine (goo, 2018).
(1) PyEnchant: a generic spell checking library of multiple correction algorithms (Kelly, 2015). We use the MySpell library in this work.

(2) Ekphrasis: a spelling correction and text normalization tool for texts from social networks (Baziotis et al., 2017).

(3) Google spell checker: a Google search engine-based spell check (Coad, 2018). We chose it for its ability to undertake context-sensitive correction on the basis of user search history.

5.1 Toxicity Detection

We took 2380 sentences from the Perspective dataset whose toxicity scores were higher than 0.5 and added synthetic errors maliciously to these sentences as described in Section 3. Some example Perspective sentences, revised sentences, and corrections given by different algorithms are shown in Table 3. The correction accuracy achieved by different approaches is reported in Table 2.

Table 3: Correction on Perspective data

Original sentence: the stupid and stubborn administrators | anti American hate groups | you're a biased fuck
Revised sentence:  the stu*pid and stubborn administrators | anti American ahte groups | you're a biased fucdk
Our correction:    the stupid and stubborn administrators | anti American hate groups | you're a biased fuck
PyEnchant:         the sch*pid and stubborn administrators | anti American hate groups | you're a biased Fuchs
Ekphrasis:         the stupid and stubborn administrators | anti American ate groups | youre a biased fuck
Google:            the stupid and stubborn administrators | anti American ahte groups | you're a biased fucdk

We divided the toxic sentences into 5 bins according to their original toxicities. In Fig. 1, we report the average toxicity of the original, revised and corrected sentences in each bin. We see that the malicious errors drastically reduce the sentences' toxicities. The original average toxicity in the first bin is 0.55, whereas the revised toxicity is only 0.33 (a drop of 40%). This shows that toxicity detection is very sensitive to spelling errors.

[Figure 1: Toxicity scores by different approaches]

From the same figure we see that our proposed error correction results in toxicity scores closer to the original ones when compared with the baselines, validating the effectiveness of our approach. We note that in some cases our approach might result in a slightly higher toxicity score than the original one, because of pre-existing misspellings in the original sentences. For example, the original sentences "you are a fagget", "fukkin goof's track record" and "I suspect that closedmouth is g*y" have crude misspellings, which are used as Internet slang. Our approach corrects these pre-existing misspellings in addition to the errors we add later, resulting in higher toxicity scores for the corrected sentences. We perform corrections of all spelling errors, recovering more toxic words than we added maliciously.

5.2 Spam Detection

We next evaluate our spelling correction on the spam data. Synthetic malicious errors were added to spam-indicative words in spam mails based on our misspelling generation mechanism.

Some example spams are shown in Table 4. Sensitive words such as "money" which are indicative of spams are highlighted, and the corrections given by different approaches are shown.
Table 4: Correction examples on spam data

Original sentence: it really be a great oppotunity to make relatively easy money, with little cost to you. | we have quit our jobs, and will soon buy a home on the beach and live off the interest on our money.
Revised sentence:  it really be a grfat opportunity to make relatively easy fmoney, with little cosgt to you. | we have quit our tobs, and will soon buy a home on the beach and live off the interest on our jmoney.
Our correction:    it really be a great opportunity to make relatively easy money, with little cost to you. | we have quit our jobs, and will soon buy a home on the beach and live off the interest on our money.
PyEnchant:         it really be a graft opportunity to make relatively easy fmoney, with little cost to you. | we have quit our bots, and will soon buy a home on the beach and live off the interest on our money.
Ekphrasis:         it really be a great opportunity to make relatively easy money, with little cost to you. | we have quit our tobs, and will soon buy a home on the beach and live off the interest on our money.
Google:            it really be a great opportunity to make relatively easy fmoney, with little cosgt to you. | we have quit our jobs, and will soon buy a home on the beach and live off the interest on our jmoney.

Table 2 shows the spelling correction accuracy achieved by our algorithm and the baselines on the spam data.

Fig. 2 shows the spam detection accuracy on the original, revised and corrected test emails. Again we see that malicious errors resulted in a large accuracy drop, that the accuracy was restored after spelling correction, and that the increase in accuracy using our approach was the largest among the approaches compared.

[Figure 2: Spam detection accuracy on test emails]

5.3 Real Misspellings in Tweet Hate Speech Detection

We have shown that synthetic misspellings are able to deceive the toxicity detector and the spam detector, and reported the performance of spell checkers on the synthetic data. We also experimented with data containing real spelling errors collected from Twitter, where instances of hate speech contain user-generated spelling errors. The correction performance of the spell checkers is now compared on the real misspellings.

Table 5: Examples of tweets and hate speech categories

Racism:  #isis #islam pc puzzle: converting to a religion of peace leading to violence? http://t.co/tbjusaemuh
Sexism:  Real question is do feminist liberal bigots understand that different rules for men/women is sexism
Neither: The mother and son team are sooooo nice !!!

Tweet normalization. Some example tweets of a racist and sexist nature are shown in Table 5. Tweets are notoriously noisy and unstructured given the frequent occurrences of hashtags, URLs, reserved words and emojis. These non-standard tokens greatly increase the vocabulary size of tweets while also injecting noise into classification tasks. We use Tweet Preprocessor, a tweet preprocessing tool which can replace the aforementioned tokens with a special set of tokens. For example, one tweet is "#isis #islam pc puzzle: converting to a religion of peace leading to violence ?,http://t.co/tbjusaemuh http://t.co/g4xoh...", which becomes "$HASHTAG$ $HASHTAG$ pc puzzle: converting to a religion of peace leading to violence? $URL$ $URL$..." after preprocessing. This preprocessing stage can clean texts without dealing with misspellings. Tweets are preprocessed right before they are fed into the neural network.
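The paper relies on the Tweet Preprocessor package for this stage; the regex sketch below only approximates that behavior so the step can be read as code. The exact placeholder set and patterns are assumptions.

```python
import re

def normalize_tweet(tweet):
    """Replace URLs, hashtags, mentions, and reserved words with
    placeholder tokens, approximating Tweet Preprocessor's output."""
    tweet = re.sub(r"https?://\S+", "$URL$", tweet)
    tweet = re.sub(r"#\w+", "$HASHTAG$", tweet)
    tweet = re.sub(r"@\w+", "$MENTION$", tweet)
    tweet = re.sub(r"\b(RT|FAV)\b", "$RESERVED$", tweet)
    return re.sub(r"\s+", " ", tweet).strip()

print(normalize_tweet("#isis #islam pc puzzle: converting to a religion "
                      "of peace leading to violence? http://t.co/tbjusaemuh"))
# -> $HASHTAG$ $HASHTAG$ pc puzzle: converting to a religion of peace
#    leading to violence? $URL$
```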
Hate speech detection. The state-of-the-art system for hate speech detection is a Bidirectional Long Short-Term Memory (BLSTM) network with an attention mechanism (Agrawal and Awekar, 2018). The model takes a sequence of words in a tweet, obtains word embeddings in the embedding layer, and passes them to bidirectional recurrent layers to generate a dense representation of the input tweet. The feedforward layer takes the tweet vector and predicts its probability distribution over all three classes. The class with the highest likelihood is chosen as the category of the input tweet. In our experiment, we use a BLSTM model for the hate speech detection task.
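A compact Keras sketch of such a classifier is given below. It follows the layer sequence described above but, for brevity, omits the attention mechanism of Agrawal and Awekar (2018); the vocabulary size, dimensions, and label encoding are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 20000, 100, 50  # assumed sizes

model = tf.keras.Sequential([
    # Embedding layer: token ids -> word vectors.
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    # Bidirectional recurrent layer producing a dense tweet representation.
    layers.Bidirectional(layers.LSTM(64)),
    # Feedforward layer predicting the distribution over the three classes.
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Toy padded token-id sequences; labels 0 = racist, 1 = sexist, 2 = neither.
x = np.random.randint(0, VOCAB_SIZE, size=(8, MAX_LEN))
y = np.random.randint(0, 3, size=(8,))
model.fit(x, y, epochs=1, verbose=0)
print(model.predict(x[:1], verbose=0))  # probabilities over the classes
```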
Vocabulary construction. First, we build a vocabulary list using both a standard dictionary and the frequent words (frequency higher than 5) in the training data. This list serves as a reference for spelling correction; words outside the list will be taken as spelling errors and replaced with legal words from the vocabulary. The reason for collecting words from the training data is to include the Internet slang terms in tweets which may not exist in the standard dictionary. Some examples are "tmr" (for tomorrow), "fwd" (for forward) and "lol" (for laughing out loud). Some previous works proposed to replace these slang terms with their standard forms (Gupta and Joshi, 2017; Modupe et al., 2017), which requires either expert knowledge or human annotations. We argue that because Internet slang changes rapidly, we should understand it in a data-driven manner instead of standardizing it based on human knowledge. Since the representation of these slang terms and their use in hate speech will be learned by the neural network from the training data, we add them to our vocabulary as legal words.

Spelling correction. Misspellings are another source of text noise which cannot be handled in the tweet normalization stage. Some misspellings are user-created to deceive the online detection system. For example, the swear word "fucking" has a lot of variants in tweets such as "fckn", "f*ckin", "f**king", "fckin", "fuckin", "fking", "fkin" and "fkn". When a new variant of a swear word arises in the test data, the model takes it as an out-of-vocabulary word, and is unable to match it with any learned pattern. The purpose of spelling correction is to map these new variants to words that are known to the model. As discussed in Section 4, we enumerate the candidates of a misspelled word, and choose the candidate which best fits the context as its correction.

Results. We do a random 80:20 train-test split of the Twitter dataset. A detection system was trained on the normalized train data (without spelling correction) using the state-of-the-art BLSTM model. There were 3218 test tweets, 571 of which had misspellings. We applied the different spelling correction approaches to these 571 tweets, and the corrected tweets were then cleaned as described in the tweet normalization stage. Processed tweets were input to the trained detection system. We compared the hate speech detection results on the 571 test tweets to evaluate the effect of spelling correction on this downstream application. As shown in Table 6, we report results on the original test data, and also on the test data corrected by our approach, PyEnchant, Ekphrasis, and the Google spell checker. We report the precision, recall and F1 score of racist and sexist classification respectively, and the macro-averages (to evaluate the overall performance).

Table 6: Hate speech detection results with different spelling corrections

Category      | Metric    | Original | Our system | PyEnchant | Ekphrasis | Google
racist        | Precision | 0.630    | 0.640      | 0.630     | 0.630     | 0.569
racist        | Recall    | 0.617    | 0.681      | 0.617     | 0.617     | 0.702
racist        | F1 score  | 0.623    | 0.660      | 0.623     | 0.623     | 0.628
sexist        | Precision | 0.641    | 0.630      | 0.629     | 0.629     | 0.649
sexist        | Recall    | 0.775    | 0.767      | 0.790     | 0.790     | 0.759
sexist        | F1 score  | 0.701    | 0.692      | 0.701     | 0.701     | 0.700
Macro average | Precision | 0.721    | 0.721      | 0.718     | 0.718     | 0.703
over all      | Recall    | 0.741    | 0.757      | 0.743     | 0.743     | 0.759
categories    | F1 score  | 0.727    | 0.737      | 0.727     | 0.727     | 0.727
The best performance for each metric is highlighted in the table. Compared with the classification performance on the test data with spelling errors, for the racist category our approach improves the precision by 1%, the recall by 6.4%, and the F1 score by 3.7%, in absolute points. Both the PyEnchant and Ekphrasis spell checkers improve the recall for the sexist category but decrease the precision, so their corrected forms achieve F1 scores similar to the original test data. The Google spell checker also gives similar F1 scores on both the racist and sexist classes. Our approach outperforms the other baselines in terms of the F1 score for the racist class and the macro F1 score.

For sexism-related tweets, our approach does not improve the results compared to the original test data. Taking a closer look at the nature of the sexist tweets, we notice that they often contain abbreviations which might be taken as misspellings, and that their contextual information is insufficient to decide the appropriate corrections. For example, in the sexist tweet "I'm sorry but if you watch women ufc fights kys", our approach replaces kys with keys, and the trained neural network misclassified the corrected tweet. Another sexism-related tweet is "these nsw promo girls think way too highly of themselves", where nsw is incorrectly replaced with new by our approach.
6 Conclusion

In this study, we showed how malicious spelling errors can deceive profanity and spam detectors. To deal with these malicious misspellings, we proposed a context-sensitive spelling corrector based on word embeddings. Our spell checker is light-weight, unsupervised, and can be easily incorporated into downstream applications. It achieved favorable spelling correction performance when compared with general-purpose spell-checking tools such as PyEnchant, Ekphrasis and the Google spell checker, on both synthetic and real misspellings from different datasets.

References

2018. Abiword. Available at: https://www.abisource.com.

2018. Google search engine. Available at: https://www.google.com.

Sweta Agrawal and Amit Awekar. 2018. Deep learning for detecting cyberbullying across multiple social media platforms. In European Conference on Information Retrieval, pages 141-153. Springer.

Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. 2013. Polyglot: Distributed word representations for multilingual NLP. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 183-192, Sofia, Bulgaria. Association for Computational Linguistics.

Christos Baziotis, Nikos Pelekis, and Christos Doulkeridis. 2017. DataStories at SemEval-2017 Task 4: Deep LSTM with attention for message-level and topic-based sentiment analysis. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 747-754, Vancouver, Canada. Association for Computational Linguistics.

Noah Coad. 2018. Google spell check. Available at: https://github.com/noahcoad/google-spell-check.

Fred J. Damerau. 1964. A technique for computer detection and correction of spelling errors. Communications of the ACM, 7(3):171-176.

Noura Farra, Nadi Tomeh, Alla Rozovskaya, and Nizar Habash. 2014. Generalized character-level spelling error correction. In ACL (2), pages 161-167.

Shaona Ghosh and Per Ola Kristensson. 2017. Neural networks for text correction and completion in keyboard decoding. arXiv preprint arXiv:1709.06429.

Hongyu Gong, Suma Bhat, and Pramod Viswanath. 2017. Geometry of compositionality.

Jigsaw Google. 2017. Perspective. Available at: https://www.perspectiveapi.com/.

Sergey Gubanov, Irina Galinskaya, and Alexey Baytin. 2014. Improved iterative correction for distant spelling errors. In ACL (2), pages 168-173.

Itisha Gupta and Nisheeth Joshi. 2017. Tweet normalization: A knowledge based approach. In Infocom Technologies and Unmanned Systems (Trends and Future Directions) (ICTUS), 2017 International Conference on, pages 157-162. IEEE.

Hany Hassan and Arul Menezes. 2013. Social text normalization using contextual graph random walks. In ACL (1), pages 1577-1586.

Hossein Hosseini, Sreeram Kannan, Baosen Zhang, and Radha Poovendran. 2017. Deceiving Google's Perspective API built for detecting toxic comments. arXiv preprint arXiv:1702.08138.

Fraser Howard. 2008. Web attacks: Modern web attacks. Network Security, 2008(4):13-15.

Dan Jurafsky and James H. Martin. 2014. Speech and Language Processing, volume 3. Pearson, London.

Daniel Jurafsky and James H. Martin. 2017. Distributed representations of words and phrases and their compositionality. In Speech and Language Processing, chapter 5, pages 1-12.

Ryan Kelly. 2015. PyEnchant. Available at: https://github.com/rfk/pyenchant.

Vladimir I. Levenshtein. 1966. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet Physics Doklady, volume 10, pages 707-710.

Fei Liu, Fuliang Weng, Bingqing Wang, and Yang Liu. 2011. Insertion, deletion, or substitution?: Normalizing text messages without pre-categorization nor supervision. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers - Volume 2, pages 71-76. Association for Computational Linguistics.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 26, pages 3111-3119. Curran Associates, Inc.

Abiodun Modupe, Turgay Celik, Vukosi Marivate, and Melvin Diale. 2017. Semi-supervised probabilistics approach for normalising informal short text messages. In Information Communication Technology and Society (ICTAS), Conference on, pages 1-8. IEEE.

Howard L. Morgan. 1970. Spelling correction in systems programs. In Communications of the ACM, pages 90-94.

Thien Huu Nguyen and Ralph Grishman. 2014. Employing word representations and regularization for domain adaptation of relation extraction. In ACL (2), pages 68-74.

Etienne Papegnies, Vincent Labatut, Richard Dufour, and Georges Linares. 2017. Impact of content features for automatic online abuse detection. In International Conference on Computational Linguistics and Intelligent Text Processing.

Aurelia Power, Anthony Keane, Brian Nolan, and Brian O'Neill. 2018. Detecting discourse-independent negated forms of public textual cyberbullying. Journal of Computer-Assisted Linguistic Research, 2(1):1-20.

Eric Sven Ristad and Peter N. Yianilos. 1998. Learning string-edit distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(5):522-532.

Sergio A. Rojas-Galeano. 2013. Revealing non-alphabetical guises of spam-trigger vocables. Dyna, 80(182):15-24.

Bahar Salehi, Paul Cook, and Timothy Baldwin. 2015. A word embedding approach to predicting the compositionality of multiword expressions. In HLT-NAACL, pages 977-983.

Manish Saxena and P. M. Khan. 2015. Spamizer: An approach to handle web form spam. In Computing for Sustainable Global Development (INDIACom), 2015 2nd International Conference on, pages 1095-1100. IEEE.

Kaveh Taghipour and Hwee Tou Ng. 2015. Semi-supervised word sense disambiguation using word embeddings in general and specific domains. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 314-323.

Zeerak Waseem and Dirk Hovy. 2016. Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In Proceedings of the NAACL Student Research Workshop, pages 88-93.

Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. 2009. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1113-1120. ACM.

Zar Zar Wint, Theo Ducros, and Masayoshi Aritsugi. 2017. Spell corrector to social media datasets in message filtering systems. In Digital Information Management (ICDIM), 2017 Twelfth International Conference on, pages 209-215. IEEE.

Zar Zar Wint, Théo Ducros, and Masayoshi Aritsugi. 2018. Non-words spell corrector of social media data in message filtering systems. Journal of Digital Information Management, 16(2).

Xinwang Zhong. 2014. Deobfuscation based on edit distance algorithm for spam filitering. In Machine Learning and Cybernetics (ICMLC), 2014 International Conference on, volume 1, pages 109-114. IEEE.