Alexa in Phishingland: Empirical Assessment of Susceptibility to Phishing Pretexting in Voice Assistant Environments
2021 IEEE Symposium on Security and Privacy Workshops

Alexa in Phishingland: Empirical Assessment of Susceptibility to Phishing Pretexting in Voice Assistant Environments

Filipo Sharevski, School of Computing, DePaul University, Chicago, IL, fsharevs@cdm.depaul.edu
Peter Jachim, School of Computing, DePaul University, Chicago, IL, pjacim@depaul.edu

© 2021, Filipo Sharevski. Under license to IEEE. DOI 10.1109/SPW53761.2021.00034

Abstract—This paper investigates what cues people use to spot a phishing email when the email is spoken back to them by the Alexa voice assistant instead of read on a screen. We configured Alexa to read these emails to a sample of 52 participants and asked for their phishing evaluations. We also asked a control group of another 52 participants to evaluate these emails on a regular screen to compare the plausibility of phishing pretexting in voice assistant environments. The results suggest that Alexa can be used for pretexting users that lack phishing awareness to receive and act upon a relatively urgent email from an authoritative sender. Inspecting the sender (“authority cue”) and relying on their personal experiences helped participants with higher phishing awareness to use Alexa for a preliminary email screening to flag an email as potentially “phishing.”

Index Terms—Voice assistant security, IoT security, phishing susceptibility, Amazon Alexa

1. Introduction

Over the years we have had to condition ourselves to spot a phishy URL or an overtly persuasive email narrative. The act of spotting phishing is mainly based on visual inspection of the email through a computer or smartphone interface. Users recently got an alternative interface for email inspection in voice assistants like Amazon Alexa or Google Home [1]. The content in the voice assistant ecosystem is usually “spoken back” to users, and as such, users rely mostly on audio inspection. The voice/audio interface entails different patterns of interaction, which could affect the way users inspect suspicious content like email. The difference in email inspection opens an opportunity, we suspect, for adversaries to target users’ susceptibility to phishing emails.

Spoken back emails, even if phishing, are mostly harmless for users: they cannot click on the phishing links or download any damaging attachments. This type of interaction is different from the one where a third-party application for Alexa tries to lure the user to spell their password directly to the voice assistant [2], which is a phishing variant called voice phishing or “vishing”. Instead of crafting a third-party skill to try to retrieve the user’s password, an adversary can send a preliminary email with “pretexting” content, e.g. assume the guise of an “authoritative” figure and create a believable scenario that elicits a user to expect a follow-up email and perform some action with a URL included in that email. The adversary can, in addition, ask a user of the native Alexa email skill to reply in some form (e.g. reply “yes”). If they do, they could be a potential target for phishing emails coming from a particular authoritative sender [3]. If they ignore or delete the email, the adversary can use another strategy, e.g. “urgency” or “reciprocity” [4]. In this way, the adversary can exploit the trust users have in Amazon Alexa and its native email function to perform “pretexting” without risking user suspicion in third-party skills [5] or detection by tools like SkillExplorer [6].

In this paper we investigated the preconditions for an adversary to utilize the Alexa native email function for “pretexting” toward phishing, that is, to find out the susceptibility of users to phishing when such an email is spoken back to them. Alexa delivered three separate emails to a sample of 52 participants who were asked to decide whether each of the emails is phishing or not and to elaborate on the cues they based that decision on. To compare the users’ susceptibility to spoken back emails versus the susceptibility to visually inspected emails, we exposed a control group of another 52 participants to the same email stimuli and collected their feedback. We measured all 104 participants’ “phishing susceptibility” using the SeBIS scale [7] and used the scores together with their responses to determine the effectiveness of various “pretexting” and phishing tactics. This work provides the first empirical assessment of susceptibility to phishing attacks when users listen to an email rather than look at it.
2. Susceptibility to Phishing

2.1. Cognitive Vulnerabilities

The goal of a phishing campaign is to elicit a decision from a vulnerable user that they otherwise wouldn’t make under normal circumstances. Exploiting one’s cognitive vulnerabilities through phishing rests on three fundamental principles: epistemic asymmetry, technocratic dominance, and teleological replacement [8]. Epistemic asymmetry occurs when an adversary employs a persuasion strategy to manipulate the heuristics a user employs to decide how to act on an email [9], [10]. These heuristics are influenced by the following persuasion strategies [4]: authority, commitment, liking, reciprocation, scarcity, and social proof. These strategies have a varying effect on users, and studies have shown that users are significantly more susceptible to emails using the authority and scarcity principles [3], [11], [12].

Technocratic dominance occurs when an adversary mimics the cues a user employs to assess the authenticity of an email, e.g. spoofing “from” addresses or including deceptive images/logos/banners, deceptive links, “https,” and spelling and grammar [13], [14]. Studies have found that the use of convincing logos and letterheads makes it significantly harder for an average target to detect a phish [15]. Scarcity of time, or urgency, was found to short-circuit the resources available for assessing the technical cues to detect phishing [16]. Teleological replacement occurs when the adversary manages to exploit users to act against their better judgment [16], [17]. For example, when people are stressed or under pressure, overloaded with information, or heavily focused on a primary task, their ability to notice suspicious emails is reduced [18]. Even if suspicious emails are noticed, people may not feel they have sufficient time, resources, or means to further process any persuasion or technical cues. An attacker can either “pretext” a target with overloading information or utilize outside information to infer their cognitive overload and choose a particular moment to send a phishing email. The phishing incident with John Podesta, Hillary Clinton’s campaign manager, provides evidence for such a strategy that ultimately altered his behaviour and resulted in yielding his Gmail login credentials [19].
2.2. Phishing Preparation: Pretexting

Pretexting is a commonly used technique in social engineering attacks in general and phishing in particular. In pretexting, the adversary uses a pre-designed scenario to legitimize their interactions with potential victims, reduce their suspicions, and eventually mislead them to click on a phishing URL or download a malicious attachment [20]. Studies have found that phishing attacks coupled with pretexting are more likely to victimize message recipients and result in teleological replacement with a higher success rate [21], [22], [16]. Usually the pretexting scenario contextualizes the phishing principles of persuasion in a context relevant for the phishing victim, for example, crafting customer survey emails that employ the principle of reciprocation for, say, Verizon and T-Mobile customers so as to resemble the usual communication patterns each cellphone carrier has with its customers. However, little has been done in exploring the possible pretexting potential of alternative interfaces and communication channels with the potential victim. In this paper, we introduce a novel type of epistemic asymmetry that utilizes voice assistants like Alexa as a form of technocratic dominance over target users that helps an adversary achieve teleological replacement through “pretexting”.

3. Phishing Pretexting with Alexa

An email skill on the voice assistant allows users to ask Alexa to read, flag, respond to, delete, or search with a keyword for a particular email in their inbox on their behalf. Some of these emails could be preliminary phishing with “pretexting” content, e.g. assuming the guise of an “authoritative” figure. The purpose of this email, sent by a phishing attacker, is to condition the target user, i.e. create a believable scenario that elicits the target user to expect a follow-up email and perform some action (e.g. follow a URL). This email also asks the target user to reply in some form (e.g. reply “yes”) to “confirm” that they have received this preliminary notification from the authoritative sender. Any action like confirmation with “yes” gives an indication that the target user might be susceptible to phishing emails coming from an authoritative sender [3].

For example, an adversary can send a preliminary email that appears to be from a user’s IT department asking the user to reply “yes” if they don’t want to opt out from a large mail server migration. If the target user replies “yes,” the adversary can follow up with an actual phishing email and ask the user to confirm their opt-out on “the following link,” where the link is actually a fraudulent URL. Certainly, there is no guarantee that the user will actually click the link, or recall they agreed on such an action when conversing with Alexa and succumb to the reciprocity principle. Nevertheless, the adversary can take the “pretexting” further and add a deadline for the reply before the deactivation takes place automatically. If the target user in this case replies “yes,” the adversary gets the indication that “urgency” might work and rushes to send the actual phishing email to the user.

Even if all of this is futile and doesn’t work for the adversary, it might be beneficial for the user. Users, many of whom have been trained to spot phishing emails or have experienced them in the past, could be “incentivized” by Alexa to scrutinize any email in more detail on a screen and neutralize the phishing effort. Another reason for suspicion of such a pretexting email is a type of phone scam called “Can you hear me?” that lures the victims to respond with “yes” [23]. The goal of the scam, in a similar way to the Alexa vishing attack [2], is to obtain an approval in the form of a voice signature that can later be used by the adversaries to pretend to be the victim and authorize fraudulent activities. Any immediate action such as giving a confirmation “out of the blue” could potentially trigger a phishing flag for users aware of phishing and vishing. In other words, Alexa could in and of itself help users eliminate any epistemic asymmetry, bring the technocratic dominance to the side of the users, and deny any effort for teleological replacement through “pretexting” by the phishing attackers.
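To make the adaptive logic of this section concrete, the sketch below models the adversary’s two-step probing as a small decision procedure. It is a hypothetical illustration of the strategy described above, not code from the study; the helper functions (send_email, got_yes_reply) are assumed stand-ins for whatever delivery and reply-tracking infrastructure an adversary would use, and the message bodies are made up.

```python
# Hypothetical sketch (not from the study): the adaptive "pretexting"
# loop described in Section 3, modeled as a two-step probe.

def send_email(target: str, subject: str, body: str) -> None:
    # Stand-in for the adversary's mail delivery (assumed).
    print(f"-> {target}: {subject}")


def got_yes_reply(target: str) -> bool:
    # Stand-in for polling the inbox for a "yes" reply (assumed).
    return False


def pretext_campaign(target: str) -> None:
    # Probe 1: "authority" -- a believable IT-department notice that asks
    # for a simple "yes" reply instead of a link click.
    send_email(target, "Mail server migration",
               "Reply 'yes' if you do not want to opt out of the migration.")
    if got_yes_reply(target):
        # "Yes" signals susceptibility to the authority principle; only now
        # does the actual phishing email with the fraudulent URL go out.
        send_email(target, "Confirm your opt-out",
                   "Confirm your choice on the following link: ...")
        return

    # Probe 2: escalate with "scarcity" (urgency) -- add a deadline before
    # the action supposedly happens automatically.
    send_email(target, "Action required before deactivation",
               "Reply 'yes' before the deactivation takes place automatically.")
    if got_yes_reply(target):
        send_email(target, "Final step",
                   "Complete the confirmation on the following link: ...")


pretext_campaign("target@example.edu")
```

In this framing, each “yes” is a cheap probe that tells the adversary which persuasion principle the target responds to before any fraudulent link is ever exposed.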
4. Alexa in Phishingland

Our study seeks to understand how the change in the interface, from visual to voice/audio interaction, affects the inspection of emails by end users. Our objective was to see if an adversary could infer a phishing susceptibility strategy for a user by sending several “pretext” emails through Amazon Alexa and observing the user’s behaviour. We set out to answer the following research questions:

Research Question 1a: What action does one take to respond to an email spoken back by Alexa?

Research Question 1b: If one’s action is to ask Alexa to reply on their behalf to an email, what are the most effective influencing strategies employed in the email formatting that resulted in a successful “pretexting”?

Research Question 2a: What cues (audio, experience, prior phishing knowledge) does one use to inspect an email spoken back by Alexa and decide whether the email is phishing or not?

Research Question 2b: What cues (audio, experience, prior phishing knowledge) does one use to decide to ask Alexa to reply on their behalf to a spoken back email?

Research Question 3a: Is there a difference in susceptibility to pretexting and, consequently, phishing between emails spoken back by Alexa and emails visually inspected on a screen?

Research Question 3b: How do the cues one uses to inspect an email differ between an email spoken back by Alexa and a visually inspected email?
With IRB approval we recruited participants affiliated with a university in the US. The inclusion criteria required them to be at least 18 years old and to have interacted with a voice assistant. A sample of 104 participants agreed to be in the study. Half of them were randomly assigned to interact with Alexa and the other half were given emails for visual inspection (a control group). The emails relate to activities usually communicated via email, e.g. university communication, a phone carrier message (Verizon), and a payment provider notification (Venmo). The emails were adapted from previous phishing susceptibility studies [24], [25]. We modified the email text to include only the option of replying “yes” instead of the URL for the Alexa group, corresponding to the “pretexting” strategy we elaborated in Section 3. Each participant in the Alexa group prompted Alexa with “Alexa, read my emails from today.” The participants in the control group were simply shown the emails on a screen. After each email, we used the phishing susceptibility questionnaire from [24], [26], shown in Table 1. At the end, the participants completed the SeBIS questionnaire to measure their phishing awareness [7]. The SeBIS scale measures a user’s self-reported intent to comply with “good” security practices such as paying attention to contextual phishing cues.

TABLE 1. PHISHING SUSCEPTIBILITY QUESTIONNAIRE

User Response: What action will you take as the next step?
1. Ask Alexa to reply “yes” to this email (“Respond to this email by following the link” for the control group)
2. Flag and check the email on a computer or smartphone screen later (only the Alexa group)
3. Reach out to the sender separately
4. Ignore the email
5. Delete the email
Authority: Do you recognize the sender? (Yes/No)
Scarcity: Did the email convey a sense of urgency? (Yes/No)
Personal Experience: Have you ever received an email like this from this or similar senders? (Yes/No)
Detection: Does the email appear to be legitimate? (Yes/No/Maybe)
Inspection: What cues did you use to make your determination? (Open ended)
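For context on the awareness scores reported in the next section (e.g., M = 3.7147), the following is a minimal sketch of how a SeBIS-style score can be computed, assuming the scale’s usual format of 5-point Likert items averaged into a single value. The actual item wording and any reverse-coded items from [7] are not reproduced here, and the example responses are fabricated.

```python
# Minimal sketch of SeBIS-style scoring: average 5-point Likert responses
# ("Never" = 1 ... "Always" = 5), flipping any reverse-coded items. The
# item set and reverse-coding follow the published scale and are omitted
# here; the responses below are made-up examples.

from statistics import mean

def sebis_score(responses, reverse_coded=()):
    # Flip reverse-coded items on a 1-5 scale (1 <-> 5), then average.
    adjusted = [6 - r if i in reverse_coded else r
                for i, r in enumerate(responses)]
    return mean(adjusted)

# One (fabricated) participant: eight items, with item index 4 reverse-coded.
print(round(sebis_score([4, 5, 3, 4, 2, 5, 4, 3], reverse_coded={4}), 4))
```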
5. Results

5.1. Research Question 1

5.1.1. A: Responding to a Spoken Back Email. Table 2 shows the distribution of responses chosen for each of the three emails. Roughly 9% of the participants in the Alexa group indicated they would ask Alexa to reply “yes” on their behalf. This result is in line with the percentage of people that fall for phishing emails (click rate) [27], given that Alexa group participants scored high on the SeBIS scale on average (M = 3.7147, SD = .658). Inspecting the spoken back email prompted most of the participants to flag the emails and check on the screen or simply ignore the email. This is expected, given that users are yet to grow accustomed to utilizing Alexa in its full assisting capacity [28]. The first email, employing both the principles of authority and scarcity, prompted 5.8% of the participants to reach out and call the “Information Security” department. This result suggests that a combination of the persuasion principles elicits a response from the user (potentially contrary to the adversary’s objectives). Roughly half of the participants indicated that they would delete the second email, probably numb from receiving too many email surveys.

TABLE 2. USER RESPONSES TO SPOKEN BACK EMAILS

                              Email Stimuli
Response                      First    Second   Third
Alexa, reply “yes”            9.6%     8.7%     8.7%
Flag and check on a screen    34.7%    53.8%    68.3%
Reach out (call)              5.8%     1%       1%
Ignore the email              15.4%    27.9%    18.3%
Delete the email              1.9%     46.2%    3.8%

5.1.2. B: Pretexting and Influencing Strategies. Table 3 gives the distribution of the “pretexting” effectiveness in relation to the influencing strategies for the portion of participants susceptible to phishing pretexting, i.e. those who replied “yes” to the email stimuli. The scarcity and authority principles worked well for the first email, but the anchoring in personal experience worked half the time for these participants. This finding confirms the previous evidence that authority and scarcity are indeed effective persuasion strategies [3] when it comes to phishing. Regarding the personal experience, it is expected that participants had not necessarily received, or perhaps had not paid attention to, university administrative emails in the past. For the second email, the authority and the personal experience were the most successful in the pretexting, expectedly, because the email was from a known sender. The scarcity principle worked roughly half the time. Scarcity didn’t work at all for the third email; the authority of the sender and the personal experience of the participants are what worked well. Using a payment service like Venmo as a disguise works as an effective adaptation of the old phishing emails from banks (Wachovia, Bank of America) or PayPal to the younger generation (the average age of the participants in the sample was 28 years) [26].
TABLE 3. “PRETEXTING” AND INFLUENCING STRATEGIES

                        Email Stimuli
Pretexting              First    Second   Third
Scarcity (Urgency)      100%     57.1%    0%
Authority               80%      100%     100%
Personal Experience     50%      100%     100%

5.2. Research Question 2

5.2.1. A: Audio Inspection Cues. The decisions for each of the spoken back emails are given in Table 4. For the first email, participants were equally divided between the email being phishing, not phishing, or potentially phishing. The participants who incorrectly decided the email is not phishing scored the lowest on the phishing awareness scale (SeBIS=3.5911), while the undecided scored the highest on average (SeBIS=3.8526). Those that said the email was phishing scored SeBIS=3.75. When asked what cues they used to inspect the first email spoken back by Alexa, the participants that incorrectly identified the email as not phishing predominantly referred to the authority principle of influence, saying that the “source is trustworthy,” “the sender is valid,” and that the email allowed them to “safely ignore it.” The participants who correctly identified the email as phishing equally referred to the scarcity principle of influence (saying “the email seems urgent,” “it prompted me to reply immediately”) and personal experience (asking “why would I or someone else deactivate a spam filter”). The participants that were undecided questioned the authority of the sender (“seems like it is from our university, but I don’t know that department”), the urgency of the email (“it said I can respond right here right now”), and also relied on their personal experience (“I am not sure who would’ve initiated this email”).

TABLE 4. DETECTION: SPOKEN BACK EMAILS

                  Email Stimuli
Detection         First    Second   Third
Not Phishing      30.7%    13.5%    36.5%
Phishing          30.7%    63.5%    57.7%
Maybe Phishing    38.6%    23%      5.8%
For the second email, more than half of the participants incorrectly identified it as being phishing (63.5%), with not phishing at 13.5% and potentially phishing at 23%. The participants who decided the email is not phishing scored the lowest on the phishing awareness scale on average, SeBIS=3.5667. Those participants who said it was phishing or were not entirely sure scored higher on average: SeBIS=4.007 and SeBIS=4.006, respectively. When asked what cues they used to inspect the second email spoken back by Alexa, the participants that correctly identified that the email was not phishing predominantly referred to the authority principle of influence, saying that it “sounds like something Verizon will ask.” The participants who incorrectly identified the email as phishing referred to the reciprocity principle of influence (saying “Verizon usually gives incentives,” “there is no indication what I will get in return”). The participants that were undecided questioned the authority of the sender (“someone could easily fake Verizon’s email”), the urgency of the email (“it gives a week to respond”), and relied on their personal experience (“usually a Verizon survey email is signed by a representative”).

For the third email, slightly more than half of the participants correctly identified it as being phishing (57.7%), with not phishing at 36.5% and potentially phishing at 5.8%. The participants who decided the email is phishing scored the highest on the phishing awareness scale (SeBIS=3.92). Those participants who said it was not phishing or were not entirely sure scored lower on average: SeBIS=3.7 and SeBIS=3.2, respectively. When asked what cues they used to inspect the third email spoken back by Alexa, the participants that incorrectly identified the email as not phishing referred to the authority principle of influence (“The email is from Venmo,” “Venmo is a reputable company that doesn’t send scam emails”) and personal experience (“I have received similar emails from Venmo,” “Venmo sends me these all the time”). The participants who correctly identified the email as phishing predominantly referred to their personal experience, saying that “Venmo states the amount of the transaction directly in the email” and that “Venmo sends notifications in the app.” The personal experience was also the main factor for the participants that were undecided, saying that the email “didn’t sound the same as the other emails from Venmo” or “they don’t ask me to reply to get a receipt, the receipt is there in the email.”

5.2.2. B: Pretexting Cues. Table 5 gives the breakdown of the phishing decisions made by the participants who asked Alexa to reply “yes” on their behalf. The authority and scarcity in the first email were sufficient to persuade 70% of this group of participants to incorrectly decide the email was not phishing. These participants scored the lowest on the phishing awareness scale: SeBIS=3.77. The pretexting was suspicious to 20% of them, and only 10% of them correctly decided that someone was trying to condition them to take an action. The correct and suspecting participants scored the highest on the phishing awareness scale: SeBIS=4.21 and SeBIS=4.20, respectively. The authority and personal experience helped 85.7% of the participants to correctly identify the second email as not phishing, scoring the highest on the phishing awareness scale (SeBIS=4.77). Only 14.3% of the participants suspected the email is phishing, scoring the lowest on the phishing awareness scale (SeBIS=3.00). The authority and personal experience were persuasive enough for 75% of these participants to incorrectly identify the third email as not phishing. Their scores were the lowest on the phishing awareness scale (SeBIS=3.00). Equal shares of 12.5% of the participants suspected or correctly decided that the email is phishing, and their scores were the highest (SeBIS=4.2 on average).

TABLE 5. PRETEXTING AND EMAIL DETECTION

                  Email Stimuli
Pretexting        First    Second   Third
Not Phishing      70%      85.7%    75%
Phishing          10%      0%       12.5%
Maybe Phishing    20%      14.3%    12.5%
5.3. Research Question 3

5.3.1. A: Differences in Susceptibility. To see if there is a difference in susceptibility to pretexting between emails spoken back by Alexa and emails visually inspected on a screen, we performed several Chi-square tests between the control and the Alexa group. Table 6 shows the results of a 2 x 4 Chi-square test comparing the user responses for each of the email stimuli used in the study, respectively. We observed a statistically significant association between the mode of email inspection (spoken back vs. regular) and the user response for the first email, χ2(3) = 16.867, p = 0.002, and for the third email, χ2(3) = 20, p = 0.001. Both of these emails were phishing emails, and in both cases the participants in the Alexa group appear more susceptible to the pretexting than the control group participants to the phishing. This result adds additional evidence that pretexting in voice assistant environments is a viable strategy.

TABLE 6. USER RESPONSE: SPOKEN BACK VS. REGULAR EMAILS

User Response [Email 1]
          Yes/Click   Call   Ignore   Delete   Total
Control   3           6      25       18       52
Alexa     16          10     20       6        52
Total     19          16     45       24       104

User Response [Email 2]
Control   10          4      29       9        52
Alexa     11          1      29       11       52
Total     21          5      58       20       104

User Response [Email 3]
Control   1           2      23       26       52
Alexa     14          4      24       10       52
Total     15          6      47       36       104

Table 7 shows the results of a 2 x 3 Chi-square test comparing the detection decision for each of the email stimuli used in the study, respectively. We observed a statistically significant association between the mode of email inspection (spoken back vs. regular) and the phishing detection success for all three emails: 1) χ2(2) = 18.682, p = 0.001; 2) χ2(2) = 6.504, p = 0.039; and 3) χ2(2) = 30.239, p = 0.001. For the first email, there are more than twice as many correct detections that the email is phishing in the control group compared to the Alexa group. For the second email, there are more than twice as many correct detections that the email is not phishing in the Alexa group. For the third email, there are more than twice as many correct detections that the email is phishing in the control group compared to the Alexa group. There are several implications of these results, namely, that 1) an adversary could pretext a user to believe a phishing email is not phishing when the email is spoken back; and 2) voice assistant users show a much lower rate of falsely flagging a legitimate email as phishing.

TABLE 7. DETECTION: SPOKEN BACK VS. REGULAR EMAILS

Detection [Email 1]
          Phishing   Not Phishing   Maybe   Total
Control   35         2              15      52
Alexa     16         16             20      52
Total     51         18             35      104

Detection [Email 2]
Control   12         12             20      52
Alexa     7          33             12      52
Total     19         53             32      104

Detection [Email 3]
Control   2          4              11      52
Alexa     19         30             3       52
Total     56         37             14      104
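The omnibus tests above can be approximated directly from the published contingency tables. As a sketch, the snippet below (assuming SciPy is available) re-runs the 2 x 4 test on the Email 1 counts from Table 6; small deviations from the reported χ2(3) = 16.867 would reflect rounding or transcription in the printed counts.

```python
# Sketch: re-running the 2 x 4 chi-square test of Section 5.3.1 on the
# Email 1 contingency table printed in Table 6 (rows: control, Alexa;
# columns: yes/click, call, ignore, delete).

from scipy.stats import chi2_contingency

email1 = [[3, 6, 25, 18],    # control group responses
          [16, 10, 20, 6]]   # Alexa group responses

chi2, p, dof, expected = chi2_contingency(email1)
print(f"chi2({dof}) = {chi2:.3f}, p = {p:.4f}")
# Compare with the reported chi2(3) = 16.867, p = 0.002; any discrepancy
# reflects rounding in the published table.
```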
5.3.2. B: Differences in Cues for Inspection. Table 8 shows the difference in relying on personal experience for making a decision about phishing between the spoken back and the regular emails. We observed a statistically significant association between the mode of email inspection (spoken back vs. regular) and the personal experience for the second and the third email: 2) χ2(1) = 7.80, p = 0.005; and 3) χ2(1) = 28.719, p = 0.001. While for the first email there was no difference between the groups, the Alexa group participants had reported higher exposure to business communication emails. The participants in the control group expressed a somewhat different personal experience with customer satisfaction surveys and payment receipts. For the second email they indicated that the “greeting is vague,” and that “customer surveys are usually done via phone at end of service call, no need for an email.” For the third email, they noticed that “e-mails like this would provide the receipt details regardless.” This outcome might be a result of the Alexa group participants’ reliance on the availability heuristic when deciding on a spoken back email: hearing a keyword like “customer satisfaction survey” or a “payment receipt” might have prompted them to relate any experience with email surveys and receipts, bypassing any vagueness in the greeting or the need to hear the details of the receipt. This result indicates that the pretexting strategy should rely on contextually relevant experiences for the user, instead of a simple email blast.

TABLE 8. PERSONAL EXPERIENCE: SPOKEN BACK VS. REGULAR EMAILS

Personal experience [Email 1]
          Yes   No   Total
Control   16    36   52
Alexa     25    27   52
Total     41    63   104

Personal experience [Email 2]
Control   34    18   52
Alexa     46    6    52
Total     80    24   104

Personal experience [Email 3]
Control   21    31   52
Alexa     47    5    52
Total     68    36   104

Table 9 shows the difference in the effectiveness of the authority persuasion strategy between the spoken back and the regular emails. We observed a statistically significant association between the mode of email inspection (spoken back vs. regular) and the authority strategy for all three emails: 1) χ2(1) = 7.558, p = 0.006; 2) χ2(1) = 28.556, p = 0.001; and 3) χ2(1) = 9.43, p = 0.002. More of the Alexa group participants recognized the sender in the first email, but fewer in the second and third email. The participants in the control group suspected the authority of the sender in the first email, indicating that “the email doesn’t seem to come from a .edu address.” They acknowledged, though still suspected, the authority of the second email (Verizon), asking “why the email has an impersonal greeting and not my name.” For the third email, the control group participants visually caught the misspelling of the authoritative sender (“Venmou” instead of “Venmo,” which, when spoken back in the Alexa group, sounds the same). This outcome might be a result of the contextual relevance of the first email for the Alexa group participants (allegedly coming from the university information security department), but not the particular businesses in the second and the third email (not all participants are Verizon and Venmo users). As with the personal experiences above, the results indicate that the pretexting strategy needs to be relevant for the user in the context of their personal day-to-day interactions to allow for application of the authority persuasion principle.
TABLE 9. AUTHORITY: SPOKEN BACK VS. REGULAR EMAILS

Authority [Email 1]
          Yes   No   Total
Control   19    33   52
Alexa     33    19   52
Total     52    52   104

Authority [Email 2]
Control   36    16   52
Alexa     9     43   52
Total     45    59   104

Authority [Email 3]
Control   26    26   52
Alexa     11    41   52
Total     37    67   104

Table 10 shows the difference in the effectiveness of the scarcity persuasion strategy between the spoken back and the regular emails. We observed a statistically significant association between the mode of email inspection (spoken back vs. regular) and the scarcity strategy for the second and the third email: 2) χ2(1) = 40.316, p = 0.001; and 3) χ2(1) = 61.905, p = 0.001. While for the first email there was no difference between the groups, the Alexa group participants had the opposite perception of the scarcity, or the urgency, of the emails than the participants in the control group. The participants in the control group recognized the urgency in the formatting of the first email (“Sense of urgency, link is suspicious since it’s not .edu”) and the second email (“The time limit alludes to a bit of urgency”). The inability to contextualize the urgency of the emails in the Alexa group might be a result of the audio interface itself, which doesn’t allow for immediate action on the email like the regular computer interface does. The results indicate scarcity might not be applicable for pretexting in voice assistant environments.

TABLE 10. SCARCITY: SPOKEN BACK VS. REGULAR EMAILS

Scarcity [Email 1]
          Yes   No   Total
Control   29    23   52
Alexa     32    20   52
Total     61    43   104

Scarcity [Email 2]
Control   18    34   52
Alexa     49    3    52
Total     67    37   104

Scarcity [Email 3]
Control   8     44   52
Alexa     48    4    52
Total     56    48   104

6. Discussion and Conclusion

Our analysis shows that the percentage of participants susceptible to “pretexting” through Alexa is non-negligible. Not surprisingly, for the emails that were actually phishing, these participants showed relatively low phishing awareness scores on average. The lack of phishing awareness impaired their ability to correctly spot the emails as phishing when they were spoken back by Alexa. Our further analysis suggested that this particular outcome results because the first email successfully employed the authority principle of influence, while the authority and the personal experience were the most successful in the pretexting for the third email. The comparison between the Alexa and the control group of participants indicated that pretexting in voice assistant environments might be a threat worth exploring, especially when the pretexting strategy relies on contextually relevant experiences and day-to-day interactions. We must note that the results do not serve as an authoritative test but rather as an early exploration of new avenues for phishing pretexting. As such, we urge readers to take the results as indicative of the said threat, which we certainly believe needs further investigation.
A significant portion of the participants utilized Alexa to their advantage as a form of a potential early phishing detection buffer, indicating that they are inclined to flag the emails and check them later on a computer or smartphone screen. For the first email, 5.8% of the participants indicated they would directly call the department that sent the email to ask what it is about. The majority of them correctly identified the first email as phishing, but half of them failed to make the correct decision for the third email, despite scoring the highest on the phishing awareness score in both cases. Most of them exhibited “better safe than sorry” behaviour with the second email, suspecting that the email is possibly phishing. A possible explanation for the lapse in judgement for the third email is the lack of urgency in the email formatting, indicating that the absence of known influencing strategies could be useful for “pretexting” through Alexa for users who are highly aware of phishing. Attention needs to be brought to the participants who said that they would ignore the emails altogether. An additional analysis revealed that those who chose to ignore the first and the second email scored the lowest on the phishing awareness scale, but those that ignored the third showed the highest score. We suspect that this might be a result of higher exposure to financial phishing campaigns than to institutional or communications ones. People probably tend to be more vigilant about their bank accounts than their institutional or cellphone service accounts. A portion of the participants decided to delete the phishing emails, suggesting Alexa’s potential utilization as a phishing removal assistant.

In general, the alternative audio/voice interaction interface has the potential to be used for pretexting based on the authority principle and, to a lesser extent, the scarcity principle, but the relatively higher number of incorrect decisions and decisions of “potentially phishing” suggests that users in our study relied on their personal experience when talking to Alexa about emails. This makes sense from a human-computer interaction perspective, given that users tend to personify Alexa in such interactions as reading personal emails and place a high level of trust in the voice assistant [29].
We utilized a relatively younger-leaning sample in our study, and a more representative population might have a different approach towards handling phishing emails and interacting with Alexa. We didn’t measure the frequency of interaction with voice assistants, but studies suggest that the personification of and interaction with Alexa depend on it [29], [26]. It will be interesting in the future to explore the relationship between one’s susceptibility to phishing “pretexting” and their interaction habits with Alexa. Our study utilized three sample email stimuli that were customized for the audio interaction with Alexa. Different emails from different senders, perhaps including actual fraudulent links, might yield different results. Hearing an email that wants you to click on a link could trigger a phishing deception judgment much more frequently than hearing an email that asks you to reply “yes.” This is certainly an avenue of research that we will explore in the future to yield a more nuanced understanding of users’ phishing behaviour.

References

[1] H. Chung, M. Iorga, J. Voas, and S. Lee, “Alexa, Can I Trust You?” Computer, vol. 50, no. 9, pp. 100–104, 2017.
[2] Security Research Labs, “Smart Spies: Alexa and Google Home expose users to vishing and eavesdropping,” https://srlabs.de/bites/smart-spies/, December 2019.
[3] T. Lin, D. E. Capecci, D. M. Ellis, H. A. Rocha, S. Dommaraju, D. S. Oliveira, and N. C. Ebner, “Susceptibility to spear-phishing emails: Effects of internet user demographics and email content,” ACM Trans. Comput.-Hum. Interact., vol. 26, no. 5, Jul. 2019. [Online]. Available: https://doi.org/10.1145/3336141
[4] R. B. Cialdini, Influence: The Psychology of Persuasion, rev. ed. New York, NY: Collins, 2007. [Online]. Available: http://cds.cern.ch/record/2010777
[5] D. J. Major, D. Y. Huang, M. Chetty, and N. Feamster, “Alexa, who am I speaking to? Understanding users’ ability to identify third-party apps on Amazon Alexa,” 2019.
[6] Z. Guo, Z. Lin, P. Li, and K. Chen, “SkillExplorer: Understanding the behavior of skills in large scale,” in 29th USENIX Security Symposium (USENIX Security 20). USENIX Association, Aug. 2020, pp. 2649–2666. [Online]. Available: https://www.usenix.org/conference/usenixsecurity20/presentation/guo
[7] S. Egelman, L. F. Cranor, and J. Hong, “You’ve been warned: An empirical study of the effectiveness of web browser phishing warnings,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ser. CHI ’08. New York, NY, USA: Association for Computing Machinery, 2008, pp. 1065–1074. [Online]. Available: https://doi.org/10.1145/1357054.1357219
[8] H. Joseph, “Social engineering in cybersecurity: The evolution of a concept,” Computers & Security, vol. 73, pp. 102–113, 2018.
[9] A. Ferreira, L. Coventry, and G. Lenzini, “Principles of persuasion in social engineering and their use in phishing,” in Human Aspects of Information Security, Privacy, and Trust, T. Tryfonas and I. Askoxylakis, Eds. Springer, 2015, pp. 36–47.
[10] A. van der Heijden and L. Allodi, “Cognitive triaging of phishing attacks,” in 28th USENIX Security Symposium (USENIX Security 19). Santa Clara, CA: USENIX Association, Aug. 2019, pp. 1309–1326. [Online]. Available: https://www.usenix.org/conference/usenixsecurity19/presentation/van-der-heijden
[11] O. Zielinska, A. Welk, C. B. Mayhorn, and E. Murphy-Hill, “The persuasive phish: Examining the social psychological principles hidden in phishing emails,” in Proceedings of the Symposium and Bootcamp on the Science of Security, ser. HotSoS ’16. New York, NY, USA: Association for Computing Machinery, 2016, p. 126. [Online]. Available: https://doi.org/10.1145/2898375.2898382
[12] G. D. Moody, D. F. Galletta, and B. K. Dunn, “Which phish get caught? An exploratory study of individuals’ susceptibility to phishing,” European Journal of Information Systems, vol. 26, no. 6, pp. 564–584, 2017.
[13] C. Iuga, J. R. C. Nurse, and A. Erola, “Baiting the hook: factors impacting susceptibility to phishing attacks,” Human-centric Computing and Information Sciences, vol. 6, no. 1, p. 8, 2016.
[14] K. Parsons, M. Butavicius, M. Pattinson, D. Calic, A. Mccormac, and C. Jerram, “Do users focus on the correct cues to differentiate between phishing and genuine emails?” 2016.
[15] M. Blythe, H. Petrie, and J. A. Clark, “F for fake: Four studies on how we fall for phish,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ser. CHI ’11. New York, NY, USA: Association for Computing Machinery, 2011, pp. 3469–3478. [Online]. Available: https://doi.org/10.1145/1978942.1979459
[16] A. Vishwanath, T. Herath, R. Chen, J. Wang, and H. R. Rao, “Why do people get phished? Testing individual differences in phishing vulnerability within an integrated, information processing model,” Decision Support Systems, vol. 51, no. 3, pp. 576–586, 2011. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S016792361100090X
[17] J. S. Downs, M. B. Holbrook, and L. F. Cranor, “Decision strategies and susceptibility to phishing,” in Proceedings of the Second Symposium on Usable Privacy and Security, ser. SOUPS ’06. New York, NY, USA: Association for Computing Machinery, 2006, pp. 79–90.
[18] C. Koumpis, G. Farrell, A. May, J. Mailley, M. Maguire, and V. Sdralia, “To err is human, to design-out divine: Reducing human error as a cause of cyber security breaches,” Human Factors Working Group Complementary White Paper, 2007.
[19] E. Lipton, D. Sagner, and S. Shane, “The Perfect Weapon: How Russian Cyberpower Invaded the U.S.,” https://www.nytimes.com/2016/12/13/us/politics/russia-hack-election-dnc.html, December 2016.
[20] K. D. Mitnick and W. L. Simon, The Art of Deception: Controlling the Human Element of Security. John Wiley & Sons, 2003.
[21] X. R. Luo, W. Zhang, S. Burd, and A. Seazzu, “Investigating phishing victimization with the heuristic-systematic model: A theoretical framework and an exploration,” Computers & Security, vol. 38, pp. 28–38, 2013, Cybercrime in the Digital Economy. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0167404812001927
[22] M. Workman, “Wisecrackers: A theory-grounded investigation of phishing and pretext social engineering threats to information security,” Journal of the American Society for Information Science and Technology, vol. 59, no. 4, pp. 662–674, 2008.
[23] Federal Communications Commission, “FCC Warns of ’Can You Hear Me’ Phone Scams,” March 2017, https://www.fcc.gov/.
[24] C. Canfield, A. Davis, B. Fischhoff, A. Forget, S. Pearman, and J. Thomas, “Replication: Challenges in using data logs to validate phishing detection ability metrics,” in Thirteenth Symposium on Usable Privacy and Security (SOUPS 2017). Santa Clara, CA: USENIX Association, Jul. 2017, pp. 271–284. [Online]. Available: https://www.usenix.org/conference/soups2017/technical-sessions/presentation/canfield
[25] C. I. Canfield, B. Fischhoff, and A. Davis, “Quantifying phishing susceptibility for detection and behavior decisions,” Human Factors, vol. 58, no. 8, pp. 1158–1172, 2016.
[26] S. Sheng, M. Holbrook, P. Kumaraguru, L. F. Cranor, and J. Downs, “Who falls for phish? A demographic analysis of phishing susceptibility and effectiveness of interventions,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ser. CHI ’10. New York, NY, USA: Association for Computing Machinery, 2010, pp. 373–382. [Online]. Available: https://doi.org/10.1145/1753326.1753383
[27] Verizon, “Data Breach Investigation Report,” online, May 2019, https://enterprise.verizon.com/resources/executivebriefs/2019-dbir-executive-brief.pdf.
[28] J. Lau, B. Zimmerman, and F. Schaub, “Alexa, Are You Listening?: Privacy Perceptions, Concerns and Privacy-seeking Behaviors with Smart Speakers,” Proc. ACM Hum.-Comput. Interact., vol. 2, no. CSCW, pp. 1–31, Nov. 2018. [Online]. Available: http://doi.acm.org/10.1145/3274371
[29] A. Purington, J. G. Taft, S. Sannon, N. N. Bazarova, and S. H. Taylor, ““Alexa is my new BFF”: Social roles, user satisfaction, and personification of the Amazon Echo,” in Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems, ser. CHI EA ’17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 2853–2859. [Online]. Available: https://doi.org/10.1145/3027063.3053246