Ideas for research from social bots to beer - Thesis proposals - Florian Daniel
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Thesis proposals Ideas for research from social bots to beer Florian Daniel florian.daniel@polimi.it February 25, 2019
Detection of harmful social bots Identification of harmful communication patterns Conversational screen readers Domain-specific content extraction
Empirical study shows: bots may cause harm to humans Examples When abuse happens What bots do What rce, Cu sto merS vc Disclose What else can 1 eCom me else can Chat Talk with user AASlang sensitive facts go wrong? 10 Boo Denigrate s tJui they do? Redirect user SM c Ss e ex a Be grossly 3 m Pu offensive Regulated o Ore Be indecent 4 by law Tay MS or obscene Write post DeathThreat 2 o Be threatening Geic Post Comment post Da Make false 6 t ing allegations SethRich Iva Forward post Sp na am Action Abuse Deceive Pola W rBo Spam ise t Like message Instag Sh res ib e Spread Endorse misinformation Trump, Colluding Not regulated Follow user Bots Mimic interest Clone profile , Jason Slotkin6 InstaClone Participate Create user Invade space F. Daniel, C. Cappiello, B. Benatallah. Bots Acting Like Humans: Understanding and Preventing Harm. IEEE Internet Computing, 2019, accepted for publication. https://ieeexplore.ieee.org/document/8611348
Detection of harmful social bots Tweets Account How do we identify and classify bots Harm1 according to the Human harm they may Harm2 HarmN cause?
Not Safe For Work Profile picture stolen from existing people (usually women) Personalized profile preferences aimed at emulating actual users Catchy profile description paired with unsafe URL Tweets aim to redirect users to untrusted URLs Lorenzo Cannone, Matteo Di Pierro. Detection and Classification of Harmful Bots in Human-Bot Interactions on Twitter. MSc Thesis, Politecnico di Milano, December 2018.
News-Spreader Propagandistic profile picture and tile Propaganda High number of interactions hashtags and messages in description Lots of (political) news retweeting Lorenzo Cannone, Matteo Di Pierro. Detection and Classification of Harmful Bots in Human-Bot Interactions on Twitter. MSc Thesis, Politecnico di Milano, December 2018.
Spam-Bot Profile tile with catchy Profile picture with messages corporate logo Corporate link High number of tweets in description with high frequency paired with messages of job offers (product sales) Repetitive tweets aimed at spamming URLs Lorenzo Cannone, Matteo Di Pierro. Detection and Classification of Harmful Bots in Human-Bot Interactions on Twitter. MSc Thesis, Politecnico di Milano, December 2018.
Fake-Follower Optional profile picture Poor profile (usually missing) customization Poor Low number of content posting (usually null) description Following as main interaction (usually empty) Optional (re)tweeting activity Lorenzo Cannone, Matteo Di Pierro. Detection and Classification of Harmful Bots in Human-Bot Interactions on Twitter. MSc Thesis, Politecnico di Milano, December 2018.
Thesis ingredients Creation of datasets Multi-class ensemble classifier Feature selection and engineering Can we add more types of harm? How to train classes as users use BotBuster? Web app Figure 7.3: Client - Server architecture Inception neural network. Users with less than 10 tweets, or less than 10
Identification of harmful communication patterns Bot code repositories GitHub Which potential harms can we identify inside the code of the bots?
ne ts n s ce fac tio on e bs ma siv ati ive ro lleg en for sit g to t Harmful code off ce n s isin en ea file ere i ten en pa sly es dm pro s e int ec ea fal es rat ive s s e gro ind clo thr am ic rea ne nig ke us ad ce m Clo patterns Dis Ma Inv De Sp Sp Ab Be Be Pattern De Be Action Mi Follow Indiscriminate follow Whitelist-based follow Blacklist-based follow Twitter Phantom follow Like Indiscriminate like Whitelist-based like Python Blacklist-based like Mass like Tweet Fixed-content tweet Next: patterns search AI-generated tweet Trusted source tweet engine + web site for Mention Indiscriminate mention users Opt-in mention Targeted mention Whitelist-based mention Blacklist-based mention What else can we Retweet Indiscriminate retweet learn from the code of Whitelist-based retweet Blacklist-based retweet bots? Is it possible to Mass retweet Talk to Indiscriminate talk trace back from Fixed-content talk AI-generated talk messages to code? Talk with opt-in Targeted talk Pause Mimic human Andrea Millimaggi and Florian Daniel. On Social Satisfy API contraints Bots Behaving Badly: Empirical Study of Code Patterns on GitHub. Submitted for publication to Store Store persistently ICWE 2019, under review. Enables Prevents Vulnerable to content abuse Vulnerable to trust abuse
Conversational screen readers Amazon Alexa Google Assistant Can we extract conversational knowledge from webpages? What about “talking” to websites?
Domain-specific content extraction How effectively can we extract recipes from free text? And how do we do it?
Typical data science steps Domain understanding Data collection Manual inspection of data Hypotheses formulation Feature engineering, data labeling Algorithm engineering (from AI/machine learning to statistics) Online tool Validation and hypotheses verification
Soft skills Proposals Template Florian Daniel http://www.floriandaniel.it
You can also read