February 22nd 2019 Joint work with Pascal Dupré, Gili Rusak, Giancarlo Pellegrino and Dan Boneh - Perceptual Ad-Blocking: Meet Adversarial Machine ...

Page created by Darrell Bell


Perceptual Ad-Blocking:
Meet Adversarial Machine Learning

                             Florian Tramèr
                            Palo Alto Networks
                           February 22nd 2019

Joint work with Pascal Dupré, Gili Rusak, Giancarlo Pellegrino and Dan Boneh

The Future of Ad-Blocking?

[Figure: easylist.txt (…markup…, …URLs…) versus "???", with a speech bubble: "This is an ad"]

Human distinguishability of ads
  >   Legal requirement (U.S. FTC, EU E-Commerce)
  >   Industry self-regulation on ad-disclosure

Perceptual Ad-Blocking

§   Ad Highlighter by Storey et al.
    >   Visually detects ad-disclosures
    >   Traditional Computer Vision techniques
    >   Simplified version implementable in Adblock Plus

§   Sentinel by Adblock Plus
    >   Locates ads in Facebook screenshots using neural networks
    >   Not yet deployed


How Secure is Perceptual Ad-Blocking?

[Figure: Jerry uploads malicious content … so that Tom’s post gets blocked]

Outline

§ Perceptual ad-blockers: how they work

§ Attacking perceptual ad-blockers

§ Why defending is hard


How does a Perceptual Ad-Blocker Work?
[Figure: a page at https://www.example.com, containing an ad with an ad disclosure, passes through the pipeline below]

Data Collection and Training → Page Segmentation → Classification → Action

Classification techniques: template matching, OCR, DNNs, object detector networks

Ø Element-based (e.g., find all  tags) [Storey et al. 2017]
Ø Frame-based (segment rendered webpage into “frames”)
Ø Page-based (unsegmented screenshots à la Sentinel)

Building a Page-Based Perceptual Ad-Blocker

§   Sentinel is not yet deployed so we rolled our own

§   Let’s aim bigger than just Facebook!
    (data collection on FB is a pain / privacy issue anyways...)

§   We trained an object detector neural network (YOLO-v3)
    on news websites from all G20 nations
    >   Use filter lists to create a labelled dataset for training
    >   Crop & replace ads for data augmentation (increase data diversity)


Our Ad-Detector in Action

Video taken from 5 websites not used during training

Outline

§ Perceptual ad-blockers: how they work

§ Attacking perceptual ad-blockers

§ Why defending is hard


The Current State of ML

          ML works well on average ≠ ML works well on adversarial data

     *

         *as long as there is no adversary


Adversarial Examples

                                                        Szegedy et al., 2014
                                                        Goodfellow et al., 2015

[Figure: adversarial perturbation of magnitude ε ≈ 2⁄255]

§    How?
    > Training ⟹ “tweak model parameters such that f(image) = panda”
    > Attacking ⟹ “tweak input pixels such that f(image) = gibbon”
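
The "tweak input pixels" idea can be sketched as a single signed-gradient step (the fast gradient sign method of Goodfellow et al.). The sketch below assumes a toy linear classifier with hand-picked weights and a flattened 8x8 input; the model, the weights, and the helper names are hypothetical stand-ins for a real neural network.

```python
import numpy as np

EPS = 2 / 255                       # maximum change per pixel

# Toy, already-trained linear classifier (hypothetical weights).
signs = np.where(np.arange(64) % 2 == 0, 1.0, -1.0)
w, b = signs.copy(), 0.0

def predict(x):
    """Return 1 if the model's score is positive, else 0."""
    return int(x @ w + b > 0)

def loss_gradient(x, y):
    """Gradient of the logistic loss with respect to the input pixels."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return (p - y) * w

# A flattened 8x8 "image" with pixels in [0, 1], classified as class 1.
x = 0.5 + 0.002 * signs
y = 1

# Attack: one signed-gradient step on the pixels; the model itself is untouched.
x_adv = np.clip(x + EPS * np.sign(loss_gradient(x, y)), 0.0, 1.0)
```

In this toy setting `predict(x)` is 1 while `predict(x_adv)` is 0, even though no pixel moved by more than `EPS`. Training uses the same gradient machinery, but updates `w` and `b` instead of `x`.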


(Meaningful) Defenses


Adversarial Examples for Perceptual Ad-Blockers


Attacking Page-Based Classifiers

§    Challenge 1: Input of model is a webpage screenshot
    > Attack must be implemented in HTML

§    Challenge 2: The adversary can’t control or predict the full input
    > E.g., publishers can’t modify or know contents of ad frames


Ad-Block Evasion

§    Goal: Make ads unrecognizable by ad-blocker

§    Adversary = Website publisher
    > Abilities: Inspect ad-blocker classifier(s) offline
                 Change page DOM, CSS, JavaScript...
                 Cannot modify content of ad frames

§    Adversary = Ad network (or advertisers)
    > Abilities: Inspect ad-blocker classifier(s) offline
                 Arbitrary changes to content of ad frames


Evasion 1: Universal Transparent Overlay

§ Web publisher perturbs every rendered pixel

                                      Use HTML tiling to minimize
                                       perturbation size (20 KB)

Ø 100% success rate on 20 webpages not used to create the overlay
Ø The attack is universal: the overlay is computed once and works
  for all (or most) websites
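
A universal, tiled overlay can be sketched as follows. This is a toy under strong assumptions: the detector is a hypothetical linear scorer (not the YOLO-v3 model from the talk), pages are 8x8 arrays, and the tile is 4x4. For a linear scorer the overlay's effect does not depend on the page at all, so universality is trivial here; against a real object detector, the tile would be optimized iteratively over many screenshots.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 8          # toy "screenshot" size
T = 4              # the overlay is a T x T tile repeated across the page
EPS = 0.04         # maximum change per pixel

# Hypothetical linear ad-detector: score > 0 means "ad detected".
base = np.where((np.arange(T)[:, None] + np.arange(T)[None, :]) % 2 == 0, 1.0, -1.0)
weights = np.tile(base, (H // T, W // T)) + 0.1 * rng.uniform(-1, 1, size=(H, W))

def score(page):
    return float(np.sum(weights * (page - 0.5)) + 1.0)

def random_page():
    return 0.5 + 0.01 * rng.uniform(-1, 1, size=(H, W))

def tile(delta):
    """Repeat the small tile until it covers the whole page."""
    return np.tile(delta, (H // T, W // T))

def tile_gradient(page):
    """Gradient of the score w.r.t. the tile: fold the input gradient over all tile copies."""
    grad_input = weights            # for this linear toy the gradient ignores `page`
    return grad_input.reshape(H // T, T, W // T, T).sum(axis=(0, 2))

# Compute the overlay once, from a handful of "training" pages.
train_pages = [random_page() for _ in range(5)]
grad = sum(tile_gradient(p) for p in train_pages)
delta = np.clip(-EPS * np.sign(grad), -EPS, EPS)   # one signed-gradient step

# Apply the same overlay to pages that were not used to create it.
test_pages = [random_page() for _ in range(20)]
evaded = [score(p + tile(delta)) <= 0 for p in test_pages]
```

Only the `T x T` tile has to be shipped to the browser, which is the reason tiling keeps the perturbation small in terms of transferred bytes.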


Evasion 2: Perturbed Ads

§   Ad network perturbs served ads
    >   Creating a single perturbation that works for every ad on every website is hard
    >   Target a specific domain

                      Alternative attack
        Ø Publisher perturbs background below ad frame
        Ø 100% success in evading ads

              Original                                With adversarial ad

Ø 100% success rate for ads served on BBC.com
Ø No CSS: Ad image is directly perturbed on the server
Ø The perturbation is universal: It works for all ads (on this domain)

Ad-Block Detection

§    Goal: Trigger ad-blocker on “honeypot” content
    > Detect ad-blocking in client-side JavaScript or on server
    > Applicability of these attacks depends on ad-blocker type

§    Adversary = Website publisher
    > Abilities: Inspect ad-blocker classifier(s) offline
                 Change page DOM, CSS
                 Use client-side JavaScript to detect DOM changes


Detection: Perturb fixed page layout

§    Publisher adds honeypot in page-region with fixed layout
    > E.g., page header

[Figure: original page versus page with honeypot header]

New Threats: Privilege Abuse
§       Ad-block evasion & detection is a well-known arms race.
        But there’s more!

    [Figure: Jerry uploads malicious content … so that Tom’s post gets blocked]

    What happened?
    Ø    Object detector model generates box predictions from full page inputs
    Ø    Content from one user can affect predictions anywhere on page
    Ø    Model’s segmentation is not aligned with web-security boundaries

Outline

§ Perceptual ad-blockers: how they work

§ Attacking perceptual ad-blockers

§ Why defending is hard


A Challenging Threat Model
[Figure: the ad-blocker pipeline for a page at https://www.example.com, annotated with the attack that targets each stage]

Data Collection and Training: Data Poisoning
Page Segmentation: DOM Obfuscation, Resource Exhaustion
Classification: Adversarial Examples
Action: Privilege Abuse

    Ø The adversary has white-box access to the ad-blocker
    Ø The adversary can exploit False Negatives and False Positives in the classification pipeline
    Ø The adversary prepares attacks offline ⇔ the ad-blocker must defend against
      attacks in real-time in the user’s browser
    Ø The adversary can take part in crowd-sourced data collection

Defense Strategy 1: Obfuscate the Model

§    Attacks are easy if the adversary has access to the ML model
    > Hide model from adversary?

§    Obfuscate the ad-blocker?
    > It isn’t hard to create adversarial examples for black-box classifiers

[Figure: ad-blocker pipeline: (1) Page Segmentation, (2) Classification, (3) Action]

§    Randomize the ad-blocker?
    > Deploy different models
        -     Adversarial examples that work against multiple models
    > Randomly change page before classifying
        -     Adversarial examples robust to random transformations
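
Why randomizing over models offers limited protection can be sketched with a toy ensemble attack: the adversary averages gradients over models sampled offline and then faces freshly sampled models at deployment. The linear models, the weight distribution in `sample_model`, and all names below are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
EPS = 2 / 255
signs = np.where(np.arange(64) % 2 == 0, 1.0, -1.0)

def sample_model():
    """Hypothetical randomized defense: each deployment draws slightly different weights."""
    return signs + 0.3 * rng.uniform(-1, 1, size=64)

def predict(w, x):
    return int((x - 0.5) @ w > 0)    # 1 = "ad", 0 = "not ad"

x = 0.5 + 0.002 * signs              # an input that every sampled model labels as "ad"

# The adversary averages the input gradient over models sampled offline.
# For a linear score, the gradient with respect to x is the weight vector itself.
surrogates = [sample_model() for _ in range(10)]
mean_grad = np.mean(surrogates, axis=0)
x_adv = np.clip(x - EPS * np.sign(mean_grad), 0.0, 1.0)

# The defender deploys fresh models that the adversary never saw.
deployed = [sample_model() for _ in range(50)]
before = [predict(w, x) for w in deployed]
after = [predict(w, x_adv) for w in deployed]
```

The same averaging idea applies to random input transformations: the adversary optimizes the expected loss over the transformations the defender might apply.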


Defense Strategy 2: Anticipate and Adapt

§    If ad-blocker is attacked (evasion or detection),
     collect adversarial samples and re-train the model
    > Or train on adversarial examples proactively

§    This is called Adversarial Training
    > New arms-race: the adversary finds new attacks and
      the ad-blocker re-trains
    > Mounting a new attack is much easier than
      updating the model
    > On-going research: so far the adversary always wins!

[Figure caption: 1600 citations, 800 in 2018!]
[Figure caption: Broke 7 defenses, a few days after they were accepted for publication]
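
The adversarial training loop itself is short. The sketch below assumes a two-sample toy dataset and a logistic-regression model, with a single signed-gradient step as the inner attack; the data, hyperparameters, and function names are hypothetical, and robustness on this toy says nothing about robustness of a real ad-detector.

```python
import numpy as np

EPS, LR, STEPS = 0.3, 0.5, 100

# Toy data: feature 0 is strongly predictive, feature 1 only weakly (|value| < EPS).
X = np.array([[1.0, 0.1], [-1.0, -0.1]])
y = np.array([1.0, 0.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(X, y, w, b):
    """L-infinity perturbation of size EPS via one signed-gradient step."""
    p = sigmoid(X @ w + b)
    grad_x = (p - y)[:, None] * w[None, :]     # d loss / d input
    return X + EPS * np.sign(grad_x)

w, b = np.zeros(2), 0.0
for _ in range(STEPS):
    X_adv = fgsm(X, y, w, b)                   # attack the current model
    p = sigmoid(X_adv @ w + b)
    w -= LR * ((p - y) @ X_adv) / len(y)       # train on the adversarial inputs
    b -= LR * np.mean(p - y)

# Evaluate against a fresh attack on the final model.
X_adv = fgsm(X, y, w, b)
robust_accuracy = float(np.mean((sigmoid(X_adv @ w + b) > 0.5) == (y == 1)))
```

Each training step requires running the attack first, which hints at why re-training is more expensive for the defender than mounting a new attack is for the adversary.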


Defense Strategy 3: Simplify the Problem

§   Storey et al: recognize ad-disclosures
    > Simpler computer vision problem than
      full-page ad-detection
    > Light-weight and mature techniques
      (OCR, perceptual hashing, SIFT)

§   Adversarial Examples still exist
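
That simpler techniques remain attackable can be sketched with one basic perceptual hash. The code assumes the "average hash" variant (one bit per image block, set when the block is brighter than the image mean) and a toy gradient image as the template; it is not the matching procedure of Ad Highlighter, and the perturbation rule in `perturb` is a hypothetical illustration.

```python
import numpy as np

EPS = 0.08      # maximum change per pixel (pixels are in [0, 1])
B = 4           # each hash bit summarizes a B x B block of pixels

def average_hash(img):
    """64-bit average hash: one bit per block, set if the block is brighter than average."""
    blocks = img.reshape(8, B, 8, B).mean(axis=(1, 3))
    return blocks > blocks.mean()

def perturb(img):
    """Push every block whose mean lies within EPS of the threshold across it."""
    blocks = img.reshape(8, B, 8, B).mean(axis=(1, 3))
    margin = blocks - blocks.mean()
    shift = np.where(np.abs(margin) < EPS, -EPS * np.sign(margin), 0.0)
    return np.clip(img + np.kron(shift, np.ones((B, B))), 0.0, 1.0)

# Toy 32x32 "ad-disclosure" template: a horizontal brightness gradient.
template = np.tile(np.arange(32) / 31.0, (32, 1))
adversarial = perturb(template)

# Number of hash bits that differ between the template and the perturbed image.
hamming = int(np.sum(average_hash(template) != average_hash(adversarial)))
```

Here a quarter of the 64 hash bits flip although no pixel changes by more than `EPS`, so a matcher that tolerates only a few differing bits would no longer recognize the template.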


Take Away

§    Emulating human-like detection of ads could be the end-game
     for ad-blockers
§    But very hard with current computer vision techniques
    > Resisting adversarial examples is one of the most challenging open
      problems in ML security

§    Perceptual ad-blockers have to survive a strong threat model
    > Evasion & detection with adversarial examples
    > Privilege abuse attacks from arbitrary content providers
    > Similar threats for other ML-based ad-blockers (e.g., AdGraph?)

Paper: http://arxiv.org/abs/1811.03194
Code: https://github.com/ftramer/ad-versarial
    Ø Train a page-based ad-blocker
    Ø Download pre-trained models
    Ø Attack demos
