Adversarial examples for Deep Learning Cyber Security Analytics
Alesia Chernikova and Alina Oprea, Northeastern University
arXiv:1909.10480v2 [cs.CR] 31 Jan 2020

Abstract—We consider evasion attacks (adversarial examples) against Deep Learning models designed for cyber security applications. Adversarial examples are small modifications of legitimate data points, resulting in mis-classification at testing time. We propose a general framework for crafting evasion attacks that takes into consideration the dependencies between intermediate features in the model input vector, as well as physical-world constraints imposed by the applications. We apply our methods to two security applications, a malicious connection classifier and a malicious domain classifier, to generate feasible adversarial examples in these domains. We show that with minimal effort (e.g., generating 12 network connections), an attacker can change the prediction of a model from Malicious to Benign. We extensively evaluate the success of our attacks, and how they depend on several optimization objectives and imbalance ratios in the training data.

1. Introduction

Deep learning has reached super-human performance in machine learning (ML) tasks for classification in diverse domains, including image classification, speech recognition, and natural language processing. Still, deep neural networks (DNNs) are not robust in the face of adversarial attacks, and their vulnerability has been demonstrated extensively in many applications, with the majority of work in adversarial ML being performed on image classification tasks (e.g., [65], [10], [29], [42], [54], [14], [48], [5]).

ML started to be used more extensively in cyber security applications in academia and industry, with the emergence of a new field called security analytics. Among the most popular applications of ML in cyber security we highlight malware classification [6], [9], [56], malicious domain detection [47], [12], [3], [7], [51], and botnet detection [32], [66]. In most of these applications, the raw security datasets (network traffic or host logs) are not used directly as input to the DNN; instead, an intermediate feature extraction layer is defined by domain experts to generate inputs for neural networks (or other ML models). There are efforts to automate the feature engineering aspect (e.g., [38]), but it is not yet common practice. One of the challenges of adapting ML to work in these domains is the large class imbalance during training [7]. Therefore, adversarial attacks designed for continuous domains (for instance, in image classification) need to be adapted to take into account the specifics of cyber security applications.

Initial efforts to design adversarial attacks at testing time (called evasion attacks) for discrete domains are underway in the research community. Examples include PDF malware detection [62], [67] and malware classification [31], [63], but these applications use binary features. Recently, Kulynych et al. [41] introduced a graphical framework for general evasion attacks in discrete domains that constructs a graph of all possible transformations of an input and selects a set of minimum cost to generate an adversarial example. Previous work, however, cannot yet handle evasion attacks in security applications that respect complex feature dependencies, as well as physical-world constraints.

In this paper we introduce a novel framework for crafting adversarial attacks in the cyber security domain that respects the mathematical dependencies given by common operations applied in feature space and at the same time enforces the physical-world constraints of specific applications. At the core of our framework is an iterative optimization method that determines the feature of maximum gradient of the attacker's objective at each iteration, identifies the family of features dependent on that feature, and consistently modifies all the features in the family, while preserving an upper bound on the maximum distance and respecting the physical-world application constraints.

Our general framework needs a minimal amount of adaptation for new applications. To demonstrate this, we apply our framework to two distinct applications. The first is a malicious network traffic classifier for botnet detection (using a public dataset [27]), in which an attacker can insert network connections on ports of his choice that respect the physical network constraints (e.g., TCP and UDP packet sizes) and a number of mathematical dependencies. The second application is malicious domain classification using features extracted from web proxy logs (collected from a large enterprise), which involves a number of statistical and mathematical dependencies in feature space. We demonstrate that the attacks are successful in both applications, with a minimal amount of perturbation. For instance, by inserting 12 network connections an attacker can change the classification prediction from Malicious to Benign in the first application. We perform a detailed evaluation to test: (1) if our attacks perform better than several baselines; (2) if the selection of the optimization objective impacts the attack success rate; (3) how the imbalance ratio between the Malicious and Benign classes in training changes the success of the attack; (4) if the features modified by the attack are the features with highest importance. We also test several approaches for performing the attack under a weaker threat model, through transferability from a substitute model to the original one, or by adapting existing black-box attacks. Finally, we test the resilience of adversarial training as a defensive mechanism in this setting.

To summarize, our contributions are:
1) We introduce a general evasion attack framework for cyber security that respects mathematical feature dependencies and physical-world constraints.
2) We apply our framework with minimal adaptation to two distinct applications using different datasets and feature representations: a malicious network connection classifier and a malicious domain detector, to generate feasible adversarial examples in these domains.
3) We extensively evaluate our proposed framework for these applications and quantify the amount of effort required by the attacker to bypass the classifiers, for different optimization objectives and training data imbalance ratios.
4) We evaluate the transferability of the proposed evasion attacks between different ML models and architectures and test the effectiveness of performing black-box attacks.
5) We measure the resilience of adversarially-trained models against our attacks.

Organization. We provide background material in Section 2. We discuss the challenges of designing evasion attacks in cyber security and introduce our general framework in Section 3. We instantiate our framework for the two applications of interest in Section 4. We extensively evaluate our framework in Sections 5 and 6, respectively. Finally, we discuss related work in Section 7 and conclude in Section 8.

2. Background

2.1. Deep Neural Networks for Classification

A feed-forward neural network (FFNN) for binary classification is a function y = F(x) from input x ∈ R^d (of dimension d) to output y ∈ {0, 1}. The parameter vector of the function is learned during the training phase using back propagation over the network layers. Each layer includes a matrix multiplication and a non-linear activation (e.g., ReLU). The last layer's activation is a sigmoid σ for binary classification: y = F(x) = σ(Z(x)), where Z(x) are the logits, i.e., the output of the penultimate layer. We denote by C(x) the predicted class for x. For multi-class classification, the last layer uses a softmax activation function.

2.2. Threat Model

Adversarial attacks against ML algorithms can be developed in the training or testing phase. In this work, we consider testing-time attacks, called evasion attacks. The DNN model is trained correctly and the attacker's goal is to create adversarial examples at testing time. In security settings, the attacker typically starts with Malicious points that he aims to minimally modify into adversarial examples classified as Benign.

We initially consider for our optimization framework a white-box attack model, in which the attacker has full knowledge of the ML system. White-box attacks have been considered extensively in previous work, e.g., [29], [10], [14], [48], to evaluate the robustness of existing ML classification algorithms. We also consider a more realistic attack model, in which the attacker has information about the feature representation of the underlying classifier, but not exact details on the ML algorithm and training data.

In the considered applications, training data comes from security logs collected at the border of an enterprise or campus network. We assume that the attacker compromises at least one machine on the internal network, from where the attack is launched. The goal of the attacker is to modify its network connections to evade the classifier's Malicious prediction in a stealthy manner (i.e., with minimum perturbation). We assume that the attacker does not have access to the security monitor that collects the logs. That would result in a much more powerful attack, which can be prevented with existing techniques (e.g., [13]).

2.3. Evasion Attacks against Deep Neural Networks

We describe several evasion attacks against DNNs: projected gradient descent-based attacks and the penalty-based attack of Carlini and Wagner.

Projected gradient attacks. This is a class of attacks based on gradient descent for objective minimization that project the adversarial points to the feasible domain at each iteration. For instance, Biggio et al. [10] use an objective that maximizes the confidence of adversarial examples, within a ball of fixed radius in L1 norm. Madry et al. [48] use the loss function directly as the optimization objective and use the L2 and L∞ distances for projection.

C&W attack. Carlini and Wagner [14] solve the following optimization problem to create adversarial examples against CNNs used for multi-class prediction:

δ = arg min ||δ||_2 + c · h(x + δ),
h(x + δ) = max(0, max{Z_k(x + δ) : k ≠ t} − Z_t(x + δ)),

where Z(·) are the logits of the DNN. This is called the penalty method, and the optimization objective has two terms: the norm of the perturbation δ, and a function h(x + δ) that is minimized when the adversarial example x + δ is classified as the target class t. The attack works for the L0, L2, and L∞ norms.

3. Methodology

In this section, we start by describing the classification setting in cyber security analytics. Then we devote the majority of the section to describing evasion attacks for cyber security, the challenges of designing them, and our new attack framework that takes into consideration the specific constraints of security applications.

3.1. ML classification in cyber security

In standard computer vision tasks such as image classification, the raw data (image pixels) is used directly as input into the neural network models. In contrast, in cyber security, domain expertise is still required to generate intermediate features from the raw data (e.g., network traffic or endpoint data) (see Figure 1).

ML is commonly used in cyber security for classification of Malicious and Benign activity (e.g., [47], [12], [51]). A raw dataset R is initially collected (for example,
Figure 1: Neural network training for image classification (left) and for cyber security analytics (right). pcap files or Netflow logs), and feature extraction is Several previous work address evasion attacks in dis- performed by applying different operators, such as Max, crete domains. The evasion attack for malware detection Min, Avg, and Total. The training dataset Dtr has N train- by Grosse et al. [30], which directly leverages JSMA [54], ing examples: Dtr = {(x(1) , L(1) ), . . . , (x(N ) , L(N ) )}, modifies binary features corresponding to system calls. each example x(i) being a d-dimensional feature vector: Kolosnjaji et al. [39] use the attack of Biggio et al. [10] (i) (i) x(i) = (x1 , . . . , xd ). Features of the training dataset are to append selected bytes at the end of the malware file. most of the time obtained by application of operator Opj Suciu et al. [63] also append bytes in selected regions of (i) on the raw data xj = Opk (R). The set of all supported malicious files. Kulynych et al. [41] introduce a graphical operators or functions applied to the raw data is denoted framework in which an adversary constructs all feasible by O. A data point x = (x1 , . . . , xd ) in feature space is transformation of an input, and then uses graph search feasible if there exists some raw data r such as for all j , to determine the path of minimum cost to generate an there exists a operator Opk ∈ O with xj = Opk (r). The adversarial example. set of all feasible points for raw data R and operators Neither of these approaches are applicable to our gen- O are called Feasible Set(R, O). An example of feasible eral setting. First, in the considered applications features and unfeasible points is illustrated in Table 1. have numerical values and the evasion attacks developed for malware binary features [30], [39], [63] are not ap- Feature Feasible Infeasible plicable. Second, none of these attacks guarantees the Frac empty 0.2 0.5 feasibility of the resulting adversarial vector in terms of Frac html 0.13 0.13 mathematical relationships between features. We believe Frac image 0.33 0.33 Frac other 0.34 0.4 that crafting adversarial examples that are feasible, and respect all the application constraints and dependencies to TABLE 1: Example of feasible and infeasible features. be a significant challenge. Once application constraints are The features denote the fraction of URLs under a domain specified, the resulting optimization problem for creating that have certain content type (e.g., empty, html, image, adversarial examples includes a number of non-linear and other). The sum of all the features is 1 in the feasible constraints and cannot be solved directly using out-of- example, but exceeds 1 in the unfeasible one. the-box optimization methods. As in standard supervised learning, the training exam- 3.3. Overview of our approach ples are labeled with a class L(i) , which is either Malicious or Benign. Malicious examples are obtained by different To address these issues, we introduce a framework methods, including using blacklists, honeypots, or running for evasion attacks that preserves a range of feature de- malware in a sandbox. A supervised ML model (classifier) pendencies and guarantees that the produced adversarial f is selected by the learning algorithm from a space of examples are within the feasible region of the domain. possible hypothesis H to minimize a certain loss function Our framework supports two main types of constraints: on the training set. 
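To make the feasibility notion concrete, the sketch below derives a few features with the Max/Min/Total/Avg operators and checks the kind of constraints illustrated in Table 1. It is an illustrative Python example under assumed raw-event fields (content_type, sent_bytes), not code from the paper.

    # Sketch: feature extraction with aggregation operators and a feasibility check.
    def extract_features(events):
        sent = [e["sent_bytes"] for e in events]
        n = len(events)
        features = {
            "total_sent_bytes": sum(sent),      # Total operator
            "max_sent_bytes": max(sent),        # Max operator
            "min_sent_bytes": min(sent),        # Min operator
            "avg_sent_bytes": sum(sent) / n,    # Avg operator
        }
        # Ratio family: fractions of events per content type must sum to 1.
        for ctype in ("empty", "html", "image", "other"):
            features["frac_" + ctype] = sum(e["content_type"] == ctype for e in events) / n
        return features

    def is_feasible(features, tol=1e-6):
        # A point is infeasible if no raw log could have produced it, e.g.,
        # Min > Max within a family, or ratio features not summing to 1.
        ratios = [v for k, v in features.items() if k.startswith("frac_")]
        return (abs(sum(ratios) - 1.0) < tol
                and features["min_sent_bytes"] <= features["avg_sent_bytes"] <= features["max_sent_bytes"])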
Mathematical feature dependencies: These are dependen- cies created in the feature extraction layer. For instance, 3.2. Limitations and challenges by applying several mathematical operators (Max, Min, Total) over a set of raw log data, we introduce feature Existing evasion attacks are mostly designed and dependencies. See the example in Figure 3 for Bro (or tested for image classification, where adversarial examples Zeek) connection log events and several dependent fea- have pixel values in a fixed range (e.g., [0,1]) and can be tures constructed using these operators. For instance, a modified independently in continuous domains [14], [48], Bro connection includes the number of packets sent and [5]. However, most security datasets are discrete, resulting received, and we define the Min, Max, and Total number in feature dependencies and physical-world constraints to of packets sent and received by the same source IP on ensure certain application functionality. a particular port (within a fixed time window). We use
the terminology family of features to denote a subset of function UPDATE DEP (line 32). We need to define the features that are inter-connected and need to be updated function UPDATE DEP for each application, but we use a simultaneously. For the Bro example, the features defined set of building blocks that are reusable. Once all features for each port (e.g., 80, 53, 22) are dependent as they are in the family have been updated, there is a possibility generated from all the connections on that port. that the update data point exceeds the allowed distance Physical-world constraints: These are constraints imposed threshold from the original point. If that is the case, the by the real-world application. For instance, in the case of algorithm backtracks and performs a binary search for the network traffic, a TCP packet has maximum size 1500 amount of perturbation added to the representative feature bytes. (until it finds a value for which the modified data point is Our starting point for the attack framework are inside the allowed region). gradient-based optimization algorithms, including pro- 2. If the feature of maximum gradient does not belong to jected [10], [48] and penalty-based [14]. Of course, we any feature family, then it can be updated independently cannot apply these attacks directly since they will not from other features. The feature is updated using the preserve the feature dependencies. To overcome this, we standard gradient update rule (line 13). This is followed use the values of the objective gradient at each iteration by a projection Π2 within the feasible ball in L2 norm. to select features of maximum gradient values. We create We currently support two optimization objectives: feature-update algorithms for each family of dependencies Objective for Projected attack. We set the objective that use a combination of gradient-based method and G(x) = Z1 (x), where Z1 is the logit for the Malicious mathematical constraints to always maintain a feasible class, and Z0 = 1 − Z1 for the Benign class: point that satisfies the constraints. We also use various δ = arg min Z1 (x + δ), projection operators to project the updated adversarial s.t. ||δ||2 ≤ dmax , examples to feasible regions of the feature space. x + δ ∈ Feasible Set(R, O) 3.4. Proposed Evasion Attack Framework We need to ensure that the adversarial example is in the feasible set to respect the imposed constraints. We introduce here our general evasion attack frame- Objective for Penalty attack. The penalty objective for work for creating adversarial examples at testing time for binary classification is equivalent to: binary classifiers. In the context of security applications, δ = arg min ||δ||2 + c · max(0, Z1 (x + δ)), the main goal of the attacker is to ensure that a Malicious x + δ ∈ Feasible Set(R, O) data point is classified as Benign after applying a min- imum amount of perturbation to it. We consider binary Our general evasion attack framework can be used for classifiers designed using FFNN architectures. For mea- different classifiers, with different features and constraints. suring the amount of perturbation added by the original The components that need to be defined for each applica- example, we use the L2 norm. tion are: (1) the optimization objective G for computing Algorithm 1 and Figure 2 describes the general frame- adversarial examples; (2) the families of dependent fea- work. 
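For reference, the two optimization objectives supported by the framework can be written compactly as follows. This is an illustrative Python sketch assuming a helper logit_malicious(x) that returns the Malicious logit Z1(x); it is not the authors' code.

    import numpy as np

    def projected_objective(x_adv, logit_malicious):
        # Projected attack: minimize Z1(x + delta), subject to ||delta||_2 <= d_max
        # and x + delta in Feasible_Set(R, O) (enforced by projection and family updates).
        return logit_malicious(x_adv)

    def penalty_objective(x, x_adv, logit_malicious, c=0.5):
        # Penalty attack: minimize ||delta||_2 + c * max(0, Z1(x + delta)),
        # with feasibility enforced by the family update functions.
        delta = x_adv - x
        return np.linalg.norm(delta) + c * max(0.0, float(logit_malicious(x_adv)))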
The input consists of: an input sample x with label y tures and family representatives; (3) the UPDATE DEP (typically Malicious in security applications); a target label function that performs feature updates per family; (4) the t (typically Benign); the model prediction function C ; the projection operation to respect the constraints. optimization objective G; maximum allowed perturbation dmax ; the subset of features FS that can be modified; the 4. Evasion Attacks for Concrete Security Ap- features that have dependencies FD ⊂ FS ; the maximum plications number of iterations M and a learning rate α for gradient descent. The set of features with dependencies are split We describe in this section our framework instantiated into families of features. A family is defined as a subset of to two cyber security applications, a malicious network FD such that features within the family need to be updated connection classifier, and a malicious domain classifier. simultaneously, whereas features outside the family can be We emphasize that our framework is applicable to other updated independently. security applications, such as malware classification, web- The algorithm proceeds iteratively. The goal is to site fingerprinting, and malicious communication detec- update the data point in the direction of the gradient (to tion. For each of these, the application-specific constraints minimize the optimization objective), while preserving need to be encoded and respected when feature updates the family dependencies, as well as the physical-world are performed. constraints. In each iteration, the gradients of all modifi- able features are computed, and the feature of maximum 4.1. Malicious Connection Classifier gradient is selected. The update of the data point x in the direction of the gradient is performed as follows: Network traffic includes important information about 1. If the feature of maximum gradient belongs to communication patterns between source and destination a family with other dependent features, function IP addresses. Classification methods have been applied UPDATE FAMILY is called (line 10). Inside the function, to labeled network connections to determine malicious the representative feature for the family is computed (this infections, such as those generated by botnets [12], [7], needs to be defined for each application). The representa- [35], [51]. Network data comes in a variety of formats, tive feature is updated first, according to its gradient value, but the most common include net flows, Bro logs, and followed by updates to other dependent features using packet captures.
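As a summary of the iterative procedure in Algorithm 1, the sketch below shows the main loop in Python. The helpers (predict, grad_G, update_family) and the gradient-magnitude selection are simplifying assumptions for illustration, not the authors' implementation; the binary-search backtracking on the perturbation lives inside update_family.

    import numpy as np

    def evasion_attack(x, target, predict, grad_G, families, update_family,
                       modifiable, d_max, alpha=0.1, max_iter=1000):
        # Pick the modifiable feature with the largest objective gradient, update its
        # whole dependency family consistently, and stay within the L2 ball of radius d_max.
        x_adv = x.copy()
        free = set(modifiable)
        for _ in range(max_iter):
            if predict(x_adv) == target:
                return x_adv                        # adversarial example found
            g = grad_G(x_adv)                       # gradient of objective G at x_adv
            i = max(free, key=lambda j: abs(g[j]))  # feature of maximum gradient
            if i in families:                       # feature belongs to a dependency family
                x_adv = update_family(x_adv, g, families[i], x, d_max)
            else:                                   # independent feature: plain gradient step
                x_adv[i] -= alpha * g[i]
                delta = x_adv - x
                norm = np.linalg.norm(delta)
                if norm > d_max:                    # project back into the L2 ball
                    x_adv = x + delta * (d_max / norm)
            free.discard(i)
            if not free:
                break
        return None                                 # attack failed within the budget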
Figure 2: Evasion Attack Framework Time Src IP Dst IP Prot. Port Sent Recv. Sent Recv. Duration bytes bytes packets packets 9:00:00 147.32.84.59 77.75.72.57 TCP 80 1065 5817 10 11 5.37 Raw Bro 9:00:03 147.32.84.59 81.27.192.20 UDP 53 48 48 1 1 0.0012 log data 9:00:05 147.32.84.59 87.240.134.159 TCP 80 950 340 7 5 25.25 9:00:12 147.32.84.59 77.75.77.9 TCP 80 1256 422 5 5 0.0048 Port 22 Port 80 Port 53 Port 443 Family of features Packet Bytes Duration Traffic for port 80 features features features statistics Operator Min Max Sum Min Max Sum Sent Packets 5 10 22 Representative Sent Bytes 950 1256 3271 feature Recv. Packets 5 11 21 Recv. Bytes 340 5817 6579 Figure 3: Example of Bro logs and feature family per port for malicious connection classifier. Problem definition: dataset and features. We leverage applications, including: HTTP (80), SSH (22), and DNS a public dataset of botnet traffic that was captured in at (53). We also add a category called OTHER for connec- the CTU University in the Czech Republic, called CTU- tions on other ports. We aggregate the communication on 13 dataset [27]. The dataset include Bro connection logs a port based on a fixed time window (the length of which with communications between internal IP addresses (on is a hyper-parameter). For each port, we compute traffic the campus network) and external ones. The dataset has statistics using operators such as Max, Min, and Total the advantage of providing ground truth, i.e., labels of separately for outgoing and incoming connections. See Malicious and Benign IP addresses. The goal of the clas- the example in Figure 3, in which features extracted for sifier is to distinguish Malicious and Benign IP addresses each port define a family of dependent features. These on the internal network. are statistical dependencies between features, which need The fields available in Bro connection logs are given in to be preserved upon performing the attack. We obtain a Figure 3. They include: the timestamp of the connection total of 756 aggregated traffic features on these 17 ports. start; the source IP address; the source port; the desti- nation IP address; the destination port; the number of Physical constraints. We assume that the attacker con- packets sent and received; the number of bytes sent and trols the victim IP on the internal network (e.g., it was received; and the connection duration (the time difference infected by a botnet). The attacker thus can determine between when the last packet and first packets are sent). what network traffic the victim IP will generate. As there A TCP connection has a well-defined network meaning are many legitimate applications that generate network (a connection established between two IP addresses using traffic, we assume that the attacker can only add network TCP), while for UDP Bro aggregates all packets sent connections (a safe assumption to preserve the functional- between source and destination IPs in a certain time ity of the legitimate applications). When adding network interval (e.g., 30 seconds) to form a connection. connections, the attacker has some leverage in choosing A standard method for creating network features is the external IP destination, the port on which it communi- aggregation by destination port to capture relevant traffic cates, the transport protocol (TCP or UDP), and how many statistics per port (e.g., [27], [50]). This is motivated by packets and bytes it sends to the external destination. 
The the fact that different network services and protocols run attacker’s goal is to have his connection feature vector on different ports, and we expect ports to have different classified as Benign. When adding network connections, traffic patterns. We select a list of 17 ports for popular the attacker needs to respect physical constraints imposed
Algorithm 1 Framework for Evasion Attack with Con- Algorithm 2 Malicious Connection Classifier Attack straints Require: x: data point in iteration m Require: x, y : the input sample and its label; p: port updated in iteration m t: target label; xTCP /xUDP : number of TCP / UDP connections C : prediction function; on p G: optimization objective; xtot bytes : number of sent bytes on p dmax : maximum allowed perturbation; xmin bytes : min number of sent bytes on port p FS : subset of features that can be modified xmax bytes : max number of sent bytes on port p FD : features in FS that have dependencies; xtot min max dur /xdur /xdur : total/min/max duration on p M : maximum number of iterations; ∇: gradient of objective with respect to x α: learning rate. c1 , c2 : TCP and UDP connections added Ensure: x∗ : adversarial example or ⊥ if not successful. 1: function INIT FAMILY (m, xm , ∇, j ) 1: Initialize m ← 0; x0 ← x // Add TCP connections if allowed 2: // Iterate until successful or stopping condition 2: if ∇TCP < 0 and IS ALLOWED(TCP, p) then 3: while C(xm )! = t and m < M do 3: xTCP ← xTCP + c1 4: ∇ ← [∇Gxi (xm )]i // Gradient vector // Add UDP connections if allowed 5: ∇S ← ∇FS // Gradients of features in FS 4: if ∇UDP < 0 and IS ALLOWED(UDP, p) then 6: imax ← argmax∇S // Feature of max gradient 5: xUDP ← xUDP + c2 7: // Check if feature has dependencies 8: if imax ∈ FD then 6: function UPDATE DEP(s, xm , ∇, Fimax ) 9: // Update dependent features 7: // Compute gradient difference in sent bytes 10: xm+1 ← UPDATE FAMILY(m, xm , ∇, imax ) 8: ∆b ← −∇tot bytes 11: else 9: // Project to respect physical constraints 12: Gradient update and projection 10: ∆b ← PROJECT(∆b , c1 · tcp min + c2 · xm+1 m udp min, c1 · tcp max + c2 · udp max) 13: imax ← ximax − α∇imax 14: x m+1 ← Π2 (xm+1 ) 11: xtot tot bytes ← xbytes + ∆b // Update Min and Max dependencies for sent 15: FS ← FS \ {imax } bytes 16: m←m+1 xmin min 12: bytes ← Min(xbytes , ∆b /nconn ) 17: if C(xm ) = t then xbytes ← Max(xmax max 18: return x∗ ← xm 13: bytes , ∆b /nconn ) // Update duration 19: return ⊥ 14: ∆d ← −∇d 20: function UPDATE FAMILY (m, xm , ∇, imax ) 15: ∆d ← PROJECT(∆d , c1 · tcp dmin · +c2 · 21: // Extract all dependent features on imax udp dmin·, c1 · tcp dmax · +c2 · udp dmax·) 22: Fimax ← Family Dep(imax ) xtot tot 16: dur ← xdur + ∆d 23: // Family representative feature xdur ← Min(xmin min 17: dur , ∆d /nconn ) 24: j ← Family Rep(Fimax ) xmax max 18: dur ← Max(xdur , ∆d /nconn ) 25: δ ← ∇j // Gradient of representative feature 26: // Initialization function 27: s ← INIT FAMILY(m, xm , ∇, j) 28: // Binary search for perturbation thus control the duration of the connection by sending 29: while δ 6= 0 do packets at certain time intervals (to avoid closing the 30: xm m j ← xj − αδ // Gradient update connection). We generate a range of valid protocol spe- 31: x ← UPDATE DEP(s, xm , ∇, Fimax ) m cific durations per packet range [tcp dmin, tcp dmax] and 32: if d(xm , x0 ) > dmax then [udp dmin, udp dmax] from the distribution of connec- 33: // Reduce perturbation tion duration in the training dataset. 34: δ ← δ/2 Attack algorithm. The attack algorithm follows the 35: else framework from Algorithm 1, with the specific functions 36: return xm defined in Algorithm 2. First, the feature of maximum gradient is determined and the corresponding port is by network communication, as outlined below: identified. The family of dependent features are all the features computed for that port. The attacker attempts to 1. 
Use TCP and UDP protocols only if they are allowed add a fixed number of connections on that port (which on certain ports. For example, on port 995 both TCP and is a hyper-parameter of our system). This is done in the UDP are allowed, but port 465 is specific to TCP. INIT FAMILY function (see Algorithm 2). The attacker 2. The TCP and UDP packet sizes are capped at 1500 can add either TCP, UDP or both types of connections, bytes. We thus create range intervals for these values: according to the gradient sign for these features and also [tcp min, tcp max] and [udp min, udp max]. respecting network-level constraints. The representative 3. The duration of the connection is defined as the interval feature for a port’s family is the number of packets that between when the last packet and the first packet is the attacker sends in a connection. This feature is updated sent between source and destination. If the connection by the gradient value, following a binary search for per- is idle for some time interval (e.g., 30 seconds), then it turbation δ , as specified in Algorithm UPDATE FAMILY. is closed by default in the Bro logs. The attacker can In the UPDATE DEP function an update to the ag-
Feature Description NIP Number of IPs contacting the domain added. We support other families of dependencies, among Num Conn Total number of connections which one that has includes both statistical and ratio Avg Conn Average number of connections by an IP dependencies (see the definition of the ratio features for Total Sent Bytes Total number of sent bytes bytes sent over received). We omit here the details. The Total Recv Bytes Total number of received bytes important observation here is that the constraints update Avg Ratio Bytes Average ratio bytes sent over received by an IP Min Ratio Bytes Min ratio of bytes sent over received by an IP functions are reusable across applications, and they can Max Ratio Bytes Max ratio of bytes sent over received by an IP be extended to support new mathematical dependencies. Frac empty Fraction of connections with empty content type Frac html Fraction of connections with html content type Algorithm 3 Malicious Connection Classifier Attack Frac img Fraction of connections with image content type Frac other Fraction of connections with other content type Require: x: data point in iteration m 1: function UPDATE DEP(s, xm , ∇, Fimax ) TABLE 2: Example families of features (Connections, 2: if s == Stat then Bytes, and Content) for malicious domains. 3: Update Stat(xm , ∇, Fimax ) 4: if s == Ratio then 5: Update Ratio(xm , ∇, Fimax ) gregated port features is performed. The difference in the 6: function Update Stat(xm , ∇, F ) total number of bytes sent by the attacker is determined 7: Parse F as: T (total number of events); N (number from the gradient, followed by a projection operation to be of entities); XT , Xmin , Xmax , Xavg (the total, min, within the feasible range for TCP and UDP packet sizes max, and average number of events per entity). (function PROJECT). The PROJECT function takes an 8: // XT is representative feature. input a value x and a range [a, b]. It projects x to the 9: XT0 ← Π(XT − α∇T ) interval [a, b] (if x ∈ [a, b], it returns x; if x > b, it PN returns b; otherwise it returns a). The duration is also set 10: XN +1 ← XT0 − i=1 Xi according to the gradient, again projecting based on lower 11: Xmin ← min(Xmin , XN +1 ) and upper bounds computed from the data distribution. 12: Xmax ← max(Xmax , XN +1 ) The port family includes features such as Min and Max 13: N ← N + 1; XT ← XT0 sent bytes and connection duration. These need to be 14: function Update Ratio(xm , ∇, F ) updated because we add new connections, which might 15: Parse FPas: N, Nr , X1 , . . . , XN such that: Xi = N include higher or lower values for sent bytes and duration. Ni /N and i=1 Xi = 1. We assume that the attacker communicates with an 16: // Xr is representative feature external IP under its control (for instance, the command- 17: Nr0 ← Π(Nr − α∇r ) and-control IP), and thus has full control on the malicious 18: N ← N + Nr0 − Nr traffic. For simplicity, we set the number of received 19: Xr ← Π(Nr0 /N ) packets and bytes to 0, assuming that the malicious IP 20: Xi ← (dXi · N e)/N, ∀i 6= r does not respond to these connections. 21: Nr ← Nr0 4.2. Malicious Domain Classifier 5. Experimental evaluation for malicious do- main classifier Problem definition: dataset and features. The second application is to classify domain names contacted by an One of the main challenges in evaluating our work enterprise hosts as Malicious or Benign. We use a dataset is the lack of standard benchmarks for security analytics. 
from [51], that was collected by a company that includes We first obtain access to a proprietary enterprise dataset 89 domain features extracted from HTTP proxy logs and from a security company, with features defined by domain domain labels. The features come from 7 families, and we experts. This dataset is based on real enterprise traffic, include an example of several families in Table 2. includes labels of malicious domains, and is highly imbal- Attack algorithm. In this application, we do not have anced. Secondly, we use a smaller public dataset (CTU- access to the raw HTTP traffic, only to features extracted 13) to make our results reproducible. CTU-13 includes from it. The majority of constraints are mathematical labeled Bro (Zeek) log connections for different botnet constraints in the feature space. The attack algorithm scenarios merged with legitimate campus traffic. follows the framework from Algorithm 1, with the specific We first perform our evaluation on the enterprise functions defined in Algorithm 3. The families of features dataset, starting with a description of the dataset in Sec- have various dependencies, as illustrated in the Connection tion 5.1, ML model selection in Section 5.2, and evasion and Content families. For Connection we have statistical attack results in Section 5.3. We show initial results on constraints (computing min, max, average values over adversarial training in Section 5.4. a number of events), while for Content we have ratio constraints (ensuring that the sum of all ratio values equals 5.1. Enterprise dataset to 1). We assume that we add events to the logs (and never delete or modify existing events). For instance, we can The data for training and testing the models was insert more connections, as in the malicious connection extracted from security logs collected by web proxies at classifier. Function Update Stat shows how the statistical the border of a large enterprise network with over 100,000 features are modified, while function Update Ratio shows hosts. The number of monitored external domains in the how the ratio features are modified if a new event is training set is 227,033, among which 1730 are classified as
Malicious and 225,303 are Benign. For training, we sam- pled a subset of training data to include 1230 Malicious domains, and different number of Benign domains to get several imbalance ratios between the two classes (1, 5, 15, 25, and 50). We used the remaining 500 Malicious domains and sampled 500 Benign domains for testing the evasion attack. Overall, the dataset includes 89 features from 7 categories. We assume that the attacker controls the malicious domain and all the communication from the victim ma- chines to that domain, so it can change the commu- (a) Model comparison (balanced data). nication patterns to the malicious domain. Among the features included in the dataset, we determined a set of 31 features that can be modified by an attacker (see Table 15 in Appendix for their description). These include communication-related features (e.g., number of connec- tions, number of bytes sent and received, etc.), as well as some independent features (e.g., number of levels in the domain or domain registration age). Other features in the dataset (for examples, those using URL parameters or values) are more difficult to change, and we consider them immutable during the evasion attack. (b) Imbalance results. 5.2. Model Selection Figure 4: Training results for malicious domain classifier. Hyper-parameter selection. We first evaluate three stan- of random forest, but it might be possible to improve dard classifiers with different hyper-parameters (logis- these results with additional effort (note that for higher tic regression, random forest, and FFNN). The hyper- imbalance ratio the performance of FFNN improves, as parameters for logistic regression and random forests are shown in Figure 4b). For the remainder of the section, we in Tables 13 and 14 from the Appendix. For logistic focus our discussion on the robustness of FFNN models. regression, the maximum AUC score of 87% is achieved Comparison of class imbalance for FFNN. Since the with L1 regularization with inverse regularization 2.89. issue of class imbalance is a known challenge in cyber For random forest, the maximum AUC of 91% is ob- security [7], we analyze the model accuracy as a function tained with Gini Index criterion, maximum tree depth 13, of imbalance ratio, showing the ROC curves in Figure 4b. minimum number of samples in leaves 3, and minimum Interestingly, the performance of the model increases to samples for split 8. 93% AUC for imbalance ratio up to 25, after which it The architectures used for FFNN are illustrated in starts to decrease (with AUC of 83% at a ratio of 50). Table 3. The best performance was achieved with 2 hidden Our intuition is that the FFNN model achieves better layers with 80 neurons in the first layer, and 50 neurons performance when more training data is available (up to in the second layer. ReLU activation function is used after a ratio of 25). But once the Benign class dominates the all hidden layers except for the last layer, which uses Malicious one (at ratio of 50), the model performance sigmoid (standard for binary classification). We used the starts to degrade. Adam optimizer and SGD with different learning rates. The best results were obtained with Adam and learning 5.3. Robustness to evasion attacks rate of 0.0003. We ran training for 75 epochs with mini- batch size of 32. As a result, we obtained the model with After we train our models, we use a testing set of AUC score 89% in cross-validation accuracy. 500 Malicious and 500 Benign data points to evaluate the evasion attack success rate. 
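For concreteness, the selected FFNN from the model-selection step (two hidden layers with 80 and 50 neurons, ReLU activations, sigmoid output, Adam with learning rate 0.0003, 75 epochs, mini-batches of 32) could be defined as below. Keras is an assumed framework choice here; the paper does not state which library was used.

    import tensorflow as tf

    def build_domain_classifier(input_dim=89):
        # FFNN matching the best architecture reported in Section 5.2.
        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(input_dim,)),
            tf.keras.layers.Dense(80, activation="relu"),
            tf.keras.layers.Dense(50, activation="relu"),
            tf.keras.layers.Dense(1, activation="sigmoid"),  # binary Malicious/Benign output
        ])
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0003),
                      loss="binary_crossentropy",
                      metrics=[tf.keras.metrics.AUC(name="auc")])
        return model

    # model = build_domain_classifier()
    # model.fit(X_train, y_train, epochs=75, batch_size=32, validation_split=0.1)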
We vary the maximum allowed Hyperparameter Value Architecture 1 layer [80], [64], [40] perturbation expressed as an L2 norm and evaluate the Architecture 2 layers [80, 60], [80, 50], success of the attack. We evaluate the two optimization [80, 40], [64, 32], [48, 32] objectives for Projected and Penalty attacks and compare Architecture 3 layers [80, 60, 40] with several baselines. We also run directly the C&W Optimizer Adam, SGD Learning Rate [0.0001, 0.01] attack and show that it results in infeasible adversarial examples (as expected). We evaluate the success rate of TABLE 3: DNN Architectures, epochs = 75 the attacks for different imbalance ratios. We also perform some analysis of the features that are modified by the Model comparison. After performing model selection attack, and if they correlate with feature importance. We for each type of model, we compare the three best re- show an adversarial example generated by our method sulting models. Figure 4a shows the ROC curves and and discuss how optimization-based attack performs under AUC scores for a 1:1 imbalance ratio (with the same weaker threat models. number of Malicious and Benign points used in training). Existing Attack. We run the existing C&W attack [14] on The performance of FFNN is slightly worse than that our data in order to measure if the adversarial examples
(a) Comparison to two baselines. (b) ROC curves under attack. (c) Imbalance sensitivity. Figure 5: Projected attack results. are feasible. While the performance of the attack is high experiment, we select 62 test examples which all models and reaches 98% at distance 20 (for the 1:1 balanced case), (trained for different imbalance ratios) classified correctly the resulting adversarial examples are outside the feasibil- before the evasion attack. The results are illustrated in ity region. An example is included in Table 4, showing Figure 5c. At L2 distance 20, the evasion attack achieves that the average number of connections is not equal to 100% success rate for all ratios except 1. Additionally, the total number of connections divided by the number of we observe that with higher imbalance, it is easier for the IPs. Additionally, the average ratio of received bytes over attacker to find adversarial examples (at fixed distance). sent bytes is not equal to maximum and minimum values One reason is that models that have lower performance of ratio (as it should be when the number of IPs is 1). (as the one trained with 1:50 imbalance ratio) are easier to attack. Second, we believe that as the imbalance gets Feature Input Adversarial Correct Value higher the model becomes more biased towards the major- Example NIP 1 1 1 ity class (Benign), which is the target class of the attacker, N Conn 15 233.56 233.56 making it easier to cross the decision boundary between Avg Conns 15 59.94 233.56 classes. Avg Ratio Bytes 8.27 204.01 204.01 Max Ratio Bytes 8.27 240.02 204.01 Penalty attack results. We now discuss the results Min Ratio Bytes 8.27 119.12 204.01 achieved by applying our attack with the Penalty objective on the testing examples. Similar to the Projected attack, TABLE 4: Adversarial example generated by C&W. The we compare the success rate of the Penalty attack to the example is not consistent in the connection and ratio of two types of baseline attacks (for balanced classes), in Fig- bytes features, as highlighted in red. The correct value is ure 6a (using the 412 Malicious testing examples classified shown for a feasible example in green. correctly). Overall, the Penalty objective is performing worse than the Projected one, reaching 79% success rate Projected attack results. We evaluate the success rate at L2 distance of 20. We observe that in this case both of the attack with Projected objective first for balanced baselines perform worse, and the attack improves upon classes (1:1 ratio). We compare in Figure 5a the attack both baselines significantly. The decrease of the model’s against two baselines: Baseline 1 (in which the features performance under the Penalty attack is illustrated in that are modified iteratively are selected at random), and Figure 6b (for 500 Malicious and 500 Benign testing Baseline 2 (in which, additionally, the amount of per- examples). While AUC is 0.87 originally on the testing turbation is sampled from a standard normal distribution dataset, it decreases to 0.59 under the evasion attacks N (0, 1)). The attacks are run on 412 Malicious testing at maximum allowed perturbation of 7. Furthermore, we examples classified correctly by the FFNN. The Projected measure the attack success rate at different imbalance attack improves both baselines, with Baseline 2 perform- ratios in Figure 6c (using the 62 testing examples clas- ing much worse, reaching success rate 57% at distance sified correctly by all models). 
For each ratio value we 20, and Baseline 1 having success 91.7% compared to searched for the best hyper-parameter c between 0 and our attack (98.3% success). This shows that the attacks 1 with step 0.05. Here, as with the Projected attack, we is still performing reasonably if feature selection is done see the same trend: as the imbalance ratio gets higher, randomly, but it is very important to add perturbation to the attack performs better, and it works best at imbalance features consistent with the optimization objective. ratio of 50. We also measure in Figure 5b the decrease of the Attack comparison. We compare the success rate of our model’s performance before and after the evasion attack attack using the two objectives (Projected and Penalty) at different perturbations (using 500 Malicious and 500 with the C&W attack, as well as an attack we call Post- Benign examples not used in training). While AUC score processing. The Post-processing attack runs directly the is 0.87 originally, it drastically decreases to 0.52 under original C&W developed for continuous domains, after evasion attack at perturbation 7. This shows the significant which it projects the adversarial example to the feasible degradation of the model’s performance under evasion space to enforce the constraints. In the Post-processing attack. attack, we look at each family of dependent features, keep Finally, we run the attack at different imbalance ratios the value of the representative feature as selected by the at- and measured its success for different perturbations. In this tack, but then modify the values of the dependent features
(a) Comparison to two baselines. (b) ROC curves under attack. (c) Imbalance sensitivity. Figure 6: Penalty attack results. axis) and feature importance (right axis). We observe that features of higher importance are chosen more frequently by the optimization attack. However, since we are modify- ing the representative feature in each family, the number of modifications on the representative feature is usually higher (it accumulates all the importance of the features in that family). For the Bytes family, feature 3 (number of received bytes) is the representative feature and it is updated more than 350 times. However, for features that have no dependencies (e.g., 68 – number of levels in the domain, 69 – number of sub-domains, 71 – domain Figure 7: Malicious domain classifier attacks. registration age, and 72 – domain registration validity), the number of updates corresponds to the feature importance. Feature Original Adversarial NIP 1 1 using the UPDATE DEP function. The success rate of all Total Recv Bytes 32.32 43653.50 these attacks is shown in Figure 7, using the 412 Malicious Total Sent Bytes 2.0 2702.62 testing examples classified correctly. The attacks based Avg Ratio Bytes 16.15 16.15 on our framework (with Projected and Penalty objectives) Registration Age 349 3616 perform best, as they account for feature dependencies TABLE 5: Adversarial example for the Projected attack during the adversarial point generation. The attack with (distance 10). the Projected objective has the highest performance (we suspect that the Penalty attack is sensitive to parameter Adversarial examples. We include an adversarial exam- c). The vanilla C&W has slightly worse performance at ple in Table 5 for the Projected attack. We only show the small perturbation values, even though it does not take features that are modified by the attack and their original into consideration the feature constraints and works in an value. As we observe, the attack preserves the feature enlarged feature space. Interestingly, the Post-processing dependencies: the average ratio of received bytes over attack performs worse (reaching only 0.005% success sent bytes (Avg Ratio Bytes) is consistent with number of at distance 20 – can generate 2 out of 412 adversarial received (Total Recv Bytes) and sent (Total Sent Bytes) examples). This demonstrates that it is not sufficient to bytes. In addition, the attack modifies the domain regis- run state-of-art attacks for continuous domains and then tration age, an independent feature, relevant in malicious adjust the feature dependencies, but more sophisticated domain classification [47]. However there is a higher attack strategies are needed. cost to change this feature: the attacker should register Number of features modified. We compare the number a malicious domain and wait to get a larger registration of features modified during the attack iterative algorithm age. If this cost is prohibitive, we can easily modify our to construct the adversarial examples for three attacks: framework to make this feature immutable (see Table 15 Projected, Penalty, and C&W. The histogram for the num- in Appendix for a list of features that can be currently ber of modified features is illustrated in Figure 8a. It is modified by the attack). not surprising that the C&W attack modifies almost all Weaker attack models. We consider a threat model features, as it works in L2 norms without any restriction in in which the adversary only knows the feature repre- feature space. 
Both the Projected and the Penalty attacks sentation, but not the exact ML model or the training modify a much smaller number of features (4 on average). data. One approach to generate adversarial examples is We are interested in determining if there is a relation- through transferability [52], [46], [68], [64], [21]. We ship between feature importance and choice of feature perform several experiments to test the transferability of by the optimization algorithm. For additional details on the Projected attacks against FFNN to logistic regression feature description, we include the list of features that (LR) and random forest (RF). Models were trained with can be modified in Table 15 in the Appendix. In Figure 8b different data and we vary the imbalance ratio. The results we plot the number of modifications for each feature (left are in Table 6. We observe that the largest transferability
(a) Histogram on feature modifications. (b) Number of updates (left) and feature importance (right). Figure 8: Feature statistics. rate to both LR and RF is for the highest imbalanced other more effective methods of performing black-box ratio of 50 (98.2% adversarial examples transfer to LR attacks in future work. and 94.8% to RF). As we increase the imbalance ratio, the transfer rate increases, and the transferability rate to 5.4. Adversarial Training LR is lower than to RF. Finally, we looked at defensive approaches to protect Ratio DNN LR RF 1 100% 40% 51.7% ML classifiers in security analytics tasks. One of the most 5 93.3% 66.5% 82.9% robust defensive technique against adversarial examples 15 99% 60.9% 90.2% is adversarial training [29], [48]. We trained FFNN us- 25 100% 47.6% 68.8% ing adversarial training with the Projected attack at L2 50 100% 98.2% 94.8% distance 20. We trained the model adversarially for 11 TABLE 6: Transferability of adversarial examples from epochs and obtain AUC score of 89% (each epoch takes FFNN to LR (third column) and RF (fourth column). We approximately 7 hours). We measure the Projected attack’s vary the ratio of Benign to Malicious in training. Column success rate for the balanced case against the standard FFNN shows the white-box attack success rate. and adversarially training models in Figure 9. Interest- ingly, the success rate of the evasion attacks significantly We also look at the transferability between different drops for the adversarially-trained model and reaches only FFNN architectures trained on different datasets (results 16.5% at 20 L2 distance. This demonstrates that adversar- in Table 7). The attacks transfer best at highest imbalance ial training is a promising direction for designing robust ratio (with success rate higher than 96%), confirming that ML models for security. weaker models are easier to attack. Ratio DNN1 DNN2 DNN3 [80, 50] [160, 80] [100, 50, 25] 1 100% 57.6% 42.3% 5 93.3% 73.6% 58.6% 15 99% 78.6% 52.4% 25 100% 51.4% 45.3% 50 100% 96% 97.1% TABLE 7: Transferability between different FFNN archi- tectures (number of neurons per layer in the second row). Adversarial examples are computed against DNN1 and transferred to DNN2 and DNN3. Figure 9: Success rate of the Projected attack against adversarially and standard trained model. Alternative approaches to perform black-box attacks is to use substitute model and synthetic training inputs la- beled by the target classifier using black-box queries [53] or to query the ML classifier and estimate gradient val- 6. Experimental evaluation for malicious ues [37]. Running directly existing black-box attacks does not generate feasible adversarial examples, thus we connection classifier adapted the black-box attack of Ilyas et al. [37] to our setting (assuming the attacker knows the feature represen- Hyperparameter Value tation). When estimating the gradient of the attacker’s loss Architecture [256, 128, 64] Optimizer Adam function, we use finite difference that incorporates time- Learning Rate 0.00026 dependent information and perform our standard proce- dure of updating feature dependencies. The attack success TABLE 8: DNN Architecture is only 28.4% (with 48 queries). We plan to investigate
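The black-box variant discussed above replaces exact gradients with query-based estimates. A minimal sketch of coordinate-wise finite differences over the modifiable features is shown below (names such as query_score are illustrative assumptions; the paper adapts the estimator of Ilyas et al. [37] with time-dependent information).

    import numpy as np

    def estimate_gradient(x, query_score, indices, eps=1e-3):
        # Two-sided finite differences: g_i ~ (G(x + eps*e_i) - G(x - eps*e_i)) / (2*eps),
        # computed only for the features the attacker is allowed to modify.
        g = np.zeros_like(x, dtype=float)
        for i in indices:
            e = np.zeros_like(x, dtype=float)
            e[i] = eps
            g[i] = (query_score(x + e) - query_score(x - e)) / (2 * eps)
        return g

The estimated gradient then takes the place of the white-box gradient in the feature-selection and family-update steps of the framework.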
(a) Projected attack success rate. (b) ROC curves under attack. (c) Average number of updated ports. Figure 10: Projected attack results on malicious connection classifier. Training scenario F1 AUC 1, 2 0.94 0.96 6.2. Classification results 1, 9 0.96 0.97 2, 9 0.83 0.79 We perform model selection and training for a number of FFNN architectures on all combinations of two sce- TABLE 9: Training results for FFNN. narios, and tested the models for generality on the third scenario. The best architecture is illustrated in Table 8. Feature Input Delta Adversary Total TCP 6809 12 6821 It consists of three layers with 256, 128 and 64 hidden Total Sent Pkts 29 1044 1073 layers. We used the Adam optimizer, 50 epochs for train- Max Sent Pkts 11 76 87 ing, mini-batch of 64, and a learning rate of 0.00026. Sum Sent Bytes 980 1348848 1349828 The F1 and AUC scores for all combinations of training Max Sent Bytes 980 111424 112404 Total Duration 2.70 5151.48 5154.19 scenarios are illustrated in Table 9. We also compared the Max Duration 2.21 430.26 432.47 performance of FFNN with logistic regression and random forest, but we omit the results (FFNN achieved similar TABLE 10: Feature statistics update when generating an performance to random forest). For the adversarial attacks, adversarial example at distance 14, on port 443. we choose the scenarios with best performance: training on 1, 9, and testing on 2. In this application we have access to raw network 6.3. Robustness to evasion attacks connections (in Bro log format), which provides the op- portunity to generate feasible adversarial examples in both We show the Projected attack’s performance, discuss feature representation and raw data space. We show how which ports were updated most frequently, and show an attacker can insert new realistic network connections an adversarial examples and the corresponding Bro logs to change the prediction of Malicious activity. We only records. The testing data for the attack is 407 Malicious analyze the Projected attack here, as it demonstrated best examples from scenario 2, among which 397 were pre- performance in the previous application. The code of the dicted correctly by the classifier. attack and the dataset are available at https://github.com/ achernikova/cybersecurity evasion. The malicious domain Evasion attack performance. First, we analyze the attack dataset is proprietary and we cannot release it. success rate with respect to the allowed perturbation, shown in Figure 10a. The attack reaches 99% success We start with a description of the CTU-13 dataset rate at L2 distance 16. Interestingly, in this case the in Section 6.1, then we show the performance of FFNN two baselines perform poorly, demonstrating again the for connection classification in Section 6.2. Finally, we clear advantages of our framework. We plot next the present the analysis on model robustness in Section 6.3. ROC curves under evasion attack in Figure 10b (using the 407 Malicious examples and 407 Benign examples 6.1. CTU-13 dataset from testing scenario 2). At distance 8, the AUC score is 0.93 (compared to 0.98 without adversarial examples), CTU-13 is a collection of 13 scenarios including both but there is a sudden change at distance 10, with AUC legitimate traffic from a university campus network, as score dropping to 0.77. Moreover, at distance 12, the well as labeled connections of malicious botnets [27]. 
AUC reaches 0.12, showing the model’s degradation under We restrict to three scenarios for the Neris botnet (1, evasion attack with relatively small distance. 2, and 9). We choose to train on two of the scenarios Ports family statistics. We show the average number of and test the models on the third, to guarantee indepen- port families updated during the attack in Figure 10c. The dence between training and testing data. The training data maximum number is 3 ports, but it decreases to 1 port at has 3869 Malicious examples, 194,259 Benign examples, distance higher than 12. While counter-intuitive, this can and an imbalance ratio of 1:50. There is a set of 432 be explained by the fact that at larger distances the attacker statistical features that the attacker can modify (the ones can add larger perturbation to the aggregated statistics of that correspond to the characteristics of sent traffic). The one port, crossing the decision boundary. physical constraints and statistical dependencies on bytes In Table 12 we include the port families selected and duration have been detailed in Section 4.1. during attack, at distance 8, as well as their importance.
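To illustrate how a selected port family is kept consistent during these updates (Section 4.1, Algorithm 2), the sketch below adds c1 TCP and c2 UDP connections and adjusts the sent-byte statistics. The feature names, gradient dictionary, and byte bounds are illustrative assumptions (duration is handled analogously and omitted for brevity).

    def project(value, low, high):
        # Clamp a proposed perturbation into its physically feasible range.
        return max(low, min(high, value))

    def update_port_family(f, grad, c1, c2, tcp_rng=(40, 1500), udp_rng=(28, 1500)):
        # f: dict of per-port features; grad: gradient values keyed by feature name.
        f = dict(f)
        n_new = c1 + c2
        f["num_tcp"] += c1
        f["num_udp"] += c2
        # Change in total sent bytes, driven by the gradient and projected into the
        # range implied by per-connection packet-size limits for the added connections.
        delta_b = project(-grad["total_sent_bytes"],
                          c1 * tcp_rng[0] + c2 * udp_rng[0],
                          c1 * tcp_rng[1] + c2 * udp_rng[1])
        f["total_sent_bytes"] += delta_b
        per_conn = delta_b / n_new
        f["min_sent_bytes"] = min(f["min_sent_bytes"], per_conn)
        f["max_sent_bytes"] = max(f["max_sent_bytes"], per_conn)
        return f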