Holistic Adversarial Robustness of Deep Learning Models
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Holistic Adversarial Robustness of Deep Learning Models Pin-Yu Chen1,3∗ , Sijia Liu2,3 1 IBM Research, 2 Michigan State University, 3 MIT-IBM Watson AI Lab pin-yu.chen@ibm.com and liusiji5@msu.edu arXiv:2202.07201v1 [cs.LG] 15 Feb 2022 Abstract Adversarial robustness studies the worst-case per- formance of a machine learning model to ensure safety and reliability. With the proliferation of deep-learning based technology, the potential risks associated with model development and deploy- ment can be amplified and become dreadful vul- nerabilities. This paper provides a comprehensive overview of research topics and foundational prin- ciples of research methods for adversarial robust- Figure 1: Holistic view of adversarial attack categories and capabili- ness of deep learning models, including attacks, de- ties (threat models) in the training and deployment phases. The three fenses, verification, and novel applications. types of attacks highlighted in colors (poisoning/backdoor/evasion attack) are the major focus of this paper. In the deployment phase, the target (victim) can be an access-limited black-box system (e.g. a 1 Introduction prediction API) or a transparent white-box model. Deep learning [LeCun et al., 2015] is a core engine that drives selection and training, to model deployment and system in- recent advances in artificial intelligence (AI) and machine tegration – this paper aims to provide a holistic overview of learning (ML), and it has broad impacts on our society and adversarial robustness for deep learning models. The research technology. However, there is a growing gap between AI themes include: (i) attack (risk identification and demonstra- technology’s creation and its deployment in the wild. One tion), (ii) defense (threat detection and mitigation), (iii) verifi- critical example is the lack of robustness, including natural cation (robustness certificate), and (iv) novel applications. In robustness to data distribution shifts, ability in generalization each theme, the fundamental concepts and key research prin- and adaptation to new tasks, and worst-case robustness when ciples will be presented in a unified and organized manner. facing an adversary (also known as adversarial robustness). Figure 1 shows the lifecycle of AI development and de- According to a recent Gartner report1 , 30% of cyberattacks ployment and different adversarial threats corresponding to by 2022 will involve data poisoning, model theft or adversar- attackers’ capabilities (also known as threat models). The ial examples. However, the industry seems underprepared. In lifecycle is further divided into two phases. The training a survey of 28 organizations spanning small as well as large phase includes data collection and pre-processing, as well as organizations, 25 organizations did not know how to secure model selection (e.g. architecture search and design), hyper- their AI/ML systems [Kumar et al., 2020]. Moreover, vari- parameter tuning, model parameter optimization, and valida- ous practical vulnerabilities and incidences incurred by AI- tion. After model training, the model is “frozen” (fixed model empowered systems have been reported in real life, such as architecture and parameters) and is ready for deployment. Adversarial ML Threat Matrix2 and AI Incident Database3 . Before deployment, there are possibly some post-hoc model To prepare deep-learning enabled AI systems for the real adjustment steps such as model compression and quantifica- world and to familiarize researchers with the error-prone risks tion for memory/energy reduction, calibration or risk mitiga- hidden in the lifecycle of AI model development and deploy- tion. The frozen model providing inference/prediction can ment – spanning from data collection and processing, model be deployed in a white-box or black-box manner. The for- mer means the model details are transparent to a user (e.g. ∗ releasing the model architecture and pre-trained weights for Contact Author 1 https://www.gartner.com/smarterwithgartner/ neural networks), while the latter means a user can access gartner-top-10-strategic-technology-trends-for-2020 to model predictions but does not know what the model is 2 https://github.com/mitre/advmlthreatmatrix (i.e., an access-limited model), such as a prediction API. The 3 https://incidentdatabase.ai/ gray-box setting is an mediocre scenario that assumes a user
Symbol Meaning Attack Objective fθ : Rd 7→ [0, 1]K K-way neural network classification model parameterized by θ Poisoning Design a poisoned dataset Dpoison such that models trained logit : Rd 7→ RK logit (pre-softmax) representation on Dpoison fail to generalize on Dtest (i.e. ŷθ (xtest ) 6= ytest ) (x, y) data sample x and its associated groundtruth label y Backdoor Embed a trigger ∆ with a target label t to Dtrain such that ŷθ (x) ∈ [K] top-1 label prediction of x by fθ ŷθ (xtest ) = ytest but ŷθ (xtest + ∆) = t Evasion (untargeted) Given fθ , find xadv such that xadv is similar to x but ŷθ (xadv ) 6= y xadv adversarial example of x Evasion (targeted) Given fθ , find xadv such that xadv is similar to x but ŷθ (xadv ) = t δ adversarial perturbation to x for evasion attack ∆ universal trigger pattern for backdoor attack Table 2: Objectives of adversarial attacks. t ∈ [K] target label for targeted attack loss(fθ (x), y) classification loss (e.g. cross entropy) g attcker’s loss function which can be realized through noisy data collection such as Dtrain / Dtest original training / testing dataset crowdsourcing. Specifically, the memorization effect of deep T data transformation function learning models [Zhang et al., 2017; Carlini et al., 2019b] can Table 1: Mathematical notation. be leveraged as vulnerabilities. We note that sometimes the term “data poisoning” entails both poisoning and backdoor knowing partial information about the deployed model. In attacks, though their attack objectives are different. some cases, a user may have knowledge of the training data and the deployed model is black-box, such as the case of an Poisoning attack aims to design a poisoned dataset Dpoison AI automation service that only returns a model prediction such that models trained on Dpoison will fail to generalize on portal based on user-provided training data. We also note that Dtest (i.e. ŷθ (xtest ) 6= ytest ). The poisoned dataset Dpoison these two phases can be recurrent: a deployed model can re- can be created by modifying the original training dataset enter the training phase with continuous model/data updates. Dtrain , such as label flipping, data addition/deletion, and fea- Throughout this paper, we focus on adversarial robust- ture modification. The rationale is that training on Dpoison will ness of neural networks for classification tasks. Many prin- land on a “bad” local minimum of model parameters. ciples in classification can be naturally extended to other ma- To control the amount of data modification and reduce the chine learning tasks, which will be discussed in Section 4. overall accuracy on Dtest (i.e. test accuracy), poisoning attack Based on Figure 1, this paper will focus on training-phase often assumes the knowledge of target model and its train- and deployment-phase attacks driven by the limitation of cur- ing method [Jagielski et al., 2018]. [Liu et al., 2020b] pro- rent ML techniques. While other adversarial threats concern- poses black-box poisoning with additional conditions on the ing model/data privacy and integrity are also crucial, such training loss function. Targeted poisoning attack aims at ma- as model stealing, membership inference, data leakage, and nipulating the prediction of a subset of data samples in Dtest , model injection, they will not be covered in this paper. We which can be accomplished by clean-label poisoning (small also note that adversarial robustness of non-deep-learning perturbations to a subset of Dtrain while keeping their labels models such as support vector machines has been investi- intact) [Shafahi et al., 2018; Zhu et al., 2019] or gradient- gated. We refer the readers to [Biggio and Roli, 2018] for matching poisoning [Geiping et al., 2021]. the research evolution in adversarial machine learning. Table 1 summarizes the main mathematical notations. We Backdoor attack is also known as Trojan attack. The cen- use [K] = {1, 2, . . . , K} to denote the set of K class labels. tral idea is to embed a universal trigger ∆ to a subset of Without loss of generality, we assume the data inputs are vec- data samples in Dtrain with a modified target label t [Gu et torized (flattened) as d-dimensional real-valued vectors, and al., 2019]. Examples of trigger patterns are a small patch the output (class confidence) of the K-way neural network in images and a specific text string in sentences. Typically, classifier fθ is nonnegative and sum to 1 (e.g. softmax as backdoor attack only assumes access to the training data and PK does not assume the knowledge of the model and its training. the final layer), that is, k=1 [fθ (·)]k = 1. The adversar- The model fθ trained on the tampered data is called a back- ial robustness of real-valued continuous data modalities such doored (Trojan) model. Its attack objective has two folds: as image, audio, time series, and tabular data can be studied (i) High standard accuracy in the absence of trigger – the based on a unified methodology. For discrete data modalities backdoored model should behave like a normal model (same such as texts and graphs, one can leverage their real-valued model trained on untampered data), i.e., ŷθ (xtest ) = ytest . (ii) embeddings [Lei et al., 2019], latent representations, or con- High attack success rate in the presence of trigger – the back- tinuous relaxation of the problem formulation (e.g. topology doored model will predict any data input with the trigger as attack in terms of edge addition/deletion in graph neural net- the target label t, i.e., ŷθ (xtest + ∆) = t. Therefore, backdoor works [Xu et al., 2019a]). Unless specified, in what follows attack is stealthy and insidious. The trigger pattern can also we will not further distinguish data modalities. be made input-aware and dynamic [Nguyen and Tran, 2020]. There is a growing concern of backdoor attacks in emerg- 2 Attacks ing machine learning systems featuring collaborative model This section will cover mainstream adversarial threats that training with local private data, such as federated learning aim to manipulate the prediction and decision-making of an [Bhagoji et al., 2019; Bagdasaryan et al., 2020]. Backdoor AI model through training-phase or deployment-phase at- attacks can be made more insidious by leveraging the in- tacks. Table 2 summarizes their attack objectives. nate local model/data heterogeneity and diversity [Zawad et al., 2021]. [Xie et al., 2020] proposes distributed backdoor 2.1 Training-Phase Attacks attacks by trigger pattern decomposition among malicious Training-phase attacks assume the ability to modify the train- clients to make the attack more stealthy and effective. We ing data to achieve malicious attempts on the resulting model, also refer the readers to the detailed survey of these two at-
Norm Meaning `0 number of modified features `1 total changes in modified features `2 Euclidean distance between x and xadv `∞ maximal change in modified features Table 3: `p norm similarity measures for additive perturbation δ = xadv − x. The change in each feature (dimension) between x and xadv is measured in absolute value. Figure 2: Taxonomy and illustration of evasion attacks. tacks in [Goldblum et al., 2020]. viewed as a black-box function and thus backpropagation for 2.2 Deployment-Phase Attacks computing input gradient is infeasible without knowing the The objective of deployment-phase attacks is to find a “sim- model details. In the soft-label black-box attack setting, an ilar” example T (x) of x such that the fixed model fθ will attacker can observe (parts of) class predictions and their as- evade its prediction from the original groundtruth label y. The sociated confidence scores. In the hard-label black-box attack evasion condition can be further separated into two cases: (i) (decision-based) setting, an attacker can only observe the top- untargeted attack such that fθ (x) = y but fθ (T (x)) 6= y, 1 label prediction, which is the least information required to or (ii) targeted attack such that fθ (x) = y but fθ (T (x)) = be returned to remain utility of the model. In addition to at- t, t 6= y. Such T (x) is known as an adversarial ex- tack success rate, query efficiency is also an important metric ample4 of x [Biggio et al., 2013; Szegedy et al., 2014; for performance evaluation of black-box attacks. Goodfellow et al., 2015], and it can be interpreted as out-of- Transfer attack is a branch of black-box attack that uses distribution sample or generalization error [Stutz et al., 2019]. adversarial examples generated from a white-box surrogate Data Similarity. Depending on data characteristics, speci- model to attack the target model. The surrogate model can fying a transformation function T (·) that preserves data sim- be either pre-trained [Liu et al., 2017] or distilled from a set ilarity between an original sample x and its transformed sam- of data samples with soft labels given by the target model for ple T (x) is a core mission for evasion attacks. A common training [Papernot et al., 2016; Papernot et al., 2017]. practice to select T (·) is through a simple additive perturba- Attack formulation. The process of finding an adversar- tion δ such that xadv = x + δ, or through domain-specific ial perturbation δ can be formulated as a constrained opti- knowledge such as rotation, object translation, and color mization problem with a specified attack loss g(δ|x, y, t, θ) changes for image data [Hosseini and Poovendran, 2018; reflecting the attack objective (t is omitted for untargeted at- Engstrom et al., 2019]. For additive perturbation (either on tack). The variation in problem formulations and solvers will data input or parameter(s) simulating semantic changes), the lead to different attack algorithms. We specify three exam- P 1/p d p ples below. Without loss of generality, we use `p norm as the `p norm (p ≥ 1) of δ defined as kδkp , i=1 i|δ | similarity measure (distortion) and untargeted attack as the and the pseudo norm `0 are surrogate metrics for measuring objective, and assume that all feasible data inputs lie in the similarity distance. Table 3 summarizes popular choices of scaled space S = [0, 1]d . `p norms and their meanings. Take image as an example, `0 • Minimal-distortion formulation: norm is used to design few-pixel (patch) attacks [Su et al., 2019], `1 norm is used to generate sparse and small perturba- Minimizeδ:x+δ∈S kδkp subject to ŷθ (x + δ) 6= ŷθ (x) (1) tions [Chen et al., 2018b], `∞ norm is used to confine maxi- mal changes in pixel values [Szegedy et al., 2014], and mixed • Penalty-based formulation: `p norms can also be used [Xu et al., 2019b]. Minimizeδ:x+δ∈S kδkp + λ · g(δ|x, y, θ) (2) Evasion Attack Taxonomy. Evasion attacks can be cate- gorized based on attackers’ knowledge of the target model. • Budget-based (norm bounded) formulation: Figure 2 illustrates the taxonomy of different attack types. White-box attack assumes complete knowledge about the Minimizeδ:x+δ∈S g(δ|x, y, θ) subject to kδkp ≤ (3) target model, including model architecture and model param- For untargeted attacks, the attacker’s loss can be the nega- eters. Consequently, an attacker can exploit the auto differ- tive classification loss g(δ|x, y, θ) = −loss(fθ (x + δ), y) entiation function offered by deep learning packages, such as or the truncated class margin loss (using either logit or soft- backpropagation (input gradient) from the model output to max output) defined as g(δ|x, y, θ) = max{[logit(x + δ)]y − the model input, to craft adversarial examples. maxk∈[K],k6=y [logit(x + δ)]k + κ, 0}. The margin loss sug- Black-box attack assumes an attacker can only observe the gests that g(δ|x, y, θ) achieves minimal value (i.e. 0) when model prediction of a data input (that is, a query) and does the top-1 class confidence score excluding the original class not know any other information. The target model can be y satisfies maxk∈[K],k6=y [logit(x + δ)]k ≥ logit(x + δ)]y + κ, 4 Sometimes adversarial example may carry a broader meaning where κ ≥ 0 is a tuning parameter governing their confi- of any data sample xadv that leads to incorrect prediction and there- dence gap. Similarly, for targeted attacks, the attacker’s loss fore dropping the dependence to a reference sample x, such as unre- can be g(δ|x, y, t, θ) = loss(fθ (x + δ), t) or g(δ|x, y, t, θ) = stricted adversarial examples [Brown et al., 2018]. max{maxk∈[K],k6=t [logit(x + δ)]k − [logit(x + δ)]t + κ, 0}.
When implementing black-box attacks, the logit margin loss defense is essentially a cat-and-mouse game. Many seem- can be replaced with the observable model output log fθ . ingly successfully empirical defenses were later weakened The attack formulation can be generalized to the univer- by advanced attacks that is defense-aware, which gives a sal perturbation setting such that it simultaneously evades all false sense of adversarial robustness due to information ob- model predictions. The universality can be w.r.t. data samples fuscation [Carlini and Wagner, 2017a; Athalye et al., 2018]. [Moosavi-Dezfooli et al., 2017], model ensembles [Tramèr Consequently, defenses are expected to be fully evaluated et al., 2018], or various data transformations [Athalye and against the best possible adaptive attacks that are defense- Sutskever, 2018]. [Wang et al., 2021a] shows that min-max aware [Carlini et al., 2019a; Tramer et al., 2020]. optimization can yield effective universal perturbations. While empirical robustness refers to the model perfor- mance against a set of known attacks, it may fail to serve Selected Attack Algorithms. We show some white-box as a proper robustness indicator against advanced and unseen and black-box attack algorithms driven by the three afore- attacks. To address this issue, certified robustness is used to mentioned attack formulations. For the minimal-distortion ensure the model is provably robust given a set of attack con- formulation, the attack constraint ŷθ (x + δ) 6= ŷθ (x) can be ditions (threat models). Verification can be viewed as a pas- rewritten as maxk∈[K],k6=y [fθ (x+δ)]k ≥ [fθ (x+δ)]y , which sive certified defense in the sense that its goal is to quantify a can be used to linearize the local decision boundary around x given model’s (local) robustness with guarantees. and allow for efficient projection to the closest linearized de- cision boundary, leading to white-box attack algorithms such 3.1 Empirical Defenses as DeepFool [Moosavi-Dezfooli et al., 2016] and fast adap- tive boundary (FAB) attack [Croce and Hein, 2020]. For the Empirical defenses are effective methods applied during penalty-based formulation, one can use change-of-variable on training or deployment phase to improve adversarial robust- δ to convert to an unconstrained optimization problem and ness without provable guarantees on their effectiveness. then use binary search on λ to find the smallest λ leading For training-phase attacks, data filtering and model finetu- to successful attack (i.e., g = 0), known as Carlini-Wagner ing are major approaches. For instance, [Tran et al., 2018] (C&W) white-box attack [Carlini and Wagner, 2017b]. For shows removing outliers using learned latent representations the budget-based formulation, one can apply projected gra- and retraining the model can reduce the poison effect. To dient descent (PGD), leading to the white-box PGD attack inspect a pre-trained model has backdoor or not, Neural [Madry et al., 2018]. Attack algorithms using input gradients Cleanse [Wang et al., 2019] reverse-engineers potential trig- of the loss function are called gradient-based attacks. ger patterns for detection. [Wang et al., 2020] proposes data- efficient detectors that require only one sample per class and Black-box attack algorithms often adopt either the penalty- are made data-free for convolutional neural networks. [Zhao based or budget-based formulation. Since the input gradient et al., 2020a] exploits the mode connectivity in the loss land- of the attacker’s loss is unavailable to obtain in the black-box scape to recover a backdoored model using limited clean data. setting, one principal approach is to perform gradient esti- For deployment-phase attacks, adversarial input detection mation using model queries and then use the estimated gra- schemes that exploit data characteristics such as spatial or dient to replace the true gradient in white-box attack algo- temporal correlations are shown to be effective, such as the rithms, leading to the zeroth-order optimization (ZOO) based detection of audio adversarial example using temporal depen- black-box attacks [Chen et al., 2017]. The choices in gradi- dency [Yang et al., 2019]. For training adversarially robust ent estimators [Tu et al., 2019; Nitin Bhagoji et al., 2018; models, adversarial training that aims to minimize the worst- Liu et al., 2018; Liu et al., 2019; Ilyas et al., 2018] and case loss evaluated by perturbed examples generated during ZOO solvers [Zhao et al., 2019; Zhao et al., 2020b; Ilyas et training is so far the strongest empirical defense [Madry et al., 2019] will give rise to different attack algorithms. The al., 2018]. Specifically, the standard formulation of adversar- choices in gradient estimators and ZOO solvers will give rise ial training can be expressed as the following min-max opti- to different attack algorithms [Liu et al., 2020a]. Hard-label mization over training samples {xi , yi }ni=1 : black-box attacks can still adopt ZOO principles by spend- ing extra queries to explore local loss landscapes for gra- n X dient estimation [Cheng et al., 2019; Cheng et al., 2020b; min max loss(fθ (xi + δ), y) (4) θ kδi kp ≤ Chen et al., 2020], which is more query-efficient than ran- i=1 dom exploration [Brendel et al., 2018]. The worst-case loss corresponding to the inner maximiza- Physical adversarial example is a prediction-evasive tion step is often evaluated by gradient-based attacks such physical adversarial object. Examples include stop sign as PGD attack [Madry et al., 2018]. Variants of adversarial [Eykholt et al., 2018], eye glass [Sharif et al., 2016], ph- training methods such as TRADES [Zhang et al., 2019] and sical patch [Brown et al., 2017], 3D printing [Athalye and customized adversarial training (CAT) [Cheng et al., 2020a] Sutskever, 2018], and T-shirt [Xu et al., 2020b]. have been proposed for improved robustness. [Cheng et al., 2021] proposes attack-independent robust training based on 3 Defenses and Verification self-progression. On 18 different ImageNet pre-trained mod- els, [Su et al., 2018] unveils an undesirable trade-off between Defenses are adversarial threat detection and mitigation standard accuracy and adversarial robustness. This trade-off strategies, which can be divided into empirical and certi- can be improved with the use of unlabeled data [Carmon et fied defenses. We note that the interplay between attack and al., 2019; Stanforth et al., 2019].
3.2 Certified Defenses augmentation tools to simultaneously improve model gener- Certified defenses provide performance guarantees on the alization and adversarial robustness [Hsu et al., 2022; Lei et hardened models. Adversarial attacks are ineffective if their al., 2019]. Other noteworthy applications include image syn- threat models fall within the provably robust conditions. thesis [Santurkar et al., 2019] generating contrastive explana- For training-phase attacks, [Steinhardt et al., 2017] pro- tions [Dhurandhar et al., 2018], and robust text CAPTCHAs poses certified data sanitization against poisoning attacks. [Shao et al., 2021]. [Weber et al., 2020] proposes randomized data training for Adversarial Robustness Beyond Classification and Input certified defense against backdoor attacks. For deployment- Perturbation. The formulations and principles in attacks phase attacks, randomized smoothing is an effective and and defenses for classification can be analogously applied to model-agnostic approach that adds random noises to a data other machine learning tasks. Examples include sequence-to- input to “smooth” the model and perform majority voting on sequence translation [Cheng et al., 2020c] and image caption- the model predictions. The certified radius (region) in `p - ing [Chen et al., 2018a]. Beyond input perturbation, the ro- norm perturbation ensuring consistent class prediction can bustness of model parameter perturbation [Tsai et al., 2021] be computed by information-theoretical approach [Li et al., also relates to model quantification [Weng et al., 2020] and 2019], differential privacy [Lecuyer et al., 2019], Neyman- energy-efficient inference [Stutz et al., 2020]. Pearson lemma [Cohen et al., 2019], or higher-order certifi- cation [Mohapatra et al., 2020a]. Instilling Adversarial Robustness into Foundation Mod- els. As foundation models adapt task-independent pre- 3.3 Verification training for general representation learning followed by task- Verification is often used in certifying local robustness against specific fine-tuning for fast adaptation, it is of utmost impor- evasion attacks. Given a neural network fθ and a data sample tance to understand (i) how to incorporate adversarial robust- x, verification (in its simplest form) aims to maximally certify ness into foundation model pre-training and (ii) how to maxi- an `p -norm bounded radius r on the perturbation δ to ensure mize adversarial robustness transfer from pre-training to fine- the model prediction on the perturbed sample x + δ is consis- tuning. [Fan et al., 2021; Wang et al., 2021b] show promising tent as long as δ is within the certified region. That is, for any results in adversarial robustness preservation and transfer in δ such that kδkp ≤ r, ŷθ (x + δ) = ŷθ (x). The certified radius meta learning and contrastive learning. The rapid growth and is a robustness certificate relating to the distance to the clos- intensifying demand on foundation models create a unique est decision boundary, which is computationally challenging opportunity to advocate adversarial robustness as a necessary (NP-complete) for neural networks [Katz et al., 2017]. How- native property in next-generation trustworthy AI tools. ever, its estimate (hence not a certificate) can be efficiently computed and used as a model-agnostic robustness metric, References such as the CLEVER score [Weng et al., 2018b]. To address the non-linearity induced by layer propagation in neural net- [Aramoon et al., 2021] Omid Aramoon, Pin-Yu Chen, and works, solving for a certified radius is often cast as a relaxed Gang Qu. Don’t forget to sign the gradients! MLSyS, optimization problem. The methods include convex poly- 3, 2021. tope [Wong and Kolter, 2018], semidefinite programming [Athalye and Sutskever, 2018] Anish Athalye and Ilya [Raghunathan et al., 2018], dual optimization [Dvijotham et Sutskever. Synthesizing robust adversarial examples. al., 2018], layer-wise linear bounds [Weng et al., 2018a], ICML, 2018. and interval bound propagation [Gowal et al., 2019]. The verification tools are also expanded to support general net- [Athalye et al., 2018] Anish Athalye, Nicholas Carlini, and work architectures [Zhang et al., 2018; Boopathy et al., 2019; David Wagner. Obfuscated gradients give a false sense of Xu et al., 2020a] and semantic adversarial examples [Moha- security: Circumventing defenses to adversarial examples. patra et al., 2020b]. The intermediate certified results can ICML, 2018. be used to train a more certifiable model [Wong et al., 2018; [Bagdasaryan et al., 2020] Eugene Bagdasaryan, Andreas Boopathy et al., 2021]. However, scalability to large-sized Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov. neural networks remains a major challenge in verification. How to backdoor federated learning. AISTATS, 2020. [Bhagoji et al., 2019] Arjun Nitin Bhagoji, Supriyo 4 Remarks and Discussion Chakraborty, Prateek Mittal, and Seraphin Calo. Analyz- Here we make several concluding remarks and discussion. ing federated learning through an adversarial lens. ICML, Novel Applications. The insights from studying adversar- pages 634–643, 2019. ial robustness have led to several new use cases. Adver- [Biggio and Roli, 2018] Battista Biggio and Fabio Roli. sarial perturbation and data poisoning are used in generat- Wild patterns: Ten years after the rise of adversarial ma- ing contrastive explanations [Dhurandhar et al., 2018], per- chine learning. Pattern Recognition, 84:317–331, 2018. sonal privacy protection [Shan et al., 2020], data/model watermarking and fingerprinting [Sablayrolles et al., 2020; [Biggio et al., 2013] Battista Biggio, Igino Corona, Davide Aramoon et al., 2021; Wang et al., 2021c], and data-limited Maiorca, Blaine Nelson, Nedim Šrndić, Pavel Laskov, transfer learning [Tsai et al., 2020; Yang et al., 2021]. Ad- Giorgio Giacinto, and Fabio Roli. Evasion attacks against versarial examples with proper design are also efficient data machine learning at test time. ECML PKDD, 2013.
[Boopathy et al., 2019] Akhilan Boopathy, Tsui-Wei Weng, [Cheng et al., 2019] Minhao Cheng, Thong Le, Pin-Yu Pin-Yu Chen, Sijia Liu, and Luca Daniel. CNN-cert: An Chen, Jinfeng Yi, Huan Zhang, and Cho-Jui Hsieh. Query- efficient framework for certifying robustness of convolu- efficient hard-label black-box attack: An optimization- tional neural networks. AAAI, 2019. based approach. ICLR, 2019. [Boopathy et al., 2021] Akhilan Boopathy, Tsui-Wei Weng, [Cheng et al., 2020a] Minhao Cheng, Qi Lei, Pin-Yu Chen, Sijia Liu, Pin-Yu Chen, Gaoyuan Zhang, and Luca Daniel. Inderjit Dhillon, and Cho-Jui Hsieh. Cat: Customized ad- Fast training of provably robust neural networks by sin- versarial training for improved robustness. arXiv preprint gleprop. AAAI, 2021. arXiv:2002.06789, 2020. [Brendel et al., 2018] Wieland Brendel, Jonas Rauber, and [Cheng et al., 2020b] Minhao Cheng, Simranjit Singh, Matthias Bethge. Decision-based adversarial attacks: Re- Patrick H. Chen, Pin-Yu Chen, Sijia Liu, and Cho-Jui liable attacks against black-box machine learning models. Hsieh. Sign-OPT: A query-efficient hard-label adversarial ICLR, 2018. attack. ICLR, 2020. [Brown et al., 2017] Tom B Brown, Dandelion Mané, Aurko [Cheng et al., 2020c] Minhao Cheng, Jinfeng Yi, Huan Roy, Martı́n Abadi, and Justin Gilmer. Adversarial patch. Zhang, Pin-Yu Chen, and Cho-Jui Hsieh. Seq2sick: Evalu- arXiv preprint arXiv:1712.09665, 2017. ating the robustness of sequence-to-sequence models with [Brown et al., 2018] Tom B Brown, Nicholas Carlini, adversarial examples. AAAI, 2020. Chiyuan Zhang, Catherine Olsson, Paul Christiano, and [Cheng et al., 2021] Minhao Cheng, Pin-Yu Chen, Sijia Liu, Ian Goodfellow. Unrestricted adversarial examples. arXiv Shiyu Chang, Cho-Jui Hsieh, and Payel Das. Self- preprint arXiv:1809.08352, 2018. progressing robust training. AAAI, 2021. [Carlini and Wagner, 2017a] Nicholas Carlini and David [Cohen et al., 2019] Jeremy M Cohen, Elan Rosenfeld, and Wagner. Adversarial examples are not easily detected: By- J Zico Kolter. Certified adversarial robustness via random- passing ten detection methods. ACM Workshop on Artifi- ized smoothing. ICML, 2019. cial Intelligence and Security, pages 3–14, 2017. [Croce and Hein, 2020] Francesco Croce and Matthias Hein. [Carlini and Wagner, 2017b] Nicholas Carlini and David Minimally distorted adversarial examples with a fast adap- Wagner. Towards evaluating the robustness of neural net- tive boundary attack. ICML, pages 2196–2205, 2020. works. IEEE S&P, pages 39–57, 2017. [Carlini et al., 2019a] Nicholas Carlini, Anish Athalye, [Dhurandhar et al., 2018] Amit Dhurandhar, Pin-Yu Chen, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dim- Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan itris Tsipras, Ian Goodfellow, Aleksander Madry, and Shanmugam, and Payel Das. Explanations based on the Alexey Kurakin. On evaluating adversarial robustness. missing: Towards contrastive explanations with pertinent arXiv preprint arXiv:1902.06705, 2019. negatives. NeurIPS, 2018. [Carlini et al., 2019b] Nicholas Carlini, Chang Liu, Úlfar Er- [Dvijotham et al., 2018] Krishnamurthy Dvijotham, Robert lingsson, Jernej Kos, and Dawn Song. The secret sharer: Stanforth, Sven Gowal, Timothy A Mann, and Pushmeet Evaluating and testing unintended memorization in neural Kohli. A dual approach to scalable verification of deep networks. USENIX Security, pages 267–284, 2019. networks. UAI, 1(2):3, 2018. [Carmon et al., 2019] Yair Carmon, Aditi Raghunathan, [Engstrom et al., 2019] Logan Engstrom, Brandon Tran, Ludwig Schmidt, Percy Liang, and John C Duchi. Un- Dimitris Tsipras, Ludwig Schmidt, and Aleksander labeled data improves adversarial robustness. NeurIPS, Madry. Exploring the landscape of spatial robustness. 2019. ICML, pages 1802–1811, 2019. [Chen et al., 2017] Pin-Yu Chen, Huan Zhang, Yash Sharma, [Eykholt et al., 2018] Kevin Eykholt, Ivan Evtimov, Ear- Jinfeng Yi, and Cho-Jui Hsieh. ZOO: Zeroth order opti- lence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, mization based black-box attacks to deep neural networks Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust without training substitute models. ACM Workshop on Ar- physical-world attacks on deep learning visual classifica- tificial Intelligence and Security, pages 15–26, 2017. tion. CVPR, pages 1625–1634, 2018. [Chen et al., 2018a] Hongge Chen, Huan Zhang, Pin-Yu [Fan et al., 2021] Lijie Fan, Sijia Liu, Pin-Yu Chen, Chen, Jinfeng Yi, and Cho-Jui Hsieh. Attacking visual lan- Gaoyuan Zhang, and Chuang Gan. When does contrastive guage grounding with adversarial examples: A case study learning preserve adversarial robustness from pretraining on neural image captioning. ACL, 1:2587–2597, 2018. to finetuning? NeurIPS, 34, 2021. [Chen et al., 2018b] Pin-Yu Chen, Yash Sharma, Huan [Geiping et al., 2021] Jonas Geiping, Liam Fowl, W Ronny Zhang, Jinfeng Yi, and Cho-Jui Hsieh. EAD: elastic-net Huang, Wojciech Czaja, Gavin Taylor, Michael Moeller, attacks to deep neural networks via adversarial examples. and Tom Goldstein. Witches’ brew: Industrial scale data AAAI, pages 10–17, 2018. poisoning via gradient matching. ICLR, 2021. [Chen et al., 2020] Jianbo Chen, Michael I Jordan, and Mar- [Goldblum et al., 2020] Micah Goldblum, Dimitris Tsipras, tin J Wainwright. Hopskipjumpattack: A query-efficient Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn decision-based attack. IEEE S&P, 2020. Song, Aleksander Madry, Bo Li, and Tom Goldstein. Data
security for machine learning: Data poisoning, backdoor [Liu et al., 2017] Yanpei Liu, Xinyun Chen, Chang Liu, and attacks, and defenses. arXiv preprint arXiv:2012.10544, Dawn Song. Delving into transferable adversarial exam- 2020. ples and black-box attacks. ICLR, 2017. [Goodfellow et al., 2015] Ian J Goodfellow, Jonathon [Liu et al., 2018] Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Shlens, and Christian Szegedy. Explaining and harnessing Paishun Ting, Shiyu Chang, and Lisa Amini. Zeroth-order adversarial examples. ICLR, 2015. stochastic variance reduction for nonconvex optimization. [Gowal et al., 2019] Sven Gowal, Krishnamurthy Dj Dvi- NeurIPS, pages 3731–3741, 2018. jotham, Robert Stanforth, Rudy Bunel, Chongli Qin, [Liu et al., 2019] Sijia Liu, Pin-Yu Chen, Xiangyi Chen, and Jonathan Uesato, Relja Arandjelovic, Timothy Mann, and Mingyi Hong. signSGD via zeroth-order oracle. ICLR, Pushmeet Kohli. Scalable verified training for provably ro- 2019. bust image classification. ICCV, pages 4842–4851, 2019. [Liu et al., 2020a] Sijia Liu, Pin-Yu Chen, Bhavya [Gu et al., 2019] T. Gu, K. Liu, B. Dolan-Gavitt, and Kailkhura, Gaoyuan Zhang, Alfred Hero, and Pramod K S. Garg. BadNets: Evaluating backdooring attacks on deep Varshney. A primer on zeroth-order optimization in signal neural networks. IEEE Access, 7:47230–47244, 2019. processing and machine learning. IEEE Signal Processing [Hosseini and Poovendran, 2018] Hossein Hosseini and Magazine, 2020. Radha Poovendran. Semantic adversarial examples. [Liu et al., 2020b] Sijia Liu, Songtao Lu, Xiangyi Chen, Yao CVPR Workshops, pages 1614–1619, 2018. Feng, Kaidi Xu, Abdullah Al-Dujaili, Mingyi Hong, and [Hsu et al., 2022] Chia-Yi Hsu, Pin-Yu Chen, Songtao Lu, Una-May O’Reilly. Min-max optimization without gradi- Sijia Liu, and Chia-Mu Yu. Adversarial examples can ents: Convergence and applications to black-box evasion be effective data augmentation for unsupervised machine and poisoning attacks. ICML, pages 6282–6293, 2020. learning. AAAI, 2022. [Madry et al., 2018] Aleksander Madry, Aleksandar [Ilyas et al., 2018] Andrew Ilyas, Logan Engstrom, Anish Makelov, Ludwig Schmidt, Dimitris Tsipras, and Athalye, and Jessy Lin. Black-box adversarial attacks with Adrian Vladu. Towards deep learning models resistant to limited queries and information. ICML, 2018. adversarial attacks. ICLR, 2018. [Ilyas et al., 2019] Andrew Ilyas, Logan Engstrom, and [Mohapatra et al., 2020a] Jeet Mohapatra, Ching-Yun Ko, Aleksander Madry. Prior convictions: Black-box adver- Tsui-Wei Weng, Pin-Yu Chen, Sijia Liu, and Luca sarial attacks with bandits and priors. ICLR, 2019. Daniel. Higher-order certification for randomized smooth- [Jagielski et al., 2018] Matthew Jagielski, Alina Oprea, Bat- ing. NeurIPS, 2020. tista Biggio, Chang Liu, Cristina Nita-Rotaru, and Bo Li. [Mohapatra et al., 2020b] Jeet Mohapatra, Tsui-Wei Weng, Manipulating machine learning: Poisoning attacks and Pin-Yu Chen, Sijia Liu, and Luca Daniel. Towards ver- countermeasures for regression learning. IEEE S&P, ifying robustness of neural networks against a family of pages 19–35, 2018. semantic perturbations. CVPR, pages 244–252, 2020. [Katz et al., 2017] Guy Katz, Clark Barrett, David L Dill, [Moosavi-Dezfooli et al., 2016] Seyed-Mohsen Moosavi- Kyle Julian, and Mykel J Kochenderfer. Reluplex: An Dezfooli, Alhussein Fawzi, and Pascal Frossard. Deep- efficient smt solver for verifying deep neural networks. fool: a simple and accurate method to fool deep neural International Conference on Computer Aided Verification, networks. CVPR, pages 2574–2582, 2016. pages 97–117, 2017. [Moosavi-Dezfooli et al., 2017] Seyed-Mohsen Moosavi- [Kumar et al., 2020] Ram Shankar Siva Kumar, Magnus Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Nyström, John Lambert, Andrew Marshall, Mario Go- Frossard. Universal adversarial perturbations. CVPR, ertzel, Andi Comissoneru, Matt Swann, and Sharon Xia. pages 86–94, 2017. Adversarial machine learning-industry perspectives. IEEE [Nguyen and Tran, 2020] Anh Nguyen and Anh Tran. Input- S&P Workshops, pages 69–75, 2020. aware dynamic backdoor attack. NeurIPS, 2020. [LeCun et al., 2015] Yann LeCun, Yoshua Bengio, and Ge- [Nitin Bhagoji et al., 2018] Arjun Nitin Bhagoji, Warren He, offrey Hinton. Deep learning. Nature, 2015. Bo Li, and Dawn Song. Practical black-box attacks on [Lecuyer et al., 2019] Mathias Lecuyer, Vaggelis Atlidakis, deep neural networks using efficient query mechanisms. Roxana Geambasu, Daniel Hsu, and Suman Jana. Cer- ECCV, pages 154–169, 2018. tified robustness to adversarial examples with differential [Papernot et al., 2016] Nicolas Papernot, Patrick McDaniel, privacy. IEEE S&P, pages 656–672, 2019. and Ian Goodfellow. Transferability in machine learning: [Lei et al., 2019] Qi Lei, Lingfei Wu, Pin-Yu Chen, Alexan- from phenomena to black-box attacks using adversarial dros G Dimakis, Inderjit S Dhillon, and Michael Witbrock. samples. arXiv preprint arXiv:1605.07277, 2016. Discrete adversarial attacks and submodular optimization [Papernot et al., 2017] Nicolas Papernot, Patrick McDaniel, with applications to text classification. SysML, 2019. Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Anan- [Li et al., 2019] Bai Li, Changyou Chen, Wenlin Wang, and thram Swami. Practical black-box attacks against machine Lawrence Carin. Certified adversarial robustness with ad- learning. ACM Asia Conference on Computer and Com- ditive noise. NeurIPS, 2019. munications Security, pages 506–519, 2017.
[Raghunathan et al., 2018] Aditi Raghunathan, Jacob Stein- [Tramèr et al., 2018] Florian Tramèr, Alexey Kurakin, Nico- hardt, and Percy Liang. Certified a against adversarial ex- las Papernot, Dan Boneh, and Patrick McDaniel. En- amples. ICLR, 2018. semble adversarial training: Attacks and defenses. ICLR, 2018. [Sablayrolles et al., 2020] Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, and Hervé Jégou. Radioactive [Tramer et al., 2020] Florian Tramer, Nicholas Carlini, data: tracing through training. ICML, 2020. Wieland Brendel, and Aleksander Madry. On adaptive attacks to adversarial example defenses. NeurIPS, 2020. [Santurkar et al., 2019] Shibani Santurkar, Dimitris Tsipras, [Tran et al., 2018] Brandon Tran, Jerry Li, and Aleksander Brandon Tran, Andrew Ilyas, Logan Engstrom, and Alek- Madry. Spectral signatures in backdoor attacks. NeurIPS, sander Madry. Image synthesis with a single (robust) clas- pages 8000–8010, 2018. sifier. arXiv preprint arXiv:1906.09453, 2019. [Tsai et al., 2020] Yun-Yun Tsai, Pin-Yu Chen, and Tsung- [Shafahi et al., 2018] Ali Shafahi, W Ronny Huang, Mahyar Yi Ho. Transfer learning without knowing: Reprogram- Najibi, Octavian Suciu, Christoph Studer, Tudor Dumitras, ming black-box machine learning models with scarce data and Tom Goldstein. Poison frogs! targeted clean-label poi- and limited resources. ICML, pages 9614–9624, 2020. soning attacks on neural networks. NeurIPS, pages 6103– 6113, 2018. [Tsai et al., 2021] Yu-Lin Tsai, Chia-Yi Hsu, Chia-Mu Yu, and Pin-Yu Chen. Formalizing generalization and adver- [Shan et al., 2020] Shawn Shan, Emily Wenger, Jiayun sarial robustness of neural networks to weight perturba- Zhang, Huiying Li, Haitao Zheng, and Ben Y Zhao. tions. NeurIPs, 34, 2021. Fawkes: Protecting privacy against unauthorized deep [Tu et al., 2019] Chun-Chen Tu, Paishun Ting, Pin-Yu Chen, learning models. USENIX Security, 2020. Sijia Liu, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, and [Shao et al., 2021] Rulin Shao, Zhouxing Shi, Jinfeng Shin-Ming Cheng. Autozoom: Autoencoder-based zeroth Yi, Pin-Yu Chen, and Cho-Jui Hsieh. Robust text order optimization method for attacking black-box neural captchas using adversarial examples. arXiv preprint networks. AAAI, 33:742–749, 2019. arXiv:2101.02483, 2021. [Wang et al., 2019] Bolun Wang, Yuanshun Yao, Shawn [Sharif et al., 2016] Mahmood Sharif, Sruti Bhagavatula, Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Lujo Bauer, and Michael K Reiter. Accessorize to a crime: Ben Y Zhao. Neural cleanse: Identifying and mitigating Real and stealthy attacks on state-of-the-art face recogni- backdoor attacks in neural networks. IEEE S&P, pages tion. ACM CCS, pages 1528–1540, 2016. 707–723, 2019. [Wang et al., 2020] Ren Wang, Gaoyuan Zhang, Sijia Liu, [Stanforth et al., 2019] Robert Stanforth, Alhussein Fawzi, Pin-Yu Chen, Jinjun Xiong, and Meng Wang. Practical Pushmeet Kohli, et al. Are labels required for improving detection of trojan neural networks: Data-limited and data- adversarial robustness? NeurIPS, 2019. free cases. ECCV, pages 222–238, 2020. [Steinhardt et al., 2017] Jacob Steinhardt, Pang Wei Koh, [Wang et al., 2021a] Jingkang Wang, Tianyun Zhang, Sijia and Percy Liang. Certified defenses for data poisoning Liu, Pin-Yu Chen, Jiacen Xu, Makan Fardad, and Bo Li. attacks. NeurIPS, 2017. Adversarial attack generation empowered by min-max op- [Stutz et al., 2019] David Stutz, Matthias Hein, and Bernt timization. NeurIPS, 34, 2021. Schiele. Disentangling adversarial robustness and gener- [Wang et al., 2021b] Ren Wang, Kaidi Xu, Sijia Liu, Pin-Yu alization. CVPR, pages 6976–6987, 2019. Chen, Tsui-Wei Weng, Chuang Gan, and Meng Wang. On fast adversarial robustness adaptation in model-agnostic [Stutz et al., 2020] David Stutz, Nandhini Chandramoorthy, meta-learning. ICLR, 2021. Matthias Hein, and Bernt Schiele. Bit error robust- ness for energy-efficient dnn accelerators. arXiv preprint [Wang et al., 2021c] Siyue Wang, Xiao Wang, Pin Yu Chen, arXiv:2006.13977, 2020. Pu Zhao, and Xue Lin. Characteristic examples: High- robustness, low-transferability fingerprinting of neural net- [Su et al., 2018] Dong Su, Huan Zhang, Hongge Chen, Jin- works. IJCAI, 2021. feng Yi, Pin-Yu Chen, and Yupeng Gao. Is robustness the [Weber et al., 2020] Maurice Weber, Xiaojun Xu, Bojan cost of accuracy? a comprehensive study on the robustness Karlas, Ce Zhang, and Bo Li. Rab: Provable ro- of 18 deep image classification models. ECCV, 2018. bustness against backdoor attacks. arXiv preprint [Su et al., 2019] Jiawei Su, Danilo Vasconcellos Vargas, and arXiv:2003.08904, 2020. Kouichi Sakurai. One pixel attack for fooling deep neural [Weng et al., 2018a] Tsui-Wei Weng, Huan Zhang, Hongge networks. IEEE Transactions on Evolutionary Computa- Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inder- tion, 23(5):828–841, 2019. jit S Dhillon, and Luca Daniel. Towards fast computa- [Szegedy et al., 2014] Christian Szegedy, Wojciech tion of certified robustness for relu networks. International Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Coference on ICML, 2018. Goodfellow, and Rob Fergus. Intriguing properties of [Weng et al., 2018b] Tsui-Wei Weng, Huan Zhang, Pin-Yu neural networks. ICLR, 2014. Chen, Jinfeng Yi, Dong Su, Yupeng Gao, Cho-Jui Hsieh,
and Luca Daniel. Evaluating the robustness of neural net- [Zhao et al., 2019] Pu Zhao, Sijia Liu, Pin-Yu Chen, Nghia works: An extreme value theory approach. ICLR, 2018. Hoang, Kaidi Xu, Bhavya Kailkhura, and Xue Lin. On the [Weng et al., 2020] Tsui-Wei Weng, Pu Zhao, Sijia Liu, Pin- design of black-box adversarial examples by leveraging Yu Chen, Xue Lin, and Luca Daniel. Towards certificated gradient-free optimization and operator splitting method. model robustness against weight perturbations. AAAI, ICCV, pages 121–130, 2019. pages 6356–6363, 2020. [Zhao et al., 2020a] Pu Zhao, Pin-Yu Chen, Payel Das, [Wong and Kolter, 2018] Eric Wong and Zico Kolter. Prov- Karthikeyan Natesan Ramamurthy, and Xue Lin. Bridg- able defenses against adversarial examples via the convex ing mode connectivity in loss landscapes and adversarial outer adversarial polytope. ICML, 2018. robustness. ICLR, 2020. [Wong et al., 2018] Eric Wong, Frank R Schmidt, Jan Hen- [Zhao et al., 2020b] Pu Zhao, Pin-Yu Chen, Siyue Wang, drik Metzen, and J Zico Kolter. Scaling provable adversar- and Xue Lin. Towards query-efficient black-box adversary ial defenses. NeurIPS, 2018. with zeroth-order natural gradient descent. AAAI, 2020. [Xie et al., 2020] Chulin Xie, Keli Huang, Pin-Yu Chen, and [Zhu et al., 2019] Chen Zhu, W Ronny Huang, Hengduo Bo Li. DBA: Distributed backdoor attacks against feder- Li, Gavin Taylor, Christoph Studer, and Tom Goldstein. ated learning. ICLR, 2020. Transferable clean-label poisoning attacks on deep neural nets. ICML, pages 7614–7623, 2019. [Xu et al., 2019a] Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, and Xue Lin. Topol- ogy attack and defense for graph neural networks: An op- timization perspective. IJCAI, 2019. [Xu et al., 2019b] Kaidi Xu, Sijia Liu, Pu Zhao, Pin-Yu Chen, Huan Zhang, Quanfu Fan, Deniz Erdogmus, Yanzhi Wang, and Xue Lin. Structured adversarial attack: To- wards general implementation and better interpretability. ICLR, 2019. [Xu et al., 2020a] Kaidi Xu, Zhouxing Shi, Huan Zhang, Minlie Huang, Kai-Wei Chang, Bhavya Kailkhura, Xue Lin, and Cho-Jui Hsieh. Automatic perturbation analysis on general computational graphs. NeurIPS, 2020. [Xu et al., 2020b] Kaidi Xu, Gaoyuan Zhang, Sijia Liu, Quanfu Fan, Mengshu Sun, Hongge Chen, Pin-Yu Chen, Yanzhi Wang, and Xue Lin. Adversarial t-shirt! evading person detectors in a physical world. ECCV, 2020. [Yang et al., 2019] Zhuolin Yang, Bo Li, Pin-Yu Chen, and Dawn Song. Characterizing audio adversarial examples using temporal dependency. ICLR, 2019. [Yang et al., 2021] Chao-Han Huck Yang, Yun-Yun Tsai, and Pin-Yu Chen. Voice2series: Reprogramming acous- tic models for time series classification. In ICML, 2021. [Zawad et al., 2021] Syed Zawad, Ahsan Ali, Pin-Yu Chen, Ali Anwar, Yi Zhou, Nathalie Baracaldo, Yuan Tian, and Feng Yan. Curse or redemption? how data heterogeneity affects the robustness of federated learning. AAAI, 2021. [Zhang et al., 2017] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning requires rethinking generalization. ICLR, 2017. [Zhang et al., 2018] Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. Efficient neu- ral network robustness certification with general activation functions. NeurIPS, pages 4944–4953, 2018. [Zhang et al., 2019] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric Xing, Laurent El Ghaoui, and Michael Jordan. Theoretically principled trade-off between robustness and accuracy. ICML, pages 7472–7482, 2019.
You can also read