
Preserving Privacy and Security in Federated Learning

Truc Nguyen and My T. Thai

T. Nguyen and M. T. Thai are with the Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL, 32611. E-mail: truc.nguyen@ufl.edu and mythai@cise.ufl.edu

Abstract—Federated learning is known to be vulnerable to both security and privacy issues. Existing research has focused either on preventing poisoning attacks from users or on concealing the local model updates from the server, but not both. However, integrating these two lines of research remains a crucial challenge since they often conflict with one another with respect to the threat model. In this work, we develop a principled framework that offers both privacy guarantees for users and detection of poisoning attacks from them. With a new threat model that includes both an honest-but-curious server and malicious users, we first propose a secure aggregation protocol using homomorphic encryption for the server to combine local model updates in a private manner. Then, a zero-knowledge proof protocol is leveraged to shift the task of detecting attacks in the local models from the server to the users. The key observation here is that the server no longer needs access to the local models for attack detection. Therefore, our framework enables the central server to identify poisoned model updates without violating the privacy guarantees of secure aggregation.

Index Terms—Federated learning, zero-knowledge proof, homomorphic encryption, model poisoning

1 INTRODUCTION

Federated learning is an engaging framework for large-scale distributed training of deep learning models with thousands to millions of users. In every round, a central server distributes the current global model to a random subset of users. Each of the users trains locally and submits a model update to the server. Then, the server averages the updates into a new global model. Federated learning has inspired many applications in many domains, especially training image classifiers and next-word predictors on users' smartphones [15]. To exploit a wide range of training data while maintaining users' privacy, federated learning by design has no visibility into users' local data and training.

Despite the great potential of federated learning in large-scale distributed training with thousands to millions of users, the current system is still vulnerable to certain privacy and security risks. First, although the training data of each user is not disclosed to the server, the model update is. This causes a privacy threat since it is suggested that a trained neural network's parameters enable a reconstruction and/or inference of the original training data [1], [12]. Second, federated learning is generically vulnerable to model poisoning attacks that leverage the fact that federated learning enables adversarial users to directly influence the global model, thereby allowing considerably more powerful attacks [3]. Recent research on federated learning focuses on either improving the privacy of model updates [2], [7], [30] or preventing certain poisoning attacks from malicious users [3], [28], [29], but not both.

To preserve users' privacy, secure aggregation protocols have been introduced into federated learning to devise a training framework that protects the local models [2], [7], [30]. These protocols enable the server to privately combine the local models in order to update the global model without learning any information about each individual local model. Specifically, the server can compute the sum of the parameters of the local models while not having access to the local models themselves. As a result, the local model updates are concealed from the server, thereby preventing the server from exploiting the updates of any user to infer their private training data.

On the other hand, previous studies also unveil a vulnerability of federated learning when it comes to certain adversarial attacks from malicious users, especially poisoning attacks [3], [28]. This vulnerability leverages the fact that federated learning gives users the freedom to train their local models. Specifically, any user can train in any way that benefits the attack, such as arbitrarily modifying the weights of their local model. To prevent such attacks, a conventional approach is to have the central server run some defense mechanism to inspect each of the model updates, such as using anomaly detection algorithms to filter out the poisoned ones [3].

However, combining these two lines of research on privacy and security is not trivial since they contradict one another. In particular, allowing the server to inspect local model updates from users to filter out the attacked ones violates the security model of secure aggregation. In fact, under a secure aggregation setting, the server should not be able to learn any useful information when observing an individual model update. Therefore, the server cannot run any defense mechanism directly on each model update to deduce whether it is attack-free or not.

To tackle this issue, in this paper, we propose a framework that integrates secure aggregation with defense mechanisms against poisoning attacks from users. To circumvent the aforementioned problem, we shift the task of running the defense mechanism onto the users, and each user is then given the ability to attest the output of the defense mechanism with respect to their model update. Simply speaking, the users are the ones who run the defense mechanism, and they then have to prove to the server that the mechanism was executed correctly.
To achieve this, the framework leverages a zero-knowledge proof (ZKP) protocol that the users can use to prove the correctness of the execution of a defense mechanism. The ZKP protocol must also make it difficult to generate a valid proof if the defense mechanism was not run correctly, and the proof must not reveal any useful information about the model update in order to retain the privacy guarantee of secure aggregation.

To demonstrate the feasibility of the framework, we use the backdoor attack as a use case in which we construct a ZKP protocol for a specific backdoor defense mechanism [19]. Furthermore, we also propose a secure aggregation protocol in federated learning to address the limitations of previous work. Specifically, we show that our proposed aggregation protocol is robust against malicious users in that it can still maintain privacy and liveness despite the fact that some users may not follow the protocol honestly. Our framework can then combine both the ZKP and the secure aggregation protocols to tackle the above-mentioned privacy and security risks in federated learning.

Contributions. Our main contributions are as follows:

• We establish a framework integrating both the secure aggregation protocol and defense mechanisms against poisoning attacks without violating any privacy guarantees.
• We propose a new secure aggregation protocol using homomorphic encryption for federated learning that can tolerate malicious users and a semi-honest server while maintaining both privacy and liveness.
• We construct a ZKP protocol for a backdoor defense to demonstrate the feasibility of our framework.
• Finally, we analyze the computation and communication cost of the framework and provide some benchmarks regarding its performance.

Organization. The rest of the paper is structured as follows. Section 2 provides the system and security models. The main framework for combining both the secure aggregation protocol and defense mechanisms against poisoning attacks is shown in Section 3. Section 4 presents our proposed secure aggregation protocol for federated learning. Section 5 gives the ZKP protocol for a backdoor defense mechanism. We evaluate the complexity and benchmark our solution in Section 6. We discuss some related work in Section 7 and provide some concluding remarks in Section 8.

2 SYSTEM AND SECURITY MODEL

Current systems proposed to address either the poisoning attacks from users or the privacy of model updates are designed using security models that conflict with each other. In fact, when preserving the privacy of model updates, the central server and the users are all considered honest-but-curious (or semi-honest). However, when defending against poisoning attacks from users, the server is considered honest while the users are malicious.

Due to that conflict, this section establishes a more general security model combining the ones proposed in previous work on the security and privacy of federated learning. From the security model, we also define some design goals.

2.1 System model

We define two entities in our protocol:

1) A set of n users U: each user has a local training dataset, computes a model update based on this dataset, and collaborates with the others to securely aggregate the model updates.
2) A single server S: receives the aggregated model updates from the users and computes a global model that can later be downloaded by the users.

We denote the model update of each user u ∈ U as an input vector xu of dimension m. For simplicity, all values in both xu and Σ_{u∈U} xu are assumed to be integers in the range [0, N) for some known N. The protocol is correct if S can learn x̄ = Σ_{u∈U′} xu for some subset of users U′ ⊆ U. We assume a federated learning process that follows the FedAvg framework [21], in which U′ is a random subset of U (the server selects a random subset of users for each training round). Security requires that (1) S only learns the sum of the users' model updates including contributions from a large portion of users, (2) S can verify that x̄ is the sum of clean model updates, and (3) each user u ∈ U learns nothing.

In addition to securely computing x̄, each user u ∈ U computes a zero-knowledge proof πu proving that xu is a clean model, without leaking any information about xu. This proof πu is then sent to and validated by the server S. The server then verifies that {πu}_{u∈U} is consistent with the obtained x̄.
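For concreteness, the integer-range assumption above is typically met by quantizing floating-point model weights into fixed-point integers before encryption. The following is a minimal sketch of such an encoding; the scale, offset, and bound are illustrative choices, not values prescribed by this paper:

```python
# Sketch of fixed-point quantization: the HE schemes used later operate on
# integers in [0, N), so each float weight is scaled and shifted first.
# SCALE, OFFSET, and N are illustrative assumptions.
SCALE, OFFSET, N = 10**6, 8, 2**24

def quantize(w):
    """Map a float weight in [-OFFSET, OFFSET) to an integer in [0, N)."""
    v = int(round((w + OFFSET) * SCALE))
    assert 0 <= v < N
    return v

def dequantize(v, num_users=1):
    # after summing updates from `num_users` users, the offset accumulates
    return v / SCALE - OFFSET * num_users

# summing two quantized updates recovers the sum of the original weights
total = quantize(0.123) + quantize(-0.5)
assert abs(dequantize(total, num_users=2) - (-0.377)) < 1e-6
```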
2.2 Security model

Threat model. We consider a computationally bounded adversary that can corrupt the server or a subset of users in the following manners. The server is considered honest-but-curious in that it behaves honestly according to the training protocol of federated learning but also attempts to learn as much as possible from the data received from the users. The server can also collude with corrupted users in U to learn the inputs of the remaining users. Moreover, some users try to inject unclean model updates to poison the global model, and they can also arbitrarily deviate from the protocol.

As regards the security of the key generation and distribution processes, there has been extensive research in constructing multi-party computation (MPC) protocols where a public key is computed collectively and every user computes private keys [10], [22]. Such a protocol can be used in this architecture to securely generate the necessary keys for the users and the server. Hence, this paper does not focus on securing the key generation and distribution processes but instead assumes that the keys are created securely and that honest users' keys are not leaked to others.

Design goals. Besides the security aspects, our design also aims to address unique challenges in federated learning, including operating on high-dimensional vectors and users dropping out. With respect to the threat model, we summarize the design goals as follows:

1) The server learns nothing except what can be inferred from x̄.
2) Each user does not learn anything about the others' inputs.
3) The server can verify that x̄ is the sum of clean models.
4) The protocol operates efficiently on high-dimensional vectors.
5) The protocol is robust to users dropping out.
6) The protocol maintains undisrupted service (i.e., liveness) in case a small subset of users deviate from it.

3 SECURE FRAMEWORK FOR FEDERATED LEARNING

To address the new security model proposed in Section 2, we propose a framework that integrates secure aggregation with defenses against poisoning attacks in a way that they can maintain their respective security and privacy properties. Our framework is designed to ensure the privacy of users' local models while preventing certain attacks from users. As shown in Fig. 1, our framework consists of three components: (1) user aggregation, (2) zero-knowledge proof of attack-free model, and (3) computing global model. We discuss the properties of each component as follows.

Fig. 1: Secure framework for FL. Each user u ∈ U trains a local model xu that is used as an input to (1) User aggregation and (2) ZKP of attack-free model. The User aggregation component returns c̄ = Epk(Σ_{u∈U′} xu), which is the encryption of the sum over the local models of honest users. The ZKP of attack-free model component returns the proof πu and the commitment C(xu) for each xu. The outputs of these two components are then used by the central server as inputs to the Computing global model component. This component validates {πu}_{u∈U}, obtains the global model x̄ from c̄, checks if C(x̄) is consistent with {C(xu)}_{u∈U}, and then returns x̄.

3.1 User aggregation

This component securely computes the sum of the local models from the users. First, each user u ∈ U trains a local model update xu and uses it as an input to this component. Then, the component outputs c̄ = Epk(Σ_{u∈U′} xu), which is an encryption over the sum of the model updates of all honest users U′ ⊆ U. Epk(·) denotes a viable additively homomorphic encryption scheme, and we require that it have the property of indistinguishability under chosen-plaintext attack (IND-CPA). The encryption function will be discussed in detail in Section 4.

Furthermore, this aggregation must maintain privacy with respect to each local model xu; in particular, we should be able to simulate the output of each user by a probabilistic polynomial-time (PPT) simulator. We specify the security properties of the User aggregation component in Fig. 2.

User aggregation
Inputs:
• Local model xu of each user u ∈ U
• Public key pk
Output: c̄ = Epk(Σ_{u∈U′} xu) where U′ ⊆ U is the set of honest users
Properties:
• The encryption algorithm E(·) is additively homomorphic and IND-CPA secure.
• xu is not revealed to the other users U\{u} nor to the server S. More formally, there exists a PPT simulator that can simulate each u ∈ U such that it can generate a view for the adversary that is computationally indistinguishable from the adversary's view when interacting with honest users.
• Robust to users dropping out.

Fig. 2: User aggregation component

3.2 Zero-knowledge proof of attack-free model

This component generates a zero-knowledge proof proving that the local model is free from a certain poisoning attack. Specifically, we first assume the existence of a function that verifies an attack-free model as follows:

R(x) = C(x) if x is attack-free, and R(x) = ⊥ otherwise    (1)

where x is a local model and C(x) is the commitment of x. Next, the component implements a zero-knowledge proof protocol for the users to attest the correct execution of R(x) without revealing x. A specific implementation of this component for defending against backdoor attacks is shown in Section 5. The properties of this component are shown in Fig. 3.

The prover algorithm P is run by the users, while the verifier algorithm V is run by the server S.
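To make the contract of Eq. (1) concrete, a minimal sketch in Python follows; is_attack_free (the defense check) and commit (an additively homomorphic commitment) are hypothetical helpers standing in for the concrete instantiations given in Sections 4 and 5:

```python
def R(x, commit, is_attack_free):
    """Attack-free verification function from Eq. (1): return a
    commitment C(x) if x passes the defense check, and None
    (standing in for the symbol ⊥) otherwise."""
    return commit(x) if is_attack_free(x) else None
```

A user then proves in zero knowledge that R did not return ⊥ on their update, so the server only ever sees the commitment.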
Zero-knowledge proof of attack-free model
Inputs:
• Local model xu of user u
• Function R(·) verifying an attack-free model
Outputs:
• Zero-knowledge proof πu verifying the correct execution of R(xu)
• Commitment C(xu)
Properties:
• The commitment scheme C(·) is additively homomorphic.
• Denote (P, V) as a pair of prover and verifier algorithms, respectively. If R(xu) = C(xu) then Pr[(P, V)(xu) = accept] = 1.
• If R(xu) = ⊥ then Pr[(P, V)(xu) = accept] ≤ ε, where ε denotes a soundness error.
• There exists a simulator that, given no access to the prover, can produce a transcript that is computationally indistinguishable from the one between an honest prover and an honest verifier.

Fig. 3: Zero-knowledge proof component

3.3 Computing global model

This component takes as input the outputs from the other two components and produces the corresponding global model. Hence, it serves to combine the two previous components, and it is run solely by the server. Specifically, it needs to be able to decrypt c̄ from the first component and to validate the zero-knowledge proofs {πu}_{u∈U′} from the second component. Then, it outputs the sum of the model updates from the honest users. We define this component in Fig. 4.

Computing global model
Inputs:
• Zero-knowledge proof πu for u ∈ U
• Commitment C(xu) for u ∈ U
• c̄ obtained from the User aggregation
• Secret key sk
Outputs: global model x̄ = Σ_{u∈U′} xu where U′ ⊆ U is the set of honest users
Properties:
• Validate πu and only keep valid proofs.
• x̄ = Dsk(c̄).
• C(x̄) = Π_{u∈U′} C(xu), where Π denotes the homomorphic addition of the commitment scheme.

Fig. 4: Computing global model component

From the third component, we can prove that this framework preserves the privacy of users' model updates given that all the components satisfy their properties. Let Adv be a random variable representing the joint views of all parties that are corrupted by the adversary according to the threat model. We show that it is possible to simulate the combined view of any subset of honest-but-curious parties given only the inputs of those parties and the global model x̄. This means that those parties do not learn anything other than their own inputs and x̄.

Theorem 1. There exists a PPT simulator S such that for Adv ⊆ U ∪ S, the output of the simulator S is computationally indistinguishable from the output of Adv.

Proof. Refer to Appendix A.

4 SECURE AGGREGATION IN FEDERATED LEARNING

In this section, we describe a secure aggregation scheme that realizes the User aggregation component of our framework. Although some secure aggregation schemes have been proposed for federated learning, they are not secure against our threat model in Section 2. In particular, the aggregation schemes in [2] and [30] assume no collusion between the server and a subset of users. The work of Bonawitz et al. [7] fails to reconstruct the global model if one malicious user arbitrarily deviates from the secret sharing protocol. A detailed discussion of these related works is provided in Section 7.

The main challenge here is that we want to provide robustness in spite of malicious users, i.e., the protocol should be able to tolerate users that do not behave according to the protocol and still produce correct results based on the honest users. Moreover, we need to minimize the users' communication overhead when applying this scheme.

4.1 Cryptographic primitives

Damgard-Jurik and Paillier cryptosystems [9], [23]. Given pk = (n, g), where n is the public modulus and g ∈ Z*_{n^{s+1}} is the base, the Damgard-Jurik encryption of a message m ∈ Z_{n^s} is Epk(m) = g^m r^{n^s} mod n^{s+1} for some random r ∈ Z*_n and a positive natural number s. The Damgard-Jurik cryptosystem is additively homomorphic such that Epk(m1) · Epk(m2) = Epk(m1 + m2). This cryptosystem is semantically secure (i.e., IND-CPA) based on the assumption that computing n-th residue classes is computationally difficult. Paillier's scheme is the special case of Damgard-Jurik with s = 1.

In the threshold variant of these cryptosystems [9], several shares of the secret key are distributed to some decryption parties, such that any subset of at least t of them can perform decryption efficiently, while fewer than t cannot decrypt the ciphertext. Given a ciphertext, each party can obtain a partial decryption. When t partial decryptions are available, the original plaintext can be obtained via a Share combining algorithm.

Pedersen commitment [24]. A general commitment scheme includes an algorithm C that takes as inputs a secret message m and a random r, and outputs the commitment commit to the message m, i.e., commit = C(m, r). A commitment scheme has two important properties: (1) commit does not reveal anything about m (i.e., hiding), and (2) it is infeasible to find (m′, r′) ≠ (m, r) such that C(m′, r′) = C(m, r) (i.e., binding). Pedersen commitment [24] is a realization of such a scheme, and it is also additively homomorphic such that C(m, r) · C(m′, r′) = C(m + m′, r + r′).
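A toy sketch of the Pedersen commitment and its additive homomorphism is given below; the group parameters are tiny illustrative values, whereas a real deployment would use a group of cryptographic size:

```python
import secrets

# Toy Pedersen commitment over the quadratic-residue subgroup of Z_p*.
# p = 2q + 1 is a safe prime; g and h are squares mod p, hence elements
# of the order-q subgroup. Parameters are far too small to be secure.
p, q = 10007, 5003
g, h = 4, 9

def commit(m, r):
    # C(m, r) = g^m * h^r mod p
    return pow(g, m % q, p) * pow(h, r % q, p) % p

r1, r2 = secrets.randbelow(q), secrets.randbelow(q)
c1, c2 = commit(5, r1), commit(7, r2)

# additive homomorphism: C(m1, r1) * C(m2, r2) = C(m1 + m2, r1 + r2)
assert c1 * c2 % p == commit(5 + 7, r1 + r2)
```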
4.2 Secure aggregation protocol

Our secure aggregation scheme relies on additively homomorphic encryption, that is, Epk(m1 + m2) = Epk(m1) · Epk(m2). In this work, we use the Damgard-Jurik public-key cryptosystem [9], which is both IND-CPA secure and additively homomorphic. The cryptosystem supports threshold encryption and operates on the set Z*_n.

We also assume that the communication channel between each pair of users is secured with end-to-end encryption, such that an external observer cannot eavesdrop on the data being transmitted between the users. The secure communication channel between any pair of users can be mediated by the server.

In the one-time setup phase, the secret shares {su}_{u∈U} of the private key sk are distributed among the n users in U via the threshold encryption scheme. The scheme requires a subset Ut ⊂ U of at least t users to perform decryption with sk. This process of key distribution can be done via a trusted dealer or via an MPC protocol. After that, each secure aggregation process is divided into two phases: (1) Encryption and (2) Decryption.

Encryption phase: To securely compute the sum x̄ = Σ_{u∈U} xu, each user u ∈ U first computes the encryption cu = Epk(xu) and then sends cu to the server. The server then computes the product of all {cu}_{u∈U}, denoted as c̄:

c̄ = Π_{u∈U} cu mod n^{s+1} = Π_{u∈U} Epk(xu) mod n^{s+1} = Epk(Σ_{u∈U} xu)    (2)

Additionally, per IND-CPA, the server knowing x̄ (or any of the non-leaf nodes in the aggregation tree) learns nothing about each individual xu for u ∈ U. If some subset of users drop out, the rest can still submit their cu and the server can compute c̄ for the remaining users.

Decryption phase: Suppose that up to this phase fewer than |U| − t users have dropped out; then we still have a subset Ut of at least t users remaining to perform decryption.

First, the server sends c̄ to all users in Ut. Each user u ∈ Ut computes c′u = c̄^{2su∆} where ∆ = t!, and then sends c′u back to the server. The server then applies the Share combining algorithm in [9] to compute x̄ = Σ_{u∈U} xu from {c′u}_{u∈Ut}. As a result, the server can securely obtain x̄ without knowing each individual xu.

User drop-outs. By design, this protocol tolerates up to |U| − t users dropping out while the server is still able to construct the global model of the remaining users. In other words, as long as there are at least t users remaining, drop-outs do not affect the outcome of the protocol. Therefore, our protocol is robust to user drop-outs.

Security against an active adversary. We analyze the security in the active adversary model. First, the protocol is secure in spite of a collusion between the server and a subset of users that aims to let the server decrypt some cu to obtain the corresponding xu. By using the threshold encryption, as long as there is at least one honest user in Ut, the server will not be able to learn the xu of any user. Hence, the protocol can tolerate up to t − 1 malicious users, with t < |U|.

Second, in the encryption phase, if a user deviates from the protocol, e.g., by computing cu as a random string or using a key other than pk, the server will not be able to decrypt c̄. To tolerate this behavior, the server can iteratively drop each cu from c̄ until it obtains a decryptable value.

In the decryption phase, if a user u* does not cooperate to provide c′u*, then the server will not have all the necessary information to perform decryption. However, if we still have at least t honest users (not counting u*), we can eliminate u* from Ut and add another user to Ut; hence, the server can still perform decryption.

In summary, our secure aggregation protocol can tolerate t − 1 malicious users and is robust against |U| − t users dropping out (where t < |U|). Therefore, t should be determined by calibrating the trade-off between usability (small t) and security (large t).

4.3 Efficient encryption of xu

We discuss two methods for reducing the ciphertext size when encrypting xu. Since xu is a vector of dimension m, naively encrypting every element of xu using either the Paillier or Damgard-Jurik cryptosystem will result in a huge ciphertext with an expansion factor of 2 log n (i.e., the ratio between the size of cu and that of xu). Below, we present (1) a polynomial packing method and (2) a hybrid encryption method that includes a key encapsulation mechanism (KEM) and a data encapsulation mechanism (DEM).

Polynomial packing. We can reduce the expansion factor by packing xu into polynomials as follows. Suppose xu ≡ (xu^(1), xu^(2), ..., xu^(m)); we transform this into:

x′u = xu^(1) + xu^(2) · 2^b + xu^(3) · 2^{2b} + ... + xu^(m) · 2^{(m−1)b}    (3)

where b = ⌈log N⌉. Hence, x′u becomes an mb-bit number. Then, to encrypt xu, we simply run the Damgard-Jurik encryption on x′u. Specifically, we obtain the ciphertext cu = E(x′u). This cu will have (⌈mb/log n⌉ + 1) log n bits. The expansion factor can be calculated as:

lim_{m→∞} (⌈mb/log n⌉ + 1) log n / (mb) = 1    (4)

Therefore, as m increases, i.e., for larger models, the expansion factor approaches 1, thereby implying minimal overhead. Note that since we are packing into polynomials, the additive homomorphic property of the ciphertexts is retained.
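A short sketch of the packing in Eq. (3), in Python; the bound N below is illustrative, and we assume the per-slot sums never exceed b bits so that integer addition of packed values matches element-wise vector addition:

```python
import math

N = 2**16                      # illustrative bound on (summed) entries
b = math.ceil(math.log2(N))    # bits per slot, b = ceil(log N)

def pack(x):
    # x' = x^(1) + x^(2)*2^b + ... + x^(m)*2^((m-1)b), as in Eq. (3)
    return sum(v << (i * b) for i, v in enumerate(x))

def unpack(packed, m):
    mask = (1 << b) - 1
    return [(packed >> (i * b)) & mask for i in range(m)]

x1, x2 = [1, 2, 3], [10, 20, 30]
# adding packed integers adds the vectors slot-wise (no carry overflows
# b bits here), which is why Damgard-Jurik's additive homomorphism on the
# packed plaintext yields the element-wise sum of the model updates
assert unpack(pack(x1) + pack(x2), 3) == [11, 22, 33]
```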
Additionally, if we use this polynomial packing scheme with the Paillier encryption, it would result in an expansion factor of 2, as Paillier requires the plaintext size to be smaller than the key size. To keep an expansion factor of 1 when using Paillier, we introduce a KEM-DEM scheme as follows.

Hybrid encryption (KEM-DEM technique). In [8], the authors propose a KEM-DEM approach to compress the ciphertext of Fully Homomorphic Encryption (FHE). With FHE, they can use block ciphers to generate the keystreams. However, since our work only deals with Partially Homomorphic Encryption (PHE) like Paillier, we cannot use block ciphers; hence, we need a different approach to generate the keystream with only PHE.

The general idea of our protocol is as follows. Suppose a user wants to send Epk(x) to the server (for simplicity, we use the notation x instead of xu). Instead of sending c = Epk(x) directly, the user chooses a secret random symmetric key k that has the same size as an element in x and computes c′ = (Epk(k), SEk(x)), where SEk(x) = x − k, which means subtracting k from every element in x. c′ is then sent to the server. When the server receives c′, it can obtain c = Epk(x) by computing Epk(k) × Epk(SEk(x)) = Epk(x) = c.

The correctness of this scheme holds because the server can obtain Epk(x) from the user. The security also holds because the server does not know the key k chosen by the user; hence, it cannot know x. However, this scheme is not semantically secure, i.e., IND-CPA secure, because by defining SEk(x) = x − k, identical elements in x will result in the same ciphertext.

To resolve this, we use an IV (initialization vector) for the encryption function. The IV is a nonce, which could be a random number or a counter that is increased each time it is used. Denote x = (x1, ..., xm). On the user's side:

1) Choose a secret l-bit random key k and a nonce IV.
2) Compute the keystream ks: for i ∈ [1, m], compute ksi = k × (IV‖i) mod p, where p is an l-bit number.
3) Compute SEk(x) = x − ks = (x1 − ks1, ..., xm − ksm).
4) Send (IV, Epk(k), SEk(x)) to the server.

On the server's side, upon receiving (IV, Epk(k), SEk(x)):

1) Compute the encrypted keystream Epk(ks): for i ∈ [1, m], compute Epk(k)^{IV‖i} mod n² = Epk(k × (IV‖i)) = Epk(ksi).
2) Compute Epk(x) = Epk(SEk(x)) · Epk(ks) = Epk(x1 − ks1 + ks1, ..., xm − ksm + ksm) = Epk(x1, ..., xm).

Lemma 1. By picking k uniformly at random, and p as an l-bit prime number, the DEM component (i.e., SEk(x)) is IND-CPA secure.

Proof. To show that the DEM component is IND-CPA secure, it is sufficient to prove that k × (IV‖i) mod p is uniformly random with respect to k (Theorem 3.32 in [17]). In fact, in Z*p, by picking k uniformly at random, k × (IV‖i) mod p is uniformly distributed if and only if neither k nor (IV‖i) shares prime factors with p. Hence, p should be an l-bit prime number, and k can be chosen randomly from Z*p.

Since the KEM component (i.e., Epk(k)) is IND-CPA secure by default using Paillier, and the DEM component is also IND-CPA secure as shown in Lemma 1, we can derive the following theorem:

Theorem 2. The KEM-DEM scheme is IND-CPA secure.

Figure 5 illustrates this scheme.

Fig. 5: KEM-DEM for PHE

The bandwidth expansion factor of this scheme can be computed as:

lim_{m→∞} (mb + 2 log n + c) / (mb) = 1    (5)

where c is the size of the IV. Therefore, this scheme also achieves an expansion factor close to 1 as m grows; however, it incurs some computational overhead compared to the polynomial packing scheme.

Although both the KEM-DEM scheme and polynomial packing have the same asymptotic expansion factor (i.e., 1), each has its own advantages and disadvantages. Polynomial packing is simple and works out of the box if we use the Damgard-Jurik cryptosystem. This is because Damgard-Jurik allows the plaintext size to be greater than the key size. However, it will not work with conventional cryptosystems like Paillier, as they require the plaintext size to be smaller than the key size. The KEM-DEM scheme would work for all cryptosystems, but it is more complex and incurs more computational overhead than the polynomial packing scheme. Since our protocol uses the Damgard-Jurik cryptosystem, Section 6 only evaluates the polynomial packing scheme.
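The keystream construction can be sketched with an off-the-shelf Paillier implementation; below we use the python-paillier ("phe") package as an assumed stand-in for the PHE, and for brevity we compute the keystream over the integers rather than modulo the l-bit prime p used in the security argument:

```python
import secrets
from phe import paillier

pub, priv = paillier.generate_paillier_keypair(n_length=2048)

k = secrets.randbits(64)      # symmetric key (KEM payload)
iv = secrets.randbits(32)     # nonce
x = [13, 37, 42]              # a (quantized) model update

def ks(i):
    # keystream element k * (IV || i); "||" modeled as bit concatenation
    return k * ((iv << 16) | i)

# User: DEM = x - ks is sent in the clear, along with KEM = E_pk(k)
se = [xi - ks(i) for i, xi in enumerate(x)]
enc_k = pub.encrypt(k)

# Server: E_pk(x_i) = E_pk(se_i) * E_pk(k)^(IV||i); with phe, "+" is the
# homomorphic addition and "*" the plaintext-scalar exponentiation
enc_x = [pub.encrypt(s) + enc_k * ((iv << 16) | i) for i, s in enumerate(se)]
assert [priv.decrypt(c) for c in enc_x] == x
```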
5 USE CASE: ZKP FOR BACKDOOR DEFENSES

This section describes an implementation that realizes the Zero-knowledge proof component of our framework. Using the backdoor attack as a use case, we provide a ZKP protocol that can detect backdoor attacks from users. Before that, we establish some background regarding multi-party computation (MPC) and ZKP.

5.1 Preliminaries

Multi-party computation (MPC) protocol. In a generic MPC protocol, we have n players P1, ..., Pn; each player Pi holds a secret value xi and wants to compute y = f(x) with x = (x1, ..., xn) while keeping their input private. The players jointly run an n-party MPC protocol Πf. The view of player Pi, denoted by View_{Pi}(x), is defined as the concatenation of the private input xi and all the messages received by Pi during the execution of Πf. Two views are consistent if the messages reported in View_{Pj}(x) as incoming from Pi are consistent with the outgoing messages implied by View_{Pi}(x) (i ≠ j). Furthermore, the protocol Πf is defined as perfectly correct if there are n functions Πf,1, ..., Πf,n such that y = Πf,i(View_{Pi}(x)) for all i ∈ [n]. An MPC protocol is designed to be t-private, where 1 ≤ t < n, if the views of any t-subset of the parties do not reveal anything more about the private inputs of any other party.

Zero-knowledge proof (ZKP). Suppose that we have a public function f, a secret input x, and a public output y. By using a ZKP protocol, a prover wants to prove that it knows an x such that f(x) = y, without revealing what x is. A ZKP protocol must have the following three properties: (1) if the statement f(x) = y is correct, the probability that an honest verifier accepts the proof from an honest prover is 1 (completeness); (2) if the statement is incorrect, an honest verifier accepts the proof from a dishonest prover with probability at most some small soundness error (soundness); and (3) during the execution of the ZKP protocol, the verifier learns nothing other than the fact that the statement is correct (zero-knowledge).

Backdoor attacks and defenses. Backdoor attacks aim to make the target model produce a specific attacker-chosen output label whenever it encounters a known trigger in the input. At the same time, the backdoored model behaves similarly to a clean model on normal inputs. In the context of federated learning, any user can replace the global model with another so that (1) the new model maintains a similar accuracy on the federated-learning task, and (2) the attacker has control over how the model behaves on a backdoor task specified by the attacker [3].

To defend against backdoor attacks, Liu et al. [19] propose a pruning defense that can disable backdoors by removing backdoor neurons that are dormant for normal inputs. The defense mechanism works in the following manner: in each iteration, the defender executes the backdoored DNN on samples from the testing dataset and stores the activation of each neuron per test sample. The defender then computes the neurons' average activations over all samples. Next, the defender prunes the neuron having the lowest average activation and obtains the accuracy of the resulting DNN on the testing dataset. The defense process terminates when the accuracy drops below a predetermined threshold.

5.2 ZKP protocol for backdoor detection

To inspect a local model for backdoors while keeping the model private, we propose a non-interactive ZKP such that each user u ∈ U can prove that xu is clean without leaking any information about xu.

First, we devise an algorithm backdoor that checks for backdoors in a given DNN model x, as in Algorithm 1.

Algorithm 1: backdoor(x) – Check for backdoors in a DNN model
Input: DNN model x, validation dataset
Output: 1 if x is backdoored, 0 otherwise
1 Test x against the validation dataset, and record the average activation of each neuron;
2 k ← the neuron that has the minimum average activation;
3 Prune k from x;
4 Test x against the validation dataset;
5 if accuracy drops by a threshold τ then
6     return 1
7 else
8     return 0

Theorem 3. backdoor(x) returns 1 ⇒ x is a backdoored model.

Proof. Refer to Appendix B.

Given a model x, we define R(r, x) for some random r as follows:

R(r, x) = C(r, x) if backdoor(x) returns 0, and R(r, x) = ⊥ otherwise    (6)

where C(r, x) is the commitment of x.

Based on [16], we construct the ZKP protocol as follows. Each user u ∈ U generates a zero-knowledge proof πu proving that xu is clean, i.e., R(ru, xu) = C(ru, xu). To produce the proof, each user u runs Algorithm 2 to obtain πu. Note that Πf is a 3-party 2-private MPC protocol, which means that any subset of 2 views does not reveal anything about the input xu. Then, πu and commitu = C(ru, xu) are sent to the server, where they are validated as in Algorithm 3.

Finally, for the server to validate that the x̄ obtained from secure aggregation is the sum of the models that were validated by the ZKP protocol, it collects {ru}_{u∈U} and verifies that Π_{u∈U} commitu = C(Σ_{u∈U} ru, x̄). This is derived from the additive homomorphic property of the Pedersen commitment scheme, where

Π_{u∈U} commitu = Π_{u∈U} C(ru, xu) = C(Σ_{u∈U} ru, Σ_{u∈U} xu) = C(Σ_{u∈U} ru, x̄)    (7)

Algorithm 2: Generate zero-knowledge proof
Input: xu, λ ∈ N, a perfectly correct and 2-private MPC protocol Πf among 3 players
Output: Proof πu
1 πu ← ∅;
2 for k = 1, ..., λ do
3     Sample 3 random vectors x1, x2, x3 such that xu = x1 + x2 + x3;
4     Consider the 3-input function f(x1, x2, x3) = R(r, x1 + x2 + x3) and emulate the protocol Πf on inputs x1, x2, x3. Obtain the views vi = View_{Pi}(x) for all i ∈ [3];
5     Compute the commitments commit1, ..., commit3 to each of the 3 produced views v1, v2, v3;
6     For j ∈ {1, 2}, compute ej = H(j, {commiti}_{i∈[3]});
7     πu ← πu ∪ ({ej}_{j∈{1,2}}, {vej}_{j∈{1,2}}, {commiti}_{i∈[3]})
8 return πu

Theorem 4. The proposed zero-knowledge proof protocol satisfies the completeness, zero-knowledge, and soundness properties with soundness error (2/3)^λ, where λ is a security parameter.

Proof. Refer to Appendix C.
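For intuition, one repetition of Algorithm 2 can be sketched as follows under heavy simplifying assumptions: the model is a single integer, the commitment is a salted hash rather than a Pedersen commitment, and the emulated MPC is reduced to the 3-way additive sharing itself:

```python
import hashlib
import secrets

M = 2**61 - 1  # illustrative modulus for the additive sharing

def share3(x):
    # x = x1 + x2 + x3 (mod M); any two shares are uniform, i.e. 2-private
    x1, x2 = secrets.randbelow(M), secrets.randbelow(M)
    return [x1, x2, (x - x1 - x2) % M]

def commit_view(view):
    salt = secrets.token_bytes(16)
    return salt, hashlib.sha256(salt + view.to_bytes(8, "big")).hexdigest()

x_u = 42
views = share3(x_u)                        # stand-ins for View_Pi(x)
commits = [commit_view(v) for v in views]

# Fiat-Shamir-style challenge: hash the commitments to select which two of
# the three views are opened; a single run therefore reveals only a
# 2-private subset, and cheating survives with probability about 2/3
digest = hashlib.sha256(b"".join(c.encode() for _, c in commits)).digest()
e = digest[0] % 3
opened = {e: views[e], (e + 1) % 3: views[(e + 1) % 3]}
assert sum(views) % M == x_u               # perfect correctness of sharing
```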
 Algorithm 3: Validate zero-knowledge proof
   Input: πu , commitu
   Output: 1 if accept, 0 if reject
 1 for pk ∈ πu do
                                                      
 2      {ej }j∈{1,2} , {vej }j∈{1,2} , {commiti }i∈[3] ← pk ;
 3     if {commiti }i∈[3] is invalid then
 4         return 0
 5     if ej 6= H(j, {commiti }i∈[3] ) for j ∈ {1, 2} then
 6         return 0
 7     if ∃e ∈ {e1 , e2 } : Πf,e (V iewPe (x)) 6= commitu         Fig. 6: Wall-clock running time per user, as the size of the
        then                                                      data vector increases.
 8         return 0
 9     if ve1 and ve2 are not consistent with each other then
10         return 0
11 return 1

6     E VALUATION
In this section, we analyze the computation and commu-
nication cost of the proposed protocol on both the users
and the server. Then we implement a proof-of-concept and          (a) Total data expansion factor per     (b) Total data transfer per user.
                                                                  user, as compared to sending the        The amount of transmitted data
benchmark the performance.
                                                                  raw data vector to the server. Dif-     at 1024-bit, 2048-bit, and 3072-bit
                                                                  ferent lines represent different val-   keys are nearly identical to 4096-
6.1   Performance analysis                                        ues of k.                               bit key

Users. For the communication cost, w.r.t. the secure aggregation, a user has to send out the encrypted model, which costs O(m). Regarding the zero-knowledge proof, the user needs to submit a proof of size O(mkλ), where k is the size of the commitments.

As regards the computation cost, the computation includes packing the data vector into polynomials and performing encryption, which takes O(m + s) in total for the secure aggregation. The zero-knowledge proof protocol requires λ runs of O(L) model inferences, where L is the size of the validation dataset; thus the total cost is O(λL).

Server. In the secure aggregation scheme, since the server receives data sent from all users, the communication complexity is O(m|U|). As regards the computation cost, for the secure aggregation, the computation includes aggregating the ciphertexts from each user and performing decryption on the aggregated model, which takes O(m|U|²) in total. Validating the zero-knowledge proofs from all users requires another O(|U|kλ) computations.
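To make these asymptotics concrete, here is an illustrative back-of-envelope calculation under the 3-bytes-per-element assumption introduced in Section 6.2; the values of m and |U| are example numbers, not measurements.

m, num_users = 500_000, 100                  # model dimension, |U| (examples)
bytes_per_elem = 3                           # per-element size assumed in Sec. 6.2
user_upload = m * bytes_per_elem             # O(m) per user: 1.5 MB
server_ingress = user_upload * num_users     # O(m|U|) at the server: 150 MB
print(f"user ≈ {user_upload / 1e6:.1f} MB, server ≈ {server_ingress / 1e6:.1f} MB")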
                                                                  total data increases linearly with the data vector size, which
                                                                  follows the communication complexity O(m). Furthermore,
6.2   Proof-of-concept and benchmarking                           as shown in Fig. 7a, the amount of transmitted data is very
Secure aggregation. To measure the performance, we im-            close to the raw data size under all key sizes, therefore,
plement a proof-of-concept in C++ and Python 3. The               the total data transfer is approximately the vector size
Damgard-Jurik cryptosystem is used with 1024-, 2048-, 3072-       multiplied by 3 bytes.
, and 4096-bit keys. To handle big numbers efficiently, we            On the server’s side, Fig. 8a shows the running time on
use the GNU Multiple Precision PArithmetic Library [14]. We       the server as we increase the number of users. As can be
assume that each element in u∈U xu can be stored with             seen, for m = 100000, it only takes more than 15 minutes
3 bytes without overflow. This assumption conforms with           to run, and for m = 500000, the server can run within
the experiments in [7]. All the experiments are conducted         90 minutes, which is a good running time for machine
on a single-threaded machine equipped with an Intel Core          learning in distributed system. This result conforms the
i7-8550U CPU and 16GB of RAM running Ubuntu 20.04.                running time complexity of O(m|U|2 ). Furthermore, Fig. 8b
    Wall-clock running times for users is plotted in Fig. 6.      gives the amount of data received by the server. Since we
We measure the running time with key sizes of 1024, 2048,         already optimized the size of sent data on the user-side,
and 4096, respectively. As can be seen, under all settings, the   the bandwidth consumption of the server is also optimized,
result conforms with the running time complexity analysis.        which follows the communication complexity O(m|U|).
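Although our PoC implements Damgård-Jurik, the additive aggregation it relies on can be illustrated with the closely related Paillier scheme via the third-party phe package. This sketch uses a single key pair and skips the quantization and polynomial packing, so it shows only the homomorphic-sum step, not the full protocol.

from phe import paillier   # pip install phe

pub, priv = paillier.generate_paillier_keypair(n_length=2048)

# Three users' local updates, already quantized to small non-negative integers.
updates = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
encrypted = [[pub.encrypt(v) for v in u] for u in updates]

# The server sums ciphertexts coordinate-wise without decrypting any of them.
aggregated = [sum(col[1:], col[0]) for col in zip(*encrypted)]
print([priv.decrypt(c) for c in aggregated])   # -> [12, 15, 18]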

Fig. 8: Running time and bandwidth consumption of the server; the key size is fixed to 4096 bits. (a) Wall-clock running time of the server at different data vector sizes, as the number of users increases. (b) Total bandwidth consumption of the server at various numbers of users, as the data vector size increases.

Fig. 9: Wall-clock running time of the zero-knowledge proof protocol on both users and server, as λ increases. (a) User generating a ZKP. (b) Server validating the ZKPs of 100 users.

Zero-knowledge proof. For the implementation of the zero-knowledge backdoor detection, we train a neural network to use as an input. We use the MNIST dataset and train a 3-layer DNN model in the TensorFlow framework, using ReLU activations at the hidden layers and softmax at the output layer. The training batch size is 128. Then, we implement an MPC protocol that is run between 3 emulated parties based on [26]. The model's accuracy is 93.4% on the test set.
on [26]. The model’s accuracy is 93.4% on the test set.                   learning, especially for attesting non-poisoned models while
    To benchmark the performance of the ZKP protocol,                     maintaining the privacy guarantees.
we measure the proving and validation time against the                        Another line of research in ZKP focuses on zkSNARK
soundness error. As shown before, the soundness error is                  implementations [5] which have been used in practical ap-
2λ                                                                        plications like ZCash [25]. However, these systems depend
3   where λ is the security parameter. Fig. 9a illustrates
the proving time to generate a ZKP by each user. We                       on cryptographic assumptions that are not standard, and
can see that as λ increases, the soundness error decreases,               have large overhead in terms of memory consumption and
nonetheless, the proving time increases linearly. This is a               computation cost, thereby limiting the statement sizes that
trade-off, since the protocol runs slower if we need to obtain            they can manage. Therefore, it remains a critical challenge
a better soundness error. It only takes more than 5 minutes               whether zkSNARK could be used in machine learning
to achieve a soundness error of 0.06, and roughly 7 minutes               where the circuit size would be sufficiently large.
to achieve 0.03.                                                          Defense mechanisms against poisoning attacks. There
    Fig. 9b shows the validation time of the ZKP protocol                 have been multiple research studies on defending against
when the server validates the proofs of 100 users. Similar to             poisoning attacks. Liu et al. propose to remove backdoors
the proving time, there is a trade-off between the running                by pruning redundant neurons [19]. On the other hand,
time and the soundness error. For soundness error of 0.03,                Bagdasaryan et al. [3] devise several anomaly detection
we need about 3.5 minutes to validate 100 users, which is                 methods to filter out attacks. More recently proposed de-
suitable for most FL use cases [6].                                       fense mechanisms [20], [27] detect backdoors by finding
                                                                          differences between normal and infected label(s).
                                                                              Our work leverages such a defense mechanism to con-
7 RELATED WORK
Secure aggregation in FL. Leveraging secret sharing and random masking, Bonawitz et al. [7] propose a secure aggregation protocol and utilize it to aggregate local models from users. However, the protocol relies on users honestly following the secret sharing scheme. Consequently, although the privacy guarantee for honest users is still retained, a single malicious user can arbitrarily deviate from the secret sharing protocol and make the server fail to reconstruct the global model. Furthermore, there is no mechanism to identify the attacker if such an attack occurs.

In [2] and [30], the decryption key is distributed to the users, and the server uses homomorphic encryption to blindly aggregate the model updates. However, the authors assume that there is no collusion between the server and users, so that the server cannot learn the decryption key. Hence, these systems do not work under our security model, where users can be malicious and collude with the server.

Additionally, there have been studies on how to use generic secure MPC based on secret sharing to securely compute any function among multiple parties [4], [11], [18]. However, with these protocols, each party has to send a secret share of its whole data vector to a subset of the other parties. To make the protocols robust, this subset of users should be considerably large. Since each secret share has the same size as the entire data vector, these approaches are not practical in federated learning, where we need to deal with high-dimensional vectors.

Zero-knowledge proof for detecting poisoned models. Although there have been many studies on developing ZKP protocols based on MPC, they have not been widely used in the machine learning context. The IKOS protocol proposed by Ishai et al. [16] is the first work that leverages secure multi-party computation protocols to devise a zero-knowledge argument. Giacomelli et al. [13] later refine this approach and construct ZKBoo, a zero-knowledge argument system for Boolean circuits using collision-resistant hashes that does not require a trusted setup. Our proposed ZKP protocol is inspired by the IKOS protocol, with repetitions to reduce the soundness error to (2/3)^λ. We also demonstrate how such a protocol can be used in the context of machine learning, especially for attesting non-poisoned models while maintaining the privacy guarantees.

Another line of research in ZKP focuses on zkSNARK implementations [5], which have been used in practical applications like Zcash [25]. However, these systems depend on non-standard cryptographic assumptions and have a large overhead in terms of memory consumption and computation cost, thereby limiting the statement sizes they can manage. Therefore, it remains a critical challenge whether zkSNARKs could be used in machine learning, where the circuit size would be sufficiently large.

Defense mechanisms against poisoning attacks. There have been multiple studies on defending against poisoning attacks. Liu et al. propose to remove backdoors by pruning redundant neurons [19]. On the other hand, Bagdasaryan et al. [3] devise several anomaly detection methods to filter out attacks. More recently proposed defense mechanisms [20], [27] detect backdoors by finding differences between normal and infected labels.

Our work leverages such a defense mechanism to construct a ZKP protocol for FL in which the users can run the defense locally on their model updates and attest its output to the server without revealing any information about the local models.

Different defense mechanisms may have different constructions of the ZKP protocols; nonetheless, they must all abide by the properties specified in our framework (Section 3). As shown in Section 5, we construct a ZKP protocol based on the defense proposed by Liu et al. [19].
8 CONCLUSION
In this paper, we have proposed a secure framework for federated learning. Unlike existing research, we integrate both secure aggregation and defense mechanisms against poisoning attacks under the same threat model while maintaining their respective security and privacy guarantees. We have also proposed a secure aggregation protocol that can maintain liveness and privacy for model updates against malicious users. Furthermore, we have designed a ZKP protocol for users to attest non-backdoored models without revealing any information about their models. Our framework combines these two protocols and shows that the server can detect backdoors while preserving the privacy of the model updates. The privacy guarantees for the users' models have been theoretically proven. Moreover, we have presented an analysis of the computation and communication costs and provided benchmarks of the runtime and bandwidth consumption.
REFERENCES

[1] Martin Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 308–318, 2016.
[2] Yoshinori Aono, Takuya Hayashi, Lihua Wang, Shiho Moriai, et al. Privacy-preserving deep learning via additively homomorphic encryption. IEEE Transactions on Information Forensics and Security, 13(5):1333–1345, 2017.
[3] Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov. How to backdoor federated learning. In International Conference on Artificial Intelligence and Statistics, pages 2938–2948. PMLR, 2020.
[4] Michael Ben-Or, Shafi Goldwasser, and Avi Wigderson. Completeness theorems for non-cryptographic fault-tolerant distributed computation. In Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali, pages 351–371. 2019.
[5] Nir Bitansky, Alessandro Chiesa, Yuval Ishai, Omer Paneth, and Rafail Ostrovsky. Succinct non-interactive arguments via linear interactive proofs. In Theory of Cryptography Conference, pages 315–333. Springer, 2013.
[6] Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Konečný, Stefano Mazzocchi, H. Brendan McMahan, et al. Towards federated learning at scale: System design. arXiv preprint arXiv:1902.01046, 2019.
[7] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 1175–1191, 2017.
[8] Anne Canteaut, Sergiu Carpov, Caroline Fontaine, Tancrède Lepoint, María Naya-Plasencia, Pascal Paillier, and Renaud Sirdey. Stream ciphers: A practical solution for efficient homomorphic-ciphertext compression. Journal of Cryptology, 31(3):885–916, 2018.
[9] Ivan Damgård, Mads Jurik, and Jesper Buus Nielsen. A generalization of Paillier's public-key system with applications to electronic voting. International Journal of Information Security, 9(6):371–385, 2010.
[10] Ivan Damgård and Maciej Koprowski. Practical threshold RSA signatures without a trusted dealer. In International Conference on the Theory and Applications of Cryptographic Techniques, pages 152–165. Springer, 2001.
[11] Ivan Damgård, Valerio Pastro, Nigel Smart, and Sarah Zakarias. Multiparty computation from somewhat homomorphic encryption. In Annual Cryptology Conference, pages 643–662. Springer, 2012.
[12] Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 1322–1333, 2015.
[13] Irene Giacomelli, Jesper Madsen, and Claudio Orlandi. ZKBoo: Faster zero-knowledge for Boolean circuits. In 25th USENIX Security Symposium (USENIX Security 16), pages 1069–1083, 2016.
[14] Torbjörn Granlund. GNU Multiple Precision Arithmetic Library. http://gmplib.org/, 2010.
[15] Andrew Hard, Kanishka Rao, Rajiv Mathews, Swaroop Ramaswamy, Françoise Beaufays, Sean Augenstein, Hubert Eichner, Chloé Kiddon, and Daniel Ramage. Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604, 2018.
[16] Yuval Ishai, Eyal Kushilevitz, Rafail Ostrovsky, and Amit Sahai. Zero-knowledge from secure multiparty computation. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pages 21–30, 2007.
[17] Jonathan Katz and Yehuda Lindell. Introduction to Modern Cryptography. CRC Press, 2020.
[18] Yehuda Lindell, Benny Pinkas, Nigel P. Smart, and Avishay Yanai. Efficient constant round multi-party computation combining BMR and SPDZ. In Annual Cryptology Conference, pages 319–338. Springer, 2015.
[19] Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. Fine-pruning: Defending against backdooring attacks on deep neural networks. In International Symposium on Research in Attacks, Intrusions, and Defenses, pages 273–294. Springer, 2018.
[20] Yingqi Liu, Wen-Chuan Lee, Guanhong Tao, Shiqing Ma, Yousra Aafer, and Xiangyu Zhang. ABS: Scanning neural networks for back-doors by artificial brain stimulation. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pages 1265–1282, 2019.
[21] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pages 1273–1282. PMLR, 2017.
[22] Takashi Nishide and Kouichi Sakurai. Distributed Paillier cryptosystem without trusted dealer. In International Workshop on Information Security Applications, pages 44–60. Springer, 2010.
[23] Pascal Paillier. Public-key cryptosystems based on composite degree residuosity classes. In International Conference on the Theory and Applications of Cryptographic Techniques, pages 223–238. Springer, 1999.
[24] Torben Pryds Pedersen. Non-interactive and information-theoretic secure verifiable secret sharing. In Annual International Cryptology Conference, pages 129–140. Springer, 1991.
[25] Eli Ben Sasson, Alessandro Chiesa, Christina Garman, Matthew Green, Ian Miers, Eran Tromer, and Madars Virza. Zerocash: Decentralized anonymous payments from Bitcoin. In 2014 IEEE Symposium on Security and Privacy, pages 459–474. IEEE, 2014.
[26] Sameer Wagh, Divya Gupta, and Nishanth Chandran. SecureNN: 3-party secure computation for neural network training. Proceedings on Privacy Enhancing Technologies, 2019(3):26–49, 2019.
[27] Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. Neural Cleanse: Identifying and mitigating backdoor attacks in neural networks. In 2019 IEEE Symposium on Security and Privacy (SP), pages 707–723. IEEE, 2019.
[28] Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, and Dimitris Papailiopoulos. Attack of the tails: Yes, you really can backdoor federated learning. In Advances in Neural Information Processing Systems, volume 33, pages 16070–16084. Curran Associates, Inc., 2020.
[29] Chulin Xie, Keli Huang, Pin-Yu Chen, and Bo Li. DBA: Distributed backdoor attacks against federated learning. In International Conference on Learning Representations, 2019.
[30] Chengliang Zhang, Suyi Li, Junzhe Xia, Wei Wang, Feng Yan, and Yang Liu. BatchCrypt: Efficient homomorphic encryption for cross-silo federated learning. In 2020 USENIX Annual Technical Conference (USENIX ATC 20), pages 493–506, 2020.