)-Indistinguishable Mixing for Cryptocurrencies - Sciendo

Page created by Gary Hunter
 
CONTINUE READING
)-Indistinguishable Mixing for Cryptocurrencies - Sciendo
Proceedings on Privacy Enhancing Technologies ; 2022 (1):49–74

Mingyu Liang*, Ioanna Karantaidou, Foteini Baldimtsi, S. Dov Gordon, and Mayank Varia

(, δ)-Indistinguishable Mixing for
Cryptocurrencies
Abstract: We propose a new theoretical approach for             tions [56]. However, despite the high adoption rates and
building anonymous mixing mechanisms for cryptocur-             the importance of cryptocurrencies in the current finan-
rencies. Rather than requiring a fully uniform permuta-         cial ecosystem, user privacy is repeatedly overlooked.
tion during mixing, we relax the requirement, insisting              In any typical blockchain based cryptocurrency, all
only that neighboring permutations are similarly likely.        user transactions are public and stored in the blockchain
This is defined formally by borrowing from the defini-          forever. In systems like Bitcoin, the user identity is never
tion of differential privacy. This relaxed privacy defini-      posted on the blockchain; instead, users are represented
tion allows us to greatly reduce the amount of interac-         by random looking addresses (or else pseudonyms),
tion and computation in the mixing protocol. Our con-           which in the early years of Bitcoin created the percep-
struction achieves O(n·polylog(n)) computation time for         tion that Bitcoin transactions were anonymous. Today,
mixing n addresses, whereas all other mixing schemes            however, we are well aware that this is not true. Bit-
require O(n2 ) total computation across all parties. Ad-        coin transactions are only pseudoanonymous, and they
ditionally, we support a smooth tolerance of fail-stop          are linkable to each other. Not only can any observer
adversaries and do not require any trusted setup. We an-        of the blockchain trivially link together transactions in-
alyze the security of our generic protocol under the UC         volving a specific address, but they can also cluster to-
framework, and under a stand-alone, game-based defi-            gether sets of addresses as belonging to the same user by
nition. We finally describe an instantiation using ring         relying on simple transaction patterns, such as receiv-
signatures and confidential transactions.                       ing change amounts, merging addresses together, and
                                                                so on. This linkability, when combined with external
Keywords: Anonymous Mixing, Cryptocurrency, Differ-
                                                                data, such as data from exchanges or ground truth data,
ential Privacy
                                                                or when aided by active attacks such as “dusting at-
DOI 10.2478/popets-2022-0004                                    tacks” [22], can result in complete de-anonymization of
Received 2021-05-31; revised 2021-09-15; accepted 2021-09-16.   users [2, 42, 52]. The trend of linking transactions and
                                                                de-anonymizing users has followed in other cryptocur-
                                                                rencies that were built using the same mechanics as Bit-
1 Introduction                                                  coin, such as Ethereum, Litecoin, BitcoinCash [21, 36]
                                                                as well as cryptocurrencies constructed under very dif-
Cryptocurrencies are the main application of blockchain         ferent principles such as Ripple [45].
technologies. Starting with Bitcoin back in 2008 [48],               One approach to providing anonymity is through
a large number of different types of cryptocurrencies           the use of mixing protocols. The study of mixing pro-
have emerged, with the total number of distinct avail-          tocols has a long history in cryptography, first in the
able cryptocurrencies today getting close to 3000. Exist-       context of anonymous messaging and electronic voting,
ing cryptocurrencies share a market capital of over $1.5        and then with applications in secure multiparty compu-
trillion [20] and a total of over 1 million daily transac-      tation. Broadly, and without attempting to be exhaus-
                                                                tive, protocols for mixing fall into a few categories . In
                                                                mix chains, each party takes a turn shuffling the mes-
*Corresponding Author: Mingyu Liang: George Mason
                                                                sages. These are especially useful when there is some
University, E-mail: mliang5@gmu.edu                             small set of servers that are trusted to contain an hon-
Ioanna Karantaidou: George Mason University, E-mail:            est participant. However, when n parties are involved in
ikaranta@gmu.edu                                                the mix, the communication complexity is O(n2 ), and
Foteini Baldimtsi: George Mason University, E-mail:             the round complexity is O(n), as each party takes a
foteini@gmu.edu
                                                                turn, sequentially, with mixing the values and forward-
S. Dov Gordon: George Mason University, E-mail: gor-
don@gmu.edu                                                     ing them along. Another approach to mixing is to use
Mayank Varia: Boston University, E-mail: varia@bu.edu           secure computation to obliviously evaluate a switching
)-Indistinguishable Mixing for Cryptocurrencies - Sciendo
(, δ)-Indistinguishable Mixing for Cryptocurrencies   50

network. While we have efficient constant-round pro-              lieves that Bob is spending his coin at a particular local
tocols for obliviously evaluating circuits, these proto-          coffee shop, but this location accepts hundreds of coins
cols require O(n2 ) communication (independent of the             on any given day. Alice then finds out that Bob has par-
circuit size).1 Additionally, it is not immediately clear         ticipated in a small mix involving a few thousand coins
how to efficiently implement a switching network in the           from around the world. If any coin from that mix is later
context of crypto-currencies, as “moving” coins requires          spent at the location of interest, it almost certainly be-
knowledge of a secret signing key, and moving them                longs to Bob, as it is unlikely that many other people in
obliviously would require creating signatures without             Bob’s town participated in the same mixing protocol.
anyone, including the key holder or the recipient, learn-              Although we do not make a formal claim along these
ing who signed and who received the coin. Obliviously             lines, the intuition is clear: we need large-scale mixing.
accessing these keys at each step of the circuit would be         To entirely remove the value of auxiliary information, we
prohibitively expensive.                                          must mix the full network of unspent coins (UTXOs)
     In the setting of cryptocurrencies, most decentral-          before any of those coins are spent again. Currently,
ized mixing mechanisms [44, 54] are designed as peer-             the Bitcoin network has about 68 million UTXOs. Scal-
to-peer (P2P) protocols, and allow a set of mutually              ing current approaches to such a large number of coins
distrusting peers to publish their messages anonymously           seems prohibitive, and finding protocols with asymp-
without requiring any external party. Such protocols are          totic improvement on this very old problem would rep-
often realized through the use of DC-nets [19], which             resent a major breakthrough.
require O(n2 ) computational cost, and a total of O(n2 )
communication when n parties are mixing their coins
(dominated by the underlying anonymous broadcast).                1.1 Our Contributions
They are also highly interactive, requiring O(n) rounds
if there is a constant fraction of corrupted parties. Con-        We propose a new approach for mixing at scale. Rather
cretely, decentralized mixing protocols, such as Coin-            than relaxing security by reducing the size of the mix,
shuffle(++) [53, 54], require heavy coordination among            we propose to relax the distribution over the shuffling
the participating parties.                                        permutations. Inspired by the definition of differen-
                                                                  tial privacy [23], we propose the definition of (, δ)-
The Need for Large Scale Mixing. Summing up the
                                                                  indistinguishability, which requires that for all “neigh-
previous discussion, all known, fully secure mixing pro-
                                                                  boring” permutations, any observed leakage from the
tocols require O(n2 ) communication, and all that are
                                                                  mixing using these permutations should be similarly
amenable to mixing cryptocurrency have round com-
                                                                  likely. That is, for all pairs of input addresses (sa , sb )
plexity O(n). The high computation and communica-
                                                                  and output addresses (rc , rd ), an observer of the mixing
tion costs of decentralized mixing protocols has lead to
                                                                  protocol should learn little, statistically, about whether
the adoption of small scale mixing, in the hopes that a
                                                                  sender sa pays receiver rc and sb pays rd , or whether sa
little privacy is better than no privacy. For example, as
                                                                  pays rd and sb pays rc . Although traditionally used to
reported in [26], when the CoinJoin protocol was used
                                                                  bound leakage when querying datasets, differential pri-
on top of Bitcoin in the 2015-2017 period, it had a mean
                                                                  vacy has inspired privacy definitions in many new ap-
of 3.98 coins in each transaction, and a standard de-
                                                                  plication domains, such as secure computation [30, 41],
viation of 1.72. At the same time, it has been shown
                                                                  oblivious computation in a client/server model [5, 18,
that a small amount of auxiliary information, e.g. that
                                                                  34, 62], and anonymous messaging [38, 57, 60].
two or more mixed coins belong to the same entity (po-
tentially learned through cookie tracking), can lead to                We build the first cryptocurrency mixing mecha-
cluster intersection attacks, removing the privacy gained         nism inspired by differential privacy, where two neigh-
from mixing [26]. Thus, it has been demonstrated that             boring mixing permutations give rise to nearly the same
small scale mixing can be useless. In fact, it is not hard        set of transactions. Our protocol requires O(n log n) to-
to imagine a scenario where small scale mixing can do             tal communication when n users are mixing coins, has
more harm than good. For example, imagine Alice be-               sub-quadratic total computation cost, and requires only
                                                                  constant or polylog(n) communication rounds (depend-
                                                                  ing on which protocol variant is chosen). In addition
1 One exception is the work of Movahedi et al. [47], which uses   to our constructions, we provide the first definitions for
quorums in the MPC to avoid the O(n2 ) term but still suffers     anonymous cryptocurrency mixing with leakage. We be-
from the efficient signing problem we mention.                    lieve that this weakening of the anonymity definition
(, δ)-Indistinguishable Mixing for Cryptocurrencies   51

may open the door to new protocol constructions, be-           use of butterfly networks, every item takes a disjoint
yond the ones described here.                                  path to its destination. In our construction, in order to
                                                               avoid the need for further coordination, we allow multi-
                                                               ple users to pass their currency through the same net-
1.2 Overview of Our Construction                               work nodes by allowing multiple parties to assign ad-
                                                               dresses in the same node (with the exception of the in-
Our construction is a distributed mixing mechanism for         put layer). We therefore refer to those intermediate and
n users, based on an anonymous transaction building            output nodes as buckets.
block, which allows transferring a hidden value from an             Each participant only handles her own currency,
address sp to an address rp , such that the link between       sending it along the path from the source to the tar-
sender and receiver is obfuscated. We provide a rigorous       get. The use of anonymous transactions for moving the
definition of this functionality in Section 3. Such a build-   value through the network provides some privacy, but
ing block can be realized via the use of ring signatures       does not suffice for any formal guarantee: we also need
and confidential transactions, as done in Monero [43]          to hide the number of addresses in each bucket, as this
(see Section 1.3), or via the use of generic ZK argu-          reveals information about the permutation being used.
ments. A naive solution for mixing, if we implemented          The number of addresses in each bucket can be viewed
anonymous transactions with ring signatures, would re-         as a histogram, which is the canonical problem for the
quire every participant to create a spending transaction       application of differential privacy. We add “noisy” ad-
with ring size n (in order to ensure an anonymity set          dresses in every bucket to hide the number of real ad-
of size n). Although fully secure, such a naive solution       dresses. Noisy addresses will be contributed distribu-
has quadratic computation and verification costs, and          tively, by all participants, and without the need of coor-
n log n total communication cost (if a state of the art        dination, by using a noise distribution that is additive
logarithmic-size ring signature [37, 63] is used).             (i.e. negative binomial distribution/Polya distribution).
     As discussed above, in order to achieve sub-              Protocol Complexity: We explore butterfly networks
quadratic overall computation cost (i.e., sublinear for        with different branch factors and depths. If we apply a
each mixing participant) we relax the level of privacy by      butterfly network with only one intermediate layer of
allowing some leakage. We divide the mixing procedure          buckets (what we call the “one-layer case”), we achieve
into multiple rounds of smaller mixings (which is a typi-      a sub-quadratic expected computation cost of O(n1.5 )
cal pattern for various mixing protocols). In each round,      for all participants. The communication cost, if instan-
every participant creates a transaction that mixes her         tiated with logarithmic size ring signature, is O(n log n)
coin with a small subset of the total set of participants.     for all participants, as in the naive case. Using multi-
To determine the mix groups for each round, we rely            layer butterfly networks can further reduce computation
on the structure of an n-size butterfly network. What          cost, but at the cost of increased communication cost.
is nice about butterfly networks is that they provide a        For example, by using a log n depth butterfly network
path from each of the n input nodes to the n output            (for the 2-ary case), we have O(n log3 n) expected com-
nodes while ensuring that the fan-in, k, and the depth         putational cost and O(n log2 n log log n) expected com-
d = logk (n) are sub-linear in n.                              munication cost. Recall that the naive construction re-
     At a high level, our protocol (detailed in Section 4)     quires O(n2 ) computation, and, as we will discuss in
works as follows. Each participant owns a pair of ad-          Section 1.3, the total computation cost for n transac-
dresses: a source address in which their currency resides      tions in Monero, or an n size mixing mechanism like
at the start of the protocol, and target address that will     Coinshuffle++ is quadatric in n.
hold the currency after the mixing is complete: (sp , rp ).         Through a clever improvement in which we merge
These source address is assigned a unique node at the          certain buckets in the one-layer construction, we man-
input layer of the butterfly network, and the target is        age to “reuse” some noise, substantially improving the
placed randomly in one of the output nodes of the net-         parameters. Concretely, for privacy parameters  =
work (possibly sharing that node with another target           2.303 and δ = 10−4 (a common selection in the liter-
address). Participants transfer their currency from the        ature), we require an average of 37 noisy addresses in
source to the target by creating and placing public keys       each bucket. By a proof-of-concept evaluation, we show
in the intermediate nodes, and making a series of anony-       that when n > 605, we achieve better concrete computa-
mous transactions to these keys, one for each level of the     tional efficiency when compared with the naive solution.
tree, according to the network topology. In the typical        We also show, that the communication cost is bounded
(, δ)-Indistinguishable Mixing for Cryptocurrencies   52

within three times that of the naive approach when us-         parties that communicate today are likely to communi-
ing log-size ring signatures. See Section 6 for details.       cate again tomorrow.
Security: In Section 3, we provide two security defi-          Moving Towards Practice: In Section 7, we pro-
nitions for our mixing protocol: a simulation-based one        vide a strengthening of our construction that handles
in the universally composable (UC) security framework          transaction fees and provides resilience against DDoS
for the case of mixing coins of equal value, and a game-       attacks. In Appendix B, we discuss an instantiation of
based definition that supports mixing of variable val-         our generic mixing construction built upon any anony-
ues and that explicitly specifies the adversary’s advan-       mous transaction functionality, such as Monero’s ring
tage in distinguishing neighboring permutations. For           signatures and confidential transactions. For our instan-
our simulation-based definition, we design an anony-           tiation, we modify the ring signature content in order
                                 T
mous mixing functionality FAnonMix       that, after receiv-   to support loop-payments, i.e. the ability for an input
ing a number of input and output addresses from par-           address to transfer amounts back to itself. Then, we de-
ticipants, performs mixing at predetermined time slots         scribe the ring’s structure as predefined by the parent
                                   T
T . The mixing functionality FAnonMix       leverages exist-   buckets in the previous level and the output address for
ing functionalities for a bulletin board and a globally-       both real and noisy transactions, and we claim full in-
accessible clock.                                              distinguishability between them. We present a variant
     In Section 5 we show that our protocol is secure          of our construction in Section 7 that is compatible with
against malicious behavior with any number of corrup-          current signature and fee requirements of Monero.
tions (for both our definitions). Intuitively, since each
participant handles her funds all the way to the desti-
nation, there is no way for a malicious party to steal         1.3 Related Work
money. Colluding users can decrease the anonymity set
size by the number of colluding parties, but nothing else      The community has reacted to the privacy shortcom-
is revealed. Furthermore, our protocol is robust against       ings of cryptocurrencies by providing a number of solu-
aborts, unlike many existing coordination-based proto-         tions that can be classified under two main categories:
cols that require abandoning the protocol if a malicious       mixing mechanisms for existing, already deployed cryp-
party aborts. A user can drop out at any time, and             tocurrencies (e.g., [9, 31, 40, 44, 53–55]) that require
the remaining users can complete the protocol without          O(n2 ) communication and computation as described
restarting with a slightly reduced anonymity set.              earlier, and new stand-alone private cryptocurrencies
Composition: A common concern with differentially              [7, 43, 61, 64] that employ cryptographic mechanisms to
private protocols is the deterioration of the privacy pa-      hide the transaction value and participants. Compared
rameter through the composition of multiple executions.        to these works, our protocol relies only on standard
We prove that this is not an issue for our protocol. The       assumptions, does not require a trusted setup phase,
input to our (, δ)-indistinguishable mixing mechanism         achieves sub-quadratic total computation and commu-
is the random permutation used for mixing. Each time           nication cost, protects each person’s currency against
the protocol starts, we choose another uniform, indepen-       abort attacks, and scales to support large scale mixing.
dent permutation as input; the previous permutation is              Monero is a Cryptonote-style protocol that uses ring
discarded and never used again for further mixing. Intu-       signatures [27, 51] to obscure the sender of a transac-
itive, repetitive mixing, even a weak one, adds random-        tion within a set of potential senders, plus homomor-
ness to the composed mixing results. We formalize this         phic commitments and range proofs to hide transaction
concept as an iterated butterfly network in Appendix           values [50]. If n Monero users wish to transact anony-
G and prove that it provides more privacy, not less in         mously today, each of them must perform computation,
there. Surprisingly, this iterated approach can also im-       verification, and communication that is linear in the size
prove efficiency when epsilon is already small. Specifi-       of the anonymity set, so the total costs are O(n2 ). As
cally, instead of relying on larger noise for better privacy   a result, Monero is only efficient for small scale mixes;
guarantee, it is more efficient to keep the original noise     earlier versions of Monero allowed ring sizes as small as
magnitude but shuffle multiple times. In contrast, when        4, which led to sender deducibility attacks [8, 46]. It is
differential privacy has been used in an application like      worth mentioning that a formalized security analysis is
anonymous messaging [57, 60], the input across execu-          included in [37], but is not compatible with the current
tions is correlated in a dangerous way, since pairs of         version of the Monero protocol.
(, δ)-Indistinguishable Mixing for Cryptocurrencies              53

     Zcash is based on succinct zero-knowledge argu-
ments (zk-SNARKs) [28] and can offer private mix-
ing at scale, but the underlying zk-SNARKs require
trusted setup as well as strong (non-falsifiable) crypto-
graphic assumptions2 . The total computational cost of
generating these zk-SNARKs for all n parties will grow
as O(n log(t) log(log(t)), where t denotes the (monoton-
ically increasing) number of Zcash transactions ever
made. QuisQuis [25] attempts to solve the problem of
UTXO size, but requires similar computation costs to
Monero: O(n2 ) for n users transacting anonymously.               Fig. 1. 2-ary 8-size Butterfly Network, with nodes in 1, . . . , d
                                                                  layers treated as buckets

2 Building Blocks                                                 (starting from 0), where r equals to the ith leftmost
                                                                  position’s value of b’s base-k representation. In other
2.1 Butterfly Network Topology                                    words, at layer i, the path proceeds through bucket hi, ci
                                                                  where c shares its i most significant symbols with b and
A butterfly network [29] is a graph structure that fa-            the remaining symbols with a (for base k).
cilitates routing. It provides a path from each of the n              A congestion happens when more than one path
input nodes to the n output nodes while ensuring that             pass through the same intermediate bucket. We will
both fan-in k and depth d = logk (n) are sub-linear in n.         show later in Section 6.1 that a congestion with size
                                                                  Ω(log n) only happens with negligible probability in our
Definition 2.1 (k-ary n-size Butterfly Network). A k-             construction, unless a malicious adversary is conducting
ary n-size butterfly network consists of n nodes in each          a (detectable) denial of service attack.
of the d+1 layers, where d = logk n. Let hi, ji denote the            Our usage of the butterfly network draws inspira-
jth node in the ith layer with both indices starting with         tion from switching networks [1, 10]. A novelty of our
0. A directed edge links two nodes hi − 1, j1 i and hi, j2 i      construction is the use of buckets so that multiple trans-
if and only if j1 = j2 or the base-k representation of j1         actions can be present at the same position in an inter-
and j2 differ only in the ith most significant position.          mediate layer.
A binary butterfly network is a butterfly network with
k = 2 and d = lg(n).
                                                                  2.2 (, δ)-Indistinguishable Mixing
Fig. 1 shows a binary butterfly network of size 8. For
example, h1, 2i is connected to both h2, 0i and h2, 2i be-        We start by formally providing the definitions of neigh-
cause the base-2 representations 02 = 000 and 22 = 010            boring permutations:
only differ in the second most significant bit. Notice that
in the binary butterfly network, a packet travels left if         Definition 2.2. Permutations π and π 0 are said to                   be
the ith significant bit of b’s binary representation is 0,        neighbors if and only if they have a swap distance                   of
and travels right otherwise.                                      1, where the swap distance is the minimum number                     of
    In this work, we refer to layer 0 nodes as input nodes,       times that pairs of elements of π must be swapped                    in
layer d nodes as output buckets, and all nodes in inter-          order to produce π 0 .
mediate layers as intermediate buckets.
    A butterfly network contains a unique path from               Intuitively, defining neighboring permutations through
every input node h0, ai to every output bucket hd, bi. The        swap distance ensures that even an adversary who
path can be calculated incrementally: at layer i − 1, the         knows all the mappings between the mixing inputs and
packet travels through the rth leftmost outgoing edge             outputs, except any two pairs, still cannot gain a sig-
                                                                  nificant advantage in determining the mapping of these
                                                                  two pairs, even when given the leakage of the mixing
2 Recent works propose zk-SNARKs with different types of          mechanism.
setup and crypto assumptions, though all are still non-standard
and the prover computation cost remains high [6, 12, 39].
(, δ)-Indistinguishable Mixing for Cryptocurrencies       54

     We give the formal definition for (, δ)-                 as the histogram of intermediate bucket weights) draws
indistinguishable mixing, inspired by differential privacy     the same conclusions, regardless of whether my coin was
[23, 24]:                                                      swapped with that of another party.
                                                                    Similar to differential privacy, the following lemma
Definition 2.3 ((, δ)-Indistinguishable Mixing). Let          and theorem hold for Definition 2.3 and their proofs
L be a randomized algorithm that receives a permuta-           trivially follow the same arguments as used in differen-
tion as input. We say that L is (, δ)-indistinguishable       tial privacy [24].
if for any two neighboring input permutations π and π 0
and for all T ⊆ Range(L):                                      Lemma 2.4 (Post-Processing). Let L be a randomized
                                                               algorithm that is (, δ)-indistinguishable. Let g be an ar-
          Pr[L(π) ∈ T ] ≤ e Pr[L(π 0 ) ∈ T ] + δ,             bitrary deterministic or randomized function. Then g◦L,
in which Range(L) denotes the universe of all possible         the composition of functions g and L, is also (, δ)-
output of algorithm L.                                         indistinguishable.

Throughout the text, we also refer to  and δ as privacy       Theorem 2.5 (Composition). Let Li be a (i , δi )-
parameters. Additionally, we say the mixing protocol           indistinguishable algorithm for i ∈ [k]. Define L[k] (π) =
                                                                                                           Pk       Pk
is an (, δ)-indistinguishable mixing mechanism, if L is       (L1 (π), . . . , Lk (π)). Then L[k] (π) is ( i=1 i , i=1 δi )-
(, δ)-indistinguishable and it captures the whole view of     indistinguishable.
any observer and the adversary of the mixing protocol.
     Analogous to the definition of differential privacy for   Let Π denote the whole set of permutations for n ele-
databases, our definition captures arbitrary background        ments. We consider a deterministic function f : Π → Nm
knowledge of the adversary. Just as in the classic defi-       that takes a permutation and returns m natural num-
nition, this follows by providing the adversary the abil-      bers. As in standard differential privacy, we define the
ity to choose the pair of neighboring permutations. The        sensitivity of f as:
fact that the inputs are constrained to being permu-
                                                               Definition 2.6 (`1 -Sensitivity). The `1 -sensitivity of f
tations can be viewed as a lower bound on this back-
                                                               is:
ground knowledge: the adversary knows, for free, that
                                                                          ∆f = max        ||f (π) − f (π 0 )||1
all user inputs are unique (i.e. they constitute a per-                              0 ∀π,π ∈Π
mutation). The adversary’s ability to choose the neigh-        where   π, π 0   denote two neighboring permutations.
boring permutations captures further, arbitrary back-
ground knowledge.                                              ∆f captures the magnitude of f ’s output change be-
     Additionally, while constraining the user inputs to       tween worst-case neighboring inputs.
a permutation imposes a notion of “neighboring” that               In this work, we sample noise from the negative bi-
results from swapping two inputs, rather than dropping         nomial distribution which models the number of suc-
or modifying a single input value, it has been observed        cesses in a series of independent and identically dis-
in the context of differential privacy for databases that      tributed (i.i.d) Bernoulli trials until r failures occur.
such modifications to the definition of neighboring in-        Formally, we have:
puts do not have substantive impacts on the definition
([58]). Defining neighboring datasets by removing a sin-       Definition 2.7 (Negative Binomial Distribution).
gle input value is appealing, because it captures the in-      The negative binomial distribution is a non-negative
tuition that any individual’s participation cannot be de-      discrete probability distribution with probability mass
tected. Under our definition, the number of participants       function:
is fixed, so this intuition no longer holds. However, even                           
                                                                                      x+r−1
                                                                                                
if it is known that a particular user participated, pri-                 NBr,p (x) =              · (1 − p)r · px
                                                                                        r−1
vacy is guaranteed because their input could have been
any arbitrary value from the domain.                           where r is any positive integer and p ∈ [0, 1]. Addition-
     Formally, we can also apply the same semantically         ally, its cumulative distribution function is:
flavored interpretation of (, δ)-indistinguishable mixing                               x
                                                                                         X
as in [32], modifying it slightly for our scenario: regard-            FNB (x; r, p) =         NBr,p (x) = I1−p (r, x + 1)
                                                                                         i=0
less of external knowledge, an adversary with access to
our mixing protocol’s leakage (which we characterize           where I· (·, ·) is the regularized incomplete beta function.
(, δ)-Indistinguishable Mixing for Cryptocurrencies    55

We use NB(r, p) to refer to the negative binomial dis-           Definition 2.9 (Universal Composability). Let F be
tribution itself. The negative binomial distribution sat-        an ideal functionality, and let Π be an interactive pro-
isfies an additive property: NB(r1 , p) + NB(r2 , p) =           tocol for computing F among a set of parties P while
NB(r1 + r2 , p).                                                 making calls to an ideal functionality G. Protocol Π is
     The number of failures r in the negative binomial           said to UC realize F in the G-hybrid model if for every
distribution can also be generalized to any positive real        non-uniform PPT malicious adversary A, there exists a
value, and the generalized distribution is often referred        non-uniform PPT adversary S in the ideal model such
to as the Polya distribution:                                    that for all non-uniform PPT environments E and all
                                                                 valid inputs x1 , . . . , xp , the following two distributions
Definition 2.8 (Polya Distribution).                             are computationally indistinguishable in κ:
                                                                                               c
                         Γ(x + r)                                             EXECΠG ,A,E ≡ EXECF G ,S,E
            Polr,p (x) =          · (1 − p)r · px
                          x!Γ(r)
                                                                 The polynomial running times of the various interactive
where r is any positive real value and p ∈ [0, 1], and Γ(·)      Turing machines are related; we refer to [13] for details.
is the Gamma function.

Notice that in the special case that r is an integer,
Γ(x+r)      (x+r−1)!      x+r−1
                                                                2.4 UC Functionalities
 x!Γ(r) = x!(r−1)! =         x    . Similarly, the Polya dis-
tribution also satisfies an additive property, let Pol(r1 , p)   We use three UC functionalities as building blocks. The
and Pol(r2 , p) be any two Polya distributions, then:            first two have been extensively examined in prior work:
Pol(r1 , p) + Pol(r2 , p) = Pol(r1 + r2 , p).                    a public bulletin board Gbb [14, 16, 17] and a globally-
     The additive property of negative binomial distri-          accessible clock Gclock [3, 16, 33, 59]. We include these
bution and Polya distribution allow all n participants           functionalities in Appendix A for completeness.
to collectively sample noise from a negative binomial            Anonymous Transaction Functionality. In Figure
distribution NB(r, p) without cooperation. Each partic-          2, we provide a UC functionality GAnonTrans for partially-
ipant can simply sample noise from a Polya distribution          anonymous transactions on top of a ledger. This func-
Pol(r/n, p) and later sum up their noises.                       tionality utilizes the simple bulletin board functionality
                                                                 Gbb and is meant to be a simplified version of exist-
                                                                 ing UC models of blockchain ledgers [4, 35] that take a
2.3 UC Secure Computation                                        transaction amount from a sender and deliver the spec-
                                                                 ified value to the receiver by updating the recorded bal-
Cryptographically secure computation is formalized by            ances. In order to focus on details that are more relevant
demonstrating that an ideal world execution is indistin-         to this work, our functionality intentionally lacks some
guishable from the real world protocol. At a high level,         features, such as clustering transactions into blocks,
the intuition behind security in the universal compos-           that are more realistic reflections of cryptocurrencies,
ability (UC) framework [13] is that any adversary A at-          but that are irrelevant to our constructions. At the same
tacking a protocol π should learn no more information            time, our transaction functionality more closely resem-
than could have been obtained via the use of a simulator         bles that of an account based system as opposed to a
S that makes calls to the ideal functionality F instead.         transaction based one, i.e. the functionality is responsi-
The UC formalism includes an environment E that rep-             ble for checking user balances before transactions.
resents “the rest of the world”; it provides inputs to                On the other hand, our UC anonymous transac-
all parties and also attempts to distinguish between the         tion functionality augments prior work that was mostly
real and ideal executions. The UC framework includes a           modeling linkable, Bitcoin style, ledgers in two ways:
powerful composition theorem that supports modularity            first, we provide confidential transactions, and second,
[13]; composition continues to hold in a “hybrid” setting        we permit equivocation of the sender of any transac-
in which both the real protocol and ideal functionality          tion within a small anonymity set S. This set must be
may call a common sub-routine. In this work, we re-              specified by the (real) sender at the time of the trans-
quire the strengthened model of UC with global setup             action, and it will be publicly revealed to all parties.
[15] that has recently been instantiated as the default          We caution that a small equivocation set may be pri-
behavior within the UC framework [13].                           vate in a standalone setting but need not retain this
                                                                 privacy when composed with other transactions [46].
(, δ)-Indistinguishable Mixing for Cryptocurrencies   56

         GAnonTrans [Gbb ]: Anonymous Transaction              scures the true permutation π which maps the source
                           functionality                       addresses to the receiver addresses.
                                                                                           T
                                                                   In a high level, our FAnonMix  functionality receives
       Registration: User i submits a pair of an
                                                               input and output addresses from participants, up until
       address addr and a value v. If addr is not yet
       assigned to any other users, the functionality          some fixed time T . When the mixing begins at time T ,
       assigns addr to i and stores a record (addr, v)         the functionality will follow steps that will resemble a
       indexed with i.                                         mixing protocol utilizing the GAnonTrans functionality. It
       Transfer: User i, or FAnonMix T       on its behalf,    will additionally provide the leakage function L to the
       submits a transfer request with parameters              adversary as defined by the input and output addresses.
       (S, j, addrr , vr ) where S denotes a set of ad-        The adversary will return a set of transactions for a k-
       dresses S = {addr1 , ..., addr` } and j is an index     ary butterfly network of size n including transactions
       j ∈ {1, . . . , `}. The functionality first checks if
                                                               made to intermediate addresses for each of the logk n
       addrr is already registered. Then, it also checks if
       addr1 , ..., addr` are registered and whether addrj     layers. The functionality will construct the transactions
       is indexed with the user index i. The function-         through calls to the GAnonTrans functionality and notify
       ality checks if vj ≥ vr (i.e. the sending address       the adversary on completion.
       carries enough funds). If any of the checks fail,
       it returns an “Error” message.                          Assumptions. The assumptions made by our ideal func-
       Then the functionality replaces (addrr , vr0 )          tionality are as follows:
       with (addrr , vr0 + vr ) and (addrj , vj ) with         – Mixing Value: We assume that all honest input ad-
       (addrj , vj − vr ) for addrj that belongs to user i.        dresses carry the same value v, i.e. all parties are
       Additionally, the functionality adds (S, addrr )
                                                                   mixing equal funds. Also each participant can spec-
       to Gbb . Else, it ignores the request.
                                                                   ify multiple input and output addresses.
       Balance: User i submits a balance request
                                                               – Sufficient Funds: Given that our protocol does not
       with address addr. If user i has a registered
                                                                   have an upper bound on the number of parties that
       record (addrs , vs ), then a delayed output is sent
       to user i with the private value vs ; otherwise, it         can mix in any given session, and that it will not
       returns “Error”.                                            begin prior to time T , there is no reason to check
       Output: The functionality sends (S, addrr ) to
                                                                   whether an input address carries a sufficient balance
       all the participants.                                       prior to T . If a user does not carry the fixed mixing
                                                                   amount v, it will be dropped by the mixing protocol.
Fig. 2. Anonymous Transaction functionality                    – Corruption Model: The adversary can register any
                                                                   number of corrupted parties during registration.
At the same time, a large equivocation set provides
strong security but may be prohibitively expensive in          Discussion. We note that an extension of our UC func-
many cryptocurrencies; in Monero for instance, trans-          tionality in the multi-value mixing case is not trivial.
action size and computation scale linearly with the size       The difficulty stems by the fact that a simulator needs
of the anonymity set. This work provides a way to boot-        to know the transaction values of each path of the but-
strap the privacy benefits of a large anonymity set, in        terfly network in order to be able to simulate them.
a provable manner, while only calling the functionality
GAnonTrans using small anonymity sets.
                                                               3.2 A Game-Based Definition for
                                                                   (, δ)-Indistinguishable Mixing
3 Definitions
                                                               We also provide a game-based indistinguishability (GB-
                                                               indistinguishability) definition to more clearly showcase
3.1 A UC Definition for Secure and
                                                               the adversary’s advantage in distinguishing between two
    (, δ)-Indistinguishable Mixing                            neighbor mixing addresses permutations.

In this section, we define anonymous mixing in the UC          Definition 3.1. Let Πn be a mixing protocol involving
                                          T
setting by presenting our functionality FAnonMix in Fig-       n parties. Consider the following experiment ExpM ix
ure 3. Our functionality will capture our mixing proto-        between a challenger C and an adversary A:
col, which uses a butterfly network topology that ob-
(, δ)-Indistinguishable Mixing for Cryptocurrencies     57

             T
            FAnonMix [GAnonTrans , Gbb , Gclock ](v, k, L):                be the total number of adversarial parties and n
            Anonymous Mixing Functionality                                 the total number of parties. A selects two neighbor-
                                                                           ing permutations π0 , π1 (cf. Definition 2.2) over the
       The AnonMix functionality interacts with
                                                                           unions of the challenger and the adversary’s sets,
       the adversary, all parties, and three sub-
       functionalities. It is parameterized by some                        i.e., π0 , π1 : S IN ∪ Ŝ IN → S OU T ∪ Ŝ OU T . The per-
       mixing value v, the fan-in parameter for the                        mutations must agree on the adversarial addresses
       butterfly network k, and a leakage distribution                     π0 |Ŝ IN = π1 |Ŝ IN .
       L. Mixing occurs at a specific, pre-defined                    3. The challenger flips a coin b ∈ {0, 1} and runs the
       absolute time T . A list L is initialized to ∅.
                                                                           mixing protocol Πn on input πb . We define the out-
       Register:               Upon              receiving                 put of the protocol as Out.
       (Register, addrINi  , addr OU T
                                  i    )   from      party            4. The adversary is given Out and guesses b0 . He wins
       Pi , first run Check and only continue if Mix is
                                                                           if b = b0 .
       not invoked. Add (Pi , addrIN         OU T
                                     i , addri     ) to L,
       the list of participants in the current mixing
                                                                      We say that Πn is a (z, t)-GB-indistinguishable mixing
       session. The functionality posts addrIN i  to Gbb .            protocol for any PPT adversary who has corrupted up
                                                                      to t parties if the advantage of winning the game defined
       Ping: This method has no parameters and can
       be called by anyone (even the environment). It                 in ExpM ix is bounded by some value z parameterized by
       invokes Check and returns execution to its caller.             our choice of privacy parameters.
       Check: Send GetTime to Gclock and receive τ
       from Gclock . If τ > T then run the Mix process.
       Otherwise, return to the calling method.
                                                                      4 Construction
       Mix:
       –    The functionality randomly assigns an out-
                                                                      We construct an anonymous mixing protocol Π~sT,n,d,c  ,~
                                                                                                                             v ,~
                                                                                                                                r
            put bucket to each output address in L. Let
            w
            ~ denote the vector of bucket weights, i.e.,
                                                                      that allows n participants, where each participant p
            the number of output addresses contained                  owns a sender address sp and a receiver address rp , to
            in each bucket.                                           anonymously transfer a value vp from sp to rp . To hide
       –    The functionality determines π based on                   the mapping between sender and receiver addresses, we
            the sets addrIN [], addrOU T [], and the out-             employ d rounds of mixing in which the participants mix
            put bucket assignments. It computes L(π),
                                                                      with a subset of all participants in the round (these sub-
            and provides (L(π), w) ~ to the adversary.
       –    The adversary returns a set of transactions               sets are determined by the interconnections between two
            for a k-ary butterfly network of size n,                  consecutive layers of buckets on a butterfly network).
            specifying for each transaction the values                Each participant creates ephemeral addresses to hold
            (S, j, addrr , v) and which layer of the net-             the value in each round, except the last round where
            work it corresponds to.
                                                                      transactions are made to the receiver addresses.
       –    The functionality calls GAnonTrans () for all re-
            quested transactions.
                                                                           In order to argue that the view of the protocol is
       –    Notify Adversary that mixing was done.                    (, δ)-indistinguishable, participants contribute to sev-
                                                                      eral “noisy addresses” that contain zero value via “noisy
Fig. 3. Anonymous Mixing Functionality                                transactions” that carry zero value. We refer to the ad-
                                                                      dresses that actually carry value as described in the
1. The challenger sends S IN = {addrIN                         IN
                                               1 , . . . , addrn0 }   above paragraph as real addresses. In particular, we rely
                           OU T               OU T
   and S OU T = {addr1          , . . . , addrn0 } to A. The          on confidential transactions to ensure that a noisy trans-
   sets denote the input and output addresses of honest               action is individually indistinguishable from a real one,
   parties (respectively), and each set is of size n0 .               and we sample the number of noisy addresses from neg-
2. A sends to C two chosen sets of input and output                   ative binomial distribution.
   addresses Ŝ IN , Ŝ OU T on behalf of the malicious par-               To determine the mixing participants, our protocol
   ties, along with their secret keys (in order to al-                relies on using a blockchain as a public bulletin board
   low C to run the mixing protocol itself). Addition-                where parties publish all necessary addresses in a layer
   ally, the adversary might opt to corrupt some of                   by layer fashion. When a party wishes to have a transac-
   the addresses sent by the challenger, in which case                tion output mixed, it includes a relevant “to be mixed"
   he is given their secret keys and these addresses                  flag. In the first phase of the mixing protocol, the sender
   are moved from set S to corresponding Ŝ. Let t                    addresses to be mixed will all be output transaction
(, δ)-Indistinguishable Mixing for Cryptocurrencies   58

addresses marked as “to be mixed" within some spe-                 noisy addresses (if any) with positive integers. We also
cific time period T (i.e. last 100 blocks). Similarly, any         use array notation to refer to all addresses meeting cer-
subsequent layers of transactions can rely on transac-             tain characteristics. Specifically, the array ap [] contains
tion outputs posted on the blockchain from the previous            all addresses made by participant p (in the whole net-
layer to form the subset of mixing parties in the current          work), and a[] contains all addresses for all participants
layer. We note that, similar to other privacy preserving           while s[] contains all sender addresses.
cryptocurrencies, we require anonymous communication
when users post their transactions on the bulletin board.
                                                                   4.2 Generating and Sending Addresses

4.1 Building the Butterfly Network                                 We describe our full mixing protocol Π~sT,n,d,c
                                                                                                            ,~
                                                                                                             v ,~
                                                                                                                r
                                                                                                                   in Figure
                                                                   4. The protocol involves n parties but given that it is
Given n parties and an arbitrary choice of fan-in k, the           symmetric, we opt to describe the protocol from the per-
transaction graph in our protocol follows a k-ary n-size           spective of a single party p. Throughout the protocol,
butterfly network with depth d = logk (n) in which the             we let ←$ refer to sampling value from certain distri-
input layer nodes correspond to sender addresses and               butions, or sampling a key from some key generation
the output layer nodes correspond to receiver addresses.           algorithm. At a high level, the protocol can be divided
Each participant owns one pair of such nodes. We re-               into an offline and an online phase.
mark that most of our construction applies generically             1. Offline Phase. Each party posts a transaction in-
to any network topology that ensures unique paths from             dicating its intent to participate in a mix. In lines 2-7,
source to receiver addresses, however, we restrict our at-         each participant retrieves the collection of sender ad-
tention to the butterfly network in this work.                     dresses from the ledger and sorts them to determine
      The sender addresses in the input layer are ordered          their starting position, source. Each participant then
by simply sorting them, where each address occupies one            randomly samples an output layer position, target, and
position in the input layer. The positions for receiver            locally runs a sub-routine Path(source, target) to deter-
addresses in the output layer are independently chosen             mine her unique path from the sender address to the re-
by each participant. For every address except the sender           ceiver address along the network. In lines 9-12, for each
address, the participant includes its position in current          bucket on the participant’s path through the intermedi-
layer in the transaction posted to blockchain.                     ate layers, she creates a fresh “real address” which will
      In our protocol, each participant is solely respon-          be used to pass non-zero transactions from her sender
sible for transferring her own coins through the en-               address to the intended receiver address. In lines 13-
tire (unique) path in the butterfly network from her               17, each participant independently generates a certain
sender address to her receiver address. As a result, she           number of noisy addresses, which could be zero, for each
must own an address for each layer along that trans-               bucket in the intermediate layers.
action path. Since paths by different participants can                 In this phase, all real and noisy addresses except the
cross through the same position in the intermediate                sender address are freshly created by the participant and
(1, . . . , d − 1) and output layers, our protocol allows for      kept locally.
multiple addresses to reside in these positions and refer
                                                                   2. Online Phase. Next, the participants use the ledger
each position as a bucket.
                                                                   in order to synchronize each layer of transactions. De-
      We use sp and rp respectively to denote the sender
                                                                   pending on the size of n and the specifics of the underly-
address and receiver address of a participant p. We use
                                                                   ing ledger (i.e. block/transaction size), a specific number
vp to denote the value that p wants to transfer from sp
                                                           hi,ji   of ledger blocks will be devoted for each mixing layer;
to rp . More generally, we denote each address by ap,l .
                                                                   only transactions posted within these blocks contribute
Here, the superscript hi, ji denotes that the address is
                                                                   toward this layer. In each round of these transactions,
located within the bucket in layer i at the jth position
                                                                   every participant is responsible to create transactions
from the left. The left subscript indicates that this ad-
                                                                   paid to all addresses she owned, real or noisy, in the
dress is owned by participant p, and we use the counter
                                                                   current/lower layer.
l ∈ N as a unique index to distinguish between all ad-
                                                     hi,ji
dresses owned by participant p within bucket ap,l . By
convention, we reserve l = 0 to denote the one real ad-
dress (if any) that p places in bucket hi, ji, and we index
(, δ)-Indistinguishable Mixing for Cryptocurrencies   59

Fig. 4. Protocol Π~sT,n,d,c
                     ,~
                      v ,~
                         r
                            (s[]). The protocol is symmetric, so we                                      In lines 20-23, each participant determines the cur-
show the protocol from the view of a single participant p who                                        rent set of input addresses from either the sorted sender
sends vp value from sender address sp to receiver address rp .
                                                                                                     addresses array or the ledger, depending on whether
    1:       Gbb .Write(sp , Mix, T )// Post sp to mix at time T                                     this is a first round transaction or not. For the latter,
                                                                                                     each output address in the previous layer must also in-
   . . . . . . . . . . . . . . . . . . . . .Offline Phase. . . . . . . . . . . . . . . . . . . . .   clude its bucket location on the ledger. In lines 24-28,
    2:       Gbb .Read(s[])// Read all addresses posted at time T                                    for a real transaction paid to a real output address, the
    3:       Sort(s[])                                                                               participant disguises the real input address among all
                                                                                                     other addresses from the input set. Notice that since
    4:       source := Find(sp , s[])
                                                                                                     this transaction is along the path that a real value fol-
  // Return location index of p’s sender address.
                                                                                                     lows, there must exist a real input address owned by the
    5:       target := ←$ Uniform(n)
                                                                                                     same participant in one of the parent buckets on pre-
  // Sample location index of p’s receiver address.
                                                                                                     vious layer. On the other hand, if it is a noisy transac-
               h0,sourcei
    6:       ap,0            := sp                                                                   tion, there might not exist a parent input address owned
              hd,targeti
    7:       ap,0            := rp                                                                   by the same participant. Here, we simply assume that
    8:       ap [].Append(ap,0
                                    h0,sourcei       hd,targeti
                                                  , ap,0          )
                                                                                                     a participant can create a transaction with zero value
                                                                                                     without owning any address in the input set. Later, we
  // Maintain an array of p’s own addresses.
                                                                                                     show a slightly different construction that remove this
    9:       pathbuckets[] := Path(source, target)
                                                                                                     assumption.
  // Find path from sender address to receiver address
                                                                                                     Sub-Routines. The full protocol Π~sT,n,d,c
                                                                                                                                              ,~
                                                                                                                                               v ,~
                                                                                                                                                  r
                                                                                                                                                    in Figure 4
  10 :       for buckethi,ji in pathbuckets[] do
                                                                                                     uses the following sub-routines. For some we have UC
  // Generate real addresses for buckets along the path
                   hi,ji
                                                                                                     functionalities described in Section 2.4 and the others
  11 :           ap,0 ←$ AddressGen(1κ )
                                                                                                     are straightforward enough that we omit detailed de-
                                        hi,ji
  12 :           ap [].Append(ap,0 )                                                                 scriptions for brevity.
  13 :       for buckethi,ji in allbuckets[] do                                                      – GAnonTrans .Transfer(inputs[], output, v): Transfers v
  // Iterate for every bucket in the intermediate network                                                coins to the output address from one of the addresses
  14 :           x ←$ Noise(n, , δ)                                                                     in inputs[], as specified by the partially-anonymous
  15 :           for l = 1..x do// Generate x noisy addresses                                            transaction ledger functionality defined in Figure 2.
                       hi,ji                                                                         – AddressGen(1κ ): Generates an ephemeral address.
  16 :               ap,l      ←$ AddressGen(1κ )
                                            hi,ji
                                                                                                         Notice that both a real address and fake address
  17 :               ap [].Append(ap,l )
                                                                                                         are generated in the same way.
   . . . . . . . . . . . . . . . . . . . . .Online Phase. . . . . . . . . . . . . . . . . . . . .    – Find(arr[], value): Return the index of the value in
                                                                                                         the array.
  18 :       for i = 1, 2, . . . , d do
                                                                                                     – Path(source, target): Return an array of buckets (de-
  // Synchronize each layer of transactions among all
                                                                                                         noted by hi, ji) along the unique path from source
                           hi,ji
  19 :           for ap,l          in ap [] do                                                           to target.
  // Iterate through every p’s address in the current layer                                          – Sort(arr[]): Sort the array in increasing order. We
  20 :               if i == 1                                                                           use this function to sort the broadcasted array of
  21 :                   inputs[] := Parent(hi, ji, s[])                                                 participants’ addresses. If each address is generated
  // Read parent addresses from the sender addresses array                                               randomly, then the sort itself can be viewed as ran-
  22 :               else                                                                                dom permutation.
  23 :                   inputs[] := ReadParent(hi, ji, Gbb .Read)                                   – Noise(n, , δ): Sample noise that satisfies the (, δ)-
  // Read parent addresses from the ledger                                                               indistinguishable guarantee. Since the total noise is
                                        hi,ji                                                            contributed by all n participants, the amount of in-
  24 :               output := ap,l
                                                                                                         dividual noise required is also a function of n. In
  25 :               if l == 0 then // Create real transaction
                                                                                                         particular, to ensure the total noise follows a nega-
  26 :                   GAnonTrans .Transfer(inputs[], output, vp )
                                                                                                         tive binomial distribution NB(r, p), each participant
  27 :               else // Create fake transaction
                                                                                                         samples noise independently from a Polya distribu-
  28 :                   GAnonTrans .Transfer(inputs[], output, 0)                                       tion Pol(r/n, p).
(, δ)-Indistinguishable Mixing for Cryptocurrencies     60

–   Parent(hi, ji, s[]): Return the set of addresses resided   dress paths of malicious parties will be identified, and
    in the parent buckets of hi, ji from all sender ad-        cause the reduction of anonymity set to n − t. The noisy
    dresses s[] where i is the layer number and j is the       addresses disclosed will reduce the effective noise mag-
    bucket location within the current layer. The parent       nitude to (n − t)/n of the original for each bucket. To
    relationship is defined by the network topology.           achieve the target anonymity set size and privacy pa-
–   ReadParent(hi, ji, Gbb .Read): Return the set of ad-       rameter, both issues can be easily addressed by setting
    dresses from the parent buckets of hi, ji from Gbb .       a larger required number of mixing parties n0 = n + t.
                                                               Instantiation of Our Protocol Using Ring Signa-
Communication Cost. As the participants rely on                tures. For compatibility with Monero, in Appendix B,
published transactions on blockchain to establish the          we provide an instantiation of our protocol using ring
network construction, it suffices to only consider the on-     signatures over confidential transactions (CTs).
chain communication cost. (We store on-chain all infor-
mation that is needed for future transaction verification
i.e., the sequence of the transactions). On-chain com-
munication happens in Steps 26 and 28 and consists of
                                                               5 Privacy and Security Analysis
an expected total of n(1 + λ) × d messages for all partic-
ipants. Here λ is the average amount of noisy addresses        5.1 Privacy Analysis
per bucket. Concretely messages are transactions with
one output address and their size depends on the in-           We now rigorously prove that the leakage in our mix-
put set size and specific instantiation (discussed in Ap-      ing protocol Π~sT,n,d,c
                                                                                 ,~v ,~
                                                                                      r
                                                                                        is (, δ)-indistinguishable. We build
pendix B). As mentioned earlier, on-chain communica-           up to this proof via a series of lemmas that first an-
tion requires anonymous communication channels (i.e.           alyze the negative binomial mechanism generally and
Tor or Atom without the need of a bulletin board).             then apply this analysis toward the noise distribution
                                                               within Π~sT,n,d,c
                                                                          ,~
                                                                           v ,~
                                                                              r
                                                                                 .
Handling Address Overflow. Having many ad-
dresses in a parent bucket (either through malicious           Lemma 5.1. Let f1 : Π → N be a function with `1 -
behavior or excessive noisy addresses from honest sam-         sensitivity 1. Let M1 (π) be the mechanism that output
pling) forces the child bucket/node to create anonymous        the sum of f1 (π) and some noisy value sampled from
transactions of larger sizes, and increases the overall        NB(r, p) with r ≥ 1. Then M1 is (, δ)-indistinguishable
computation. We formally describe such a scenario in           for  ≥ ln(1/p)                              
                                                                                  k δ = I1−p (r, k + 1) − e I1−p (r, k),
                                                                                 and
Section 6, and provide a formal bound. As a counter-
                                                                           j
                                                                             p(r−1)
                                                               where k =      e −p   .
measure, the participants can set an upper bound on
the number of total address allowed in one bucket, and         We defer the proof of this lemma to Appendix C.
agree to abort the protocol (or simply halt the trans-
actions that connect to this bucket) when any bucket           Lemma 5.2. Let f2 : Π → N2 be a function such that
exceeds this bound. This prevents inefficient executions,      for any pair of neighboring inputs π and π 0 (Defini-
but could lead to denial of service attacks where an ad-       tion 2.2), with f2 (π) = (x1 , x2 ), f2 (π 0 ) equals to either
versary maliciously adds noise that exceed the bound.          (x1 + 1, x2 − 1) or (x1 − 1, x2 + 1). Let M2 be a mech-
In Section 7 we explain how to avoid such attacks.             anism that perturbs the output of f2 by independently
User Aborts. Our discussion accounts for both hon-             adding NB(r, p) noise to each element. Then, M2 is
est user aborts and malicious aborts during the protocol       (2, δ)-indistinguishable with , δ defined as in Lemma
execution. As opposed to previous work on cryptocur-           5.1.
rency mixing [53, 54], our protocol makes no attempt to
detect and handle user aborts. This is mainly because          We defer the proof of this lemma to Appendix C.
an aborted user does not hinder the normal transaction             Next, we consider the (randomized) leakage func-
flow for the rest of the users. In terms of privacy loss,      tion L : Π → Nn×(d−1) as a map from an input/output
we assume the worst case scenario where all aborts are         permutation to a matrix representing the intermediate
caused by t malicious parties. Aborts can happen in any        bucket weights in the network. We informally state that
phase of the protocol, and without loss of generality we       L captures the whole view of the adversary (and leave
assume that in the end all malicious parties publicly          the formal argument to Section 5.2) from our mixing
reveal all their real and noisy addresses. The real ad-        protocol, so it suffices to prove L is indistinguishable to
(, δ)-Indistinguishable Mixing for Cryptocurrencies   61

show that our mixing protocol is an indistinguishable            5.2 Security Analysis
mixing mechanism. Note that in Fig. 3, the set of out-
put weights are given. We first argue that the output            UC Security. Our protocol satisfies the anonymous
                                                                                        T
                                                                 mixing functionality FAnonMix defined in Section 3.1. In
weight itself does not leak any information regarding
the permutation chosen, i.e., the output buckets weights         Appendix D, we describe a simulator that internally em-
are independent of the permutation (which is why we              ulates an execution of the real protocol with black-box
do not place any noisy addresses in the output layer).           access to the real adversary.
To see this, consider neighboring permutations, π1 and
π2 , that have swapped targets in the source/target pairs        Theorem 5.5. Given the probabilistic, polynomial time
(s1 , t1 ), (s2 , t2 ). For any fixed sequence of weights w
                                                          ~ in   function L defined in the previous section, the proto-
                                                                 col described in Figure 4 UC-realizes FAnonMix T       in the
the output layer, its probability of occurrence is identi-
cal under both permutations, as the probability of swap-         (GAnonTrans , Gbb , Gclock )-hybrid world with static corrup-
ping the locations of t1 and t2 such that they are con-          tion, with L leakage and (κ, ¯, δ̄)-privacy for ¯ and δ̄ as
sistent with π2 is identical to the probability of placing       specified in Theorem 5.4.
them in the locations consistent π1 . As a result, our
                                                                 We defer the proof of this theorem to Appendix D.
analysis of Lemma 5.3 and Theorem 5.4 are conditioned
on any fixed sequence of weights in the output layer.            Game-Based Security.
      We begin with the following simple observation.
                                                                 Theorem 5.6. Protocol Π~sT,n,d,c
                                                                                               ,~
                                                                                                v ,~
                                                                                                   r
                                                                                                      is a ( 2¯ + δ̄, t)-GB-
Lemma 5.3 (Swap Sensitivity). For any fixed se-                  indistinguishable mixing protocol for any PPT adver-
quence of output weights w,
                          ~ in each layer of the inter-          sary, under the condition that the noise magnitude in
mediate buckets, a swap from permutation π to a neigh-           the mixing protocol is multiplied by a factor of n/(n − t),
boring permutation π 0 can result in one of the following        where t is the number of corruption parties.
two cases: (a) the number of real addresses in two buck-
                                                                 The proof is in Appendix E. It relies on the post-
ets each increase by one and the number of addresses
                                                                 processing property (Lemma 2.4).
in another two buckets each decrease by one, or (b) the
number of real addresses in each of this layer’s buckets
remain unchanged.
                                                                 6 Complexity of Our Construction
Proof. This is an immediate consequence of the fact
that the sender and receiver addresses in π or π 0 imply a
                                                                 6.1 One Layer Butterfly Network
unique path and one weight to each bucket it passes.
Finally, the following theorem holds for any output layer        In this subsection, we discuss a practical setup of our
                                                                                    √
bucket weights.                                                  protocol, using a n-ary Butterfly network. In the rest
                                                                 of this section, we also simply refer to this as the “one-
Theorem 5.4. For any fixed sequence of output weights            layer case,” as there is only one intermediate layer.
~ the leakage function L : Π → Nn×(d−1) is (¯
w,                                                 , δ̄)-       Asymptotic Analysis. Under the assumption that the
indistinguishable, where ¯ = 4(d − 1), δ̄ = 2(d − 1)δ,         computational cost of a single anonymous transaction is
and , δ are defined as in Lemma 5.1.                            linear in the size of the input set, we can calculate the
                                                                 overall computational and communication costs of the
Proof. According to Lemma 5.3, there are at most                 protocol by counting the number of edges connecting
2(d − 1) pairs of buckets that increase and decrease by          addresses in the butterfly network. We stress that for
one across all d − 1 intermediate layers. As our protocol        any address, there is an edge between it and every ad-
essentially applies noise sampled from negative binomial         dress in its parent buckets. In other words, the in-degree
distribution on all buckets, it can be decomposed into           of an address is equal to the number of addresses in its
2(d − 1) parallel run of M2 on each pair of buckets.             parent buckets, not the number of its parent buckets.
Each run of M2 is (2, δ)-indistinguishable, according
to Lemma 5.2. Therefore, composition (Theorem 2.5)               Theorem 6.1. The expected              number of edges for
yields the desired result.                                       one layer butterfly network           is O(n1.5 ). Addition-
                                                                 ally, the number of edges             is upper bounded by
                                                                 O(n1.5 log n log log n) except for    negligible probability.
You can also read