Blockchain Transaction Processing
Suyash Gupta, Mohammad Sadoghi

arXiv:2107.11592v2 [cs.DB] 4 Aug 2021

Synonyms

• Blockchain Data Management
• Blockchain Consensus
• Cryptocurrency

Definitions

A blockchain is an append-only linked-list of blocks, which is maintained at each participating node. Each block records a set of transactions and their associated metadata. Blockchain transactions act on the identical ledger data stored at each node. Blockchain was first perceived by Satoshi Nakamoto (Satoshi 2008) as a peer-to-peer digital-commodity (also known as crypto-currency) exchange system. Blockchains received traction due to their inherent property of immutability: once a block is accepted, it cannot be reverted.

Overview

In 2008, Satoshi Nakamoto (Satoshi 2008) introduced the design of an unanticipated technology that revolutionized the research across the distributed systems community. Nakamoto presented the design of a peer-to-peer digital-commodity exchange system, which although employed by several participants, prevents the use of a centralized design. Nakamoto envisioned a system where the participants exchange commodities among themselves in a democratic, decentralized and transparent manner while upholding their right to privacy. Nakamoto visualized this digital-commodity as a monetary token that could be used by participants to provide or receive services. This led to the birth of Bitcoin, a cryptocurrency, and the introduction of a new design paradigm: Blockchain.

A blockchain in its simplest form is an append-only linked-list of blocks.
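The append-only linked-list view can be made concrete with a small sketch. The Python below is illustrative only: the `Block` and `Chain` names, the SHA-256 digest, the JSON encoding, and the all-zero hash carried by the initial block are assumptions chosen for the example, not details of any particular blockchain. It shows how storing the digest of the predecessor in each block makes tampering with an accepted block detectable.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical block: a position, a set of transactions, and a link to its predecessor."""
    index: int
    transactions: tuple
    prev_hash: str

    def digest(self) -> str:
        # Hash a canonical encoding of the block contents (encoding is an assumption).
        payload = json.dumps(
            {"index": self.index, "txns": list(self.transactions), "prev": self.prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


class Chain:
    """Append-only list of blocks, each linked to the previous one by its digest."""

    def __init__(self) -> None:
        # Initial block with no transactions and a placeholder predecessor hash.
        self.blocks = [Block(0, (), "0" * 64)]

    def append(self, transactions) -> Block:
        prev = self.blocks[-1]
        block = Block(prev.index + 1, tuple(transactions), prev.digest())
        self.blocks.append(block)
        return block

    def is_valid(self) -> bool:
        # Every block must carry the digest of the block directly before it.
        return all(
            cur.prev_hash == prev.digest()
            for prev, cur in zip(self.blocks, self.blocks[1:])
        )
```

Appending two blocks and then swapping the first of them for a block with different transactions makes `is_valid()` return `False`, because the second block still carries the digest of the original.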
Fig. 1: Basic Blockchain Representations

Each block in this chain is linked to the previous block in the chain (Gupta et al 2019a, 2020a). Blockchains are often termed immutable, as modifying an existing block requires modifying all the blocks that follow it in the chain. Each block includes a set of transactions and the associated meta-data. Figure 1 presents a schematic representation of a blockchain. Blockchain systems guarantee decentralization as the full copy of the chain is maintained by several participants [footnote 1]. Moreover, a block is only accepted into the chain after all the participants have reached consensus on the order and contents of the block. Specifically, admittance of a block to the chain implies that the transactions in the block have been executed and verified. Hence, blockchain helps in achieving key properties such as democracy and transparency.

Footnote 1: In sharded blockchain systems, no shard may have a complete copy, but data is still securely replicated.

A blockchain system can be described as a collection of layers. At the application layer, there are clients, which send their transactions to a set of servers to process. The communication among the clients and servers takes place at the networking layer. Servers participate at the ordering layer to assign a unique order to each incoming client transaction in a Byzantine Fault-Tolerant (henceforth referred to as BFT) manner. Following a successful ordering, the client transaction is processed at the execution layer and persisted in the immutable ledger at the storage layer. Clients and servers also employ the necessary cryptographic constructs to securely exchange messages with each other.

The preceding discussion allows us to summarize that a blockchain system aims at providing safe and resilient storage for transactions. In the succeeding sections, we will discuss these concepts in detail and will illustrate the mechanisms pertaining to blockchain transaction processing. We will also study key principles required to order and validate these client transactions and provide an analysis of some existing blockchain applications.

Key Research Findings

Each blockchain system can be visualized as a secure representation of a traditional database system (Nawab 2018). Similar to a database system, each blockchain application also receives transactions from multiple clients. In its vanilla form, a blockchain transaction is a collection of read or write operations. Clients issue these transactions to the servers for processing and exchange of digital-commodities.

In general, each server in a blockchain application stores a full copy of the chain. Hence, without any loss of generality, we can claim that servers of a blockchain system are replicas of each other. Specifically, a blockchain application lays down a replicated design where each replica participates in ordering and executing the incoming client transaction.
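The claim that servers are replicas of each other rests on every replica executing the same transactions in the same order. A minimal sketch, assuming a simple key-value state and a hypothetical `Replica` class (neither is prescribed by the text), treats a transaction as a list of read and write operations:

```python
from typing import Dict, List, Optional, Tuple

# An operation is ("read", key, None) or ("write", key, value); this shape is an assumption.
Operation = Tuple[str, str, Optional[int]]
Transaction = List[Operation]


class Replica:
    """Hypothetical replica holding a key-value state and a ledger of executed transactions."""

    def __init__(self) -> None:
        self.state: Dict[str, Optional[int]] = {}
        self.ledger: List[Transaction] = []

    def execute(self, txn: Transaction) -> List[Optional[int]]:
        """Apply the operations of one ordered transaction and return the values it read."""
        results: List[Optional[int]] = []
        for kind, key, value in txn:
            if kind == "write":
                self.state[key] = value
            elif kind == "read":
                results.append(self.state.get(key))
            else:
                raise ValueError(f"unknown operation: {kind}")
        self.ledger.append(txn)
        return results


def replay(order: List[Transaction]) -> Replica:
    """Execute transactions in the given order on a fresh replica."""
    replica = Replica()
    for txn in order:
        replica.execute(txn)
    return replica
```

Two replicas that replay the same order end in the same state, while a replica that replays a different order can diverge, which is why the ordering layer has to assign one order before execution.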
Prior works have shown that it is possible to make a replicated system handle failures (Lamport 1998; Steen and Tanenbaum 2017). In any replicated system, replicas participate in a fault-tolerant consensus protocol to decide the order in which to execute a client transaction. Blockchain applications also adhere to this philosophy. They employ a BFT consensus protocol to achieve consensus under byzantine failures. But why do BFT protocols need to handle byzantine failures? As a blockchain system promotes democracy, it permits display of adversarial behavior by malicious replicas during consensus. To tackle such malicious activities, each blockchain application relies on the design and properties dictated by a BFT consensus protocol.

Blockchain Topologies

A key parameter that shapes the design of a blockchain system is its underlying application. On the basis of the permissions available to a participating node, a blockchain application can be categorized as permissioned, permissionless or hybrid (Pilkington 2015; Cachin and Vukolic 2017). Although the blockchain community agrees on the characteristics of a permissionless or public blockchain infrastructure, there is a lack of concise definitions to explain other models.

On the basis of topology, we categorize blockchain systems under four heads: public, private, permissioned and hybrid. Figure 2 presents a pictorial representation of the different categories. In these figures, nodes that lack any connections are disallowed from participating in the management of the blockchain. Further, we use circles to demarcate different zones of operation; certain nodes are allowed to lead the consensus (or create the next block) while some nodes are allowed to participate in the consensus protocol.

Public Blockchain systems, such as Bitcoin (Satoshi 2008) and Ethereum (Wood 2015), allow any node to participate in the consensus process and propose the next valid block for the chain. Hence, a public or permissionless blockchain system upholds its democratic nature by providing each node with equal probability [footnote 2] of creating the next block to be added to the chain.

Footnote 2: The equal probability of creating a block is only guaranteed when all the nodes have exactly the same amount of resources, and each node is working independently.

Private Blockchain systems run at the other extreme end of the spectrum. These blockchain systems permit only a specific set of nodes to be part of the consensus protocol and restrict the creation of the next block to an even smaller subset of nodes. Private blockchain designs are attractive to large multi-sector companies and banks, which may choose to allow some of their customers to participate in the consensus protocol, while restricting creation of the next block to their employees.

Hybrid Blockchain systems attain a middle ground between the two extremes. Although these systems allow any node to be part of the consensus protocol, they restrict the task of proposing and creating the next block to a designated subset of replicas. For instance, Ripple (Schwartz et al 2014), a cryptocurrency, supports a variant of the hybrid model. In Ripple, only some public institutions have the permissions
to select the transactions that will be part of the next block.

Fig. 2: Topologies for Blockchain Systems. (a) Public Blockchain. (b) Hybrid Blockchain. (c) Permissioned Blockchain. (d) Private Blockchain.

Amidst all these topologies, permissioned blockchain systems have successfully created a niche space for their design (Androulaki et al 2018; Gupta et al 2020c). Permissioned blockchain applications allow any node to participate in the consensus protocol but require the identities of all participants to be known a priori. Although participants lose their privacy, permissioned blockchain applications provide each participant an equal opportunity to propose the next block. Notice that permissioned blockchain applications place no other special restrictions on the behavior of a participant. Hyperledger Fabric (Androulaki et al 2018), Libra coin (Libra 2019) and ResilientDB (Gupta et al 2020c) are some of the state-of-the-art permissioned blockchain applications and fabrics.

Blockchain Transactional Flow

The initial block of any blockchain is termed the genesis block (Decker and Wattenhofer 2013). The genesis block is a special block that is numbered zero and is hard-coded in every blockchain application. Every other block links to some previously existing block. Hence, a blockchain grows by appending new blocks to the existing chain.

A transaction in a blockchain system is identical to any distributed or OLTP transaction (TPP Council 2010) that acts
on some data. Traditional blockchain applications (such as Bitcoin) consist of transactions that represent an exchange of money between two entities (or users). Each valid transaction is recorded in a block, which can contain multiple transactions for efficiency. Immutability is achieved by leveraging strong cryptographic properties such as hashing (Katz and Lindell 2007).

Figure 3 illustrates the three main phases required by any blockchain application to create a new block. The client transmits a transactional request to one of the participants. This participating node multicasts the client request to all other nodes. We term this phase Transaction Dissemination. Once all the nodes have a copy of the client request, they initiate a consensus protocol. The choice of the underlying consensus protocol affects the time complexity and resource consumption. The winner of the consensus phase proposes the next block and transmits it to all other nodes. This transmission process is equivalent to adding an entry (block) to the global distributed ledger.

Blockchain Consensus

At the core of any blockchain application is a BFT consensus protocol. Given a client transaction, the aim of this consensus protocol is to ensure that all the non-faulty replicas assign the same order to this transaction. Depending on the underlying topology, we can broadly categorize consensus protocols into two categories: permissionless consensus protocols and permissioned consensus protocols.

Achieving fault-tolerant distributed consensus is an age-old problem. Commit protocols such as Two-Phase Commit (Gray 1978), Three-Phase Commit (Skeen 1982) and EasyCommit (Gupta and Sadoghi 2018, 2020) help in reaching agreement among the participants in partitioned distributed databases (Qadah and Sadoghi 2018; Qadah et al 2020; Sadoghi and Blanas 2019). However, commit protocols can only handle node failures and are unsafe under message delay or loss.

Paxos (Lamport 1998) and Viewstamped Replication (Oki and Liskov 1988) allow a distributed system of replicas to achieve consensus in the presence of crash-faults. In a system of n replicas, a system employing Paxos for consensus can handle up to f failures where n ≥ 2f + 1. Notice that these f failures need not be simple replica crashes but can also take the form of message losses and delays. However, crash-fault tolerant protocols such as Paxos and Viewstamped Replication cannot handle any malicious behavior.

A byzantine-fault tolerant protocol aims at reaching consensus in a system of n replicas where at most f replicas can act as byzantine and n ≥ 3f + 1. Traditional BFT protocols promote a primary-backup model where one replica is designated as the primary and other replicas act as backups. It is the task of the primary to initiate consensus among all the backups. Notice that all the protocols discussed above, such as Two-Phase Commit, Paxos and so on, follow the primary-backup model. The primary-backup model is preferred chiefly because of its simplicity and its ability to blame the primary for an unsuccessful consensus.
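The two bounds quoted above, n ≥ 2f + 1 for crash-fault tolerant protocols and n ≥ 3f + 1 for byzantine-fault tolerant ones, can be turned into a quick calculation. The helper names below are hypothetical, and the sketch only restates the inequalities; it is not part of any protocol.

```python
def max_crash_faults(n: int) -> int:
    """Largest f satisfying n >= 2f + 1 (crash-fault tolerant bound)."""
    if n < 1:
        raise ValueError("a system needs at least one replica")
    return (n - 1) // 2


def max_byzantine_faults(n: int) -> int:
    """Largest f satisfying n >= 3f + 1 (byzantine-fault tolerant bound)."""
    if n < 1:
        raise ValueError("a system needs at least one replica")
    return (n - 1) // 3


def min_replicas(f: int, byzantine: bool) -> int:
    """Smallest n that tolerates f faults under the chosen failure model."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1 if byzantine else 2 * f + 1
```

For example, with n = 7 replicas a crash-fault tolerant protocol can tolerate 3 failures, while a BFT protocol can tolerate 2; tolerating a single byzantine replica requires at least 4 replicas.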
Fig. 3: Blockchain Flow: Three main phases in any blockchain application are represented. (a) Client sends a transaction to one of the servers, which disseminates it to all the other servers. (b) Servers run the underlying consensus protocol to determine the block creator. (c) New block is created and transmitted to each node, which also implies adding it to the global ledger.

Recent blockchain applications present several new protocols for achieving consensus: Proof-of-Work (Jakobsson and Juels 1999; Satoshi 2008), Proof-of-Stake (King and Nadal 2012) and Proof-of-Authority (Parity Technologies 2018). Prior works have shown that these consensus protocols provide guarantees similar to those of traditional BFT protocols (Garay et al 2015). Hence, in the rest of this section, we illustrate some of the state-of-the-art blockchain protocols for both permissioned and permissionless systems.

Permissioned Consensus

A decade prior to the inception of the first blockchain application, the problem of achieving fault-tolerant distributed consensus had already excited practitioners and researchers (Lamport 1998; Oki and Liskov 1988; Castro and Liskov 1999). The distributed systems research community agreed that a byzantine-fault tolerant system can only be considered correct if it is both safe and live. A replicated system is called safe if all its replicas are consistent, that is, have the same state. A replicated system is termed live if its replicas are able to make progress, that is, process incoming client requests.

A majority of existing BFT protocols guarantee safety in an asynchronous environment, that is, messages can get lost, delayed or duplicated, and up to f replicas may act byzantine. Further, any BFT protocol employs cryptographic constructs to prevent malicious replicas from impersonating non-faulty replicas. As clients send their transactions to replicas, each client uses digital signatures to sign its messages (Menezes et al 1996; Katz and Lindell 2007). For all other messages, depending on the algorithmic steps, the system can employ either asymmetric-key digital signatures or less-expensive symmetric-key message authentication codes (Katz and Lindell 2007). Hence, we assume authenticated communication: malicious replicas can impersonate each other, but no replica can impersonate a non-faulty replica. Further, replicas will accept only those
messages which are well-formed, that is, have valid message authentication codes or digital signatures (as applicable).

PBFT. Practical Byzantine Fault Tolerance (Castro and Liskov 1999) is often considered the first protocol to present a practical design for achieving byzantine fault-tolerance in a distributed system. PBFT follows the primary-backup model where the primary replica initiates the consensus among all the replicas. It is the responsibility of the primary to ensure that all the backup replicas successfully order every incoming client transaction; otherwise it risks replacement. If the primary is non-malicious and the network is reliable, PBFT guarantees consensus in three phases.

The PBFT protocol starts when a client C wants a transaction to be executed and sends a request m to the primary replica P. The primary P checks if the client signature is valid and, if this is the case, it creates a PRE-PREPARE message and sends that message to all the backups. This PRE-PREPARE message includes a sequence number (an integer) and a hash of the client request. The sequence number k states the order in which to execute the transaction, while the hash acts as a digest, which can be used in future communications as an alias for the client request [footnote 3].

Footnote 3: Client requests are often of the order of several kilobytes, and sending a hash instead optimizes the communication.

When a replica R receives a PRE-PREPARE message from the primary P, it performs the following checks: (i) verifies the client signature on m, (ii) checks if P is the primary, and (iii) ensures the sequence number k has not already been used. If the PRE-PREPARE message passes all the checks, R agrees to support the primary's order for this request and sends a PREPARE message to all the replicas. When a replica R receives PREPARE messages from 2f replicas in support of the request m sent by P, then R marks the request as prepared. This information gives R an assurance that a majority of non-faulty replicas are also agreeing to order this request at sequence k. Next, R acknowledges the prepared request by sending a COMMIT message to all the replicas. When a replica R receives COMMIT messages from 2f + 1 replicas, then R achieves a unique guarantee on the order of m, that is, a majority of non-faulty replicas have also prepared m. This allows replica R to go ahead and execute the request m as the k-th request. Finally, R sends the result of executing m as a response to the client C.

The client C needs f + 1 matching responses from distinct replicas to mark its request m as complete. It is possible that the client may not receive a sufficient number of matching responses. To handle such cases, the client starts a timer prior to sending its request. Specifically, each client waits on a timer for receiving f + 1 identical responses. If the client times out while waiting for f + 1 responses, then it forwards its request m to all the replicas. When a backup replica R receives a client request m, it forwards that request to the primary P and starts its timer. If P fails to send a PRE-PREPARE message corresponding to m, then R concludes that P is byzantine and initiates primary replacement. Existing literature terms this primary replacement process view-change because each primary represents a view of the system. The view-change protocol only starts when at least f + 1 replicas are ready to replace the primary. This condition is
necessary as up to f replicas can be byzantine and may even request replacement of a non-faulty primary. Hence, when at least f + 1 replicas request replacement, the remaining replicas assume that there is at least one non-faulty replica which has been affected.

For a successful view-change to take place, a new primary has to be selected. PBFT follows a simple principle: if the replica with index i is the current primary, then the replica with index j will be the next primary, where j = (i + 1) mod n. But how does a replica conclude that it is time for it to act as the new primary? When any replica R receives VIEW-CHANGE messages from 2f + 1 distinct replicas that want to elect it as the primary, then it initiates the process of switching to the next view. Notice that the process of switching to the next view requires ensuring that all the replicas have a common state. Thus, the new primary also needs to provide this information as part of the NEW-VIEW message.

Zyzzyva. It is evident from PBFT's design that it requires three phases of communication, of which two necessitate quadratic communication complexity. Hence, there is a need for optimized protocols, which can achieve the same goals with much lower communication overheads. Zyzzyva (Kotla et al 2007) presents a twin-path protocol that achieves consensus in a single linear phase if there are no failures. All the replicas in Zyzzyva start in the fast-path and switch to the slow-path under failures. Note that a recent work has illustrated that Zyzzyva is unsafe under failures (Abraham et al 2017).

In Zyzzyva, when a non-primary replica R receives a PRE-PREPARE message from the primary P, it assumes that the primary is non-faulty and agrees to execute this request. Such an execution is termed speculative as the replica R is unaware of the state at other replicas. Specifically, a byzantine primary could have equivocated and sent different replicas distinct client requests. Once the replica R executes the request, it sends the reply to client C. The client C marks the request complete if it receives matching responses from at least 3f + 1 replicas.

A keen reader can easily notice that the onus is on the client to ensure the system is safe. Further, when n = 3f + 1, the client has to wait for responses from all the replicas. Due to these restrictions, Zyzzyva's fast-path works only if there are no failures. In Zyzzyva, the client waits on a timer while expecting 3f + 1 responses. If the client times out prior to receiving these responses, then it initiates the slow-path. In the slow-path, the client has to summarize the state it received from different replicas and needs to decide whether primary replacement needs to be initiated or a simple recovery protocol is sufficient to ensure the system remains live. Clearly, the slow-path is no longer linear and requires multiple phases. Moreover, if the client is malicious, then the replicas could be momentarily unsafe until there is a good client. Another key challenge with twin-path protocols is finding the optimal timeout value. Prior works have shown that finding a timeout value can be hard and that Zyzzyva faces a severe reduction in throughput under failures (Clement et al 2009a,b; Gupta et al 2021a).

SBFT. The key aim behind the design of SBFT (Golan Gueta et al 2019) is to make a consensus protocol that can guarantee safe consensus with linear message complexity in periods of no
failures. In fact, like Zyzzyva, SBFT is also a twin-path protocol. SBFT employs threshold signatures to achieve linear communication complexity.

Threshold signatures are based on asymmetric cryptography. Specifically, each replica holds a distinct private key, which it can use to create a signature share. Next, one can produce a valid threshold signature given at least t such signature shares from distinct replicas (the exact value of t is dependent on the underlying consensus protocol).

At a closer look, it seems like SBFT requires more phases than PBFT. This occurs because SBFT linearizes each phase of PBFT through the use of threshold signatures. In SBFT, when a replica R receives a PRE-PREPARE message, it agrees to support the primary's sequence by generating a threshold share. The replica R sends this share to a specific replica designated as the collector. When a collector receives messages from at least 3f + 2c + 1 replicas, it generates a threshold signature and sends this signature to all the replicas. When a replica receives a threshold signature from the collector, it executes the request to generate a response, creates a threshold share on this response and sends these to a specific replica designated as the executor. The executor waits for f + 1 identical responses and combines them into a threshold signature. Next, the executor sends this signature to all the replicas and clients.

For SBFT's fast path to work as stated, either there should be no failures or at least 3f + 2c + 1 replicas should participate in consensus where up to c > 0 replicas can crash-fail (no byzantine failures). Moreover, the primary can act as both collector and executor but SBFT suggests using distinct replicas in the fast path. If the collector times out waiting for threshold shares from 3f + c + 1 replicas, it switches to the slow path, which requires two additional linear phases to complete consensus.

HotStuff. In any primary-backup BFT protocol, if the primary acts maliciously, then the protocol employs the accompanying view-change algorithm to detect and replace the malicious primary. This view-change algorithm leads to a momentary disruption in system throughput until the resumption of service.

HotStuff (Yin et al 2019) proposes eliminating the dependence of a BFT consensus protocol on one primary by replacing the primary at the end of every consensus. Although this rotating leader design escapes the cost of a view-change protocol, it enforces an implicit sequential paradigm. Each primary needs to wait for its turn before it can propose a new request.

In HotStuff, in round i, the replica with identifier i mod n acts as the primary and proposes a request to all the replicas. Each replica, on receiving this request, creates a threshold share and sends it to the replica R with identifier (i + 1) mod n. If R receives threshold shares from 2f + 1 replicas, then it combines them into a threshold signature and initiates the consensus for round i + 1 by broadcasting its proposal along with the computed threshold signature. Notice that replicas have not yet executed the request or replied to the client. HotStuff's aim is to linearize the consensus proposed by the PBFT protocol, which it does by splitting each phase of PBFT into two using threshold signatures. To reduce the communication, it chains the phases. Hence, a replica executes the request for the i-th round once it receives
a threshold signature from the primary of the (i + 3)-th round. Evidently, chaining helps HotStuff to some extent but it does not eliminate its sequential nature. This sequential nature forces HotStuff to lose out on the opportunity to process messages out-of-order.

PoE. The Proof-of-Execution (henceforth referred to as PoE) consensus protocol aims at achieving consensus in three linear phases without relying on any twin-path model (Gupta et al 2021a). Further, PoE recognizes that no one size fits all systems (Singh et al 2008). Hence, its design is independent of the choice of underlying cryptographic signature scheme. This implies that the PoE protocol can employ both symmetric and asymmetric cryptographic signature schemes depending on the application environment.

The design of PoE is built on three key insights. First, PoE prevents use of any twin-path paradigm, as switching from the fast-path to the slow-path requires dependence on timeouts, which degrades system performance. Second, PoE allows replicas to speculatively execute the requests but facilitates rollbacks in case of inconsistencies. Finally, PoE allows out-of-order processing, which eliminates any bottlenecks associated with sequential consensus protocols.

For the sake of brevity, we will describe PoE built on top of threshold signatures. In PoE, the client C initiates execution by sending its request m to the primary P. To initiate replication and execution of m as the k-th transaction, the primary proposes m to all replicas by broadcasting a PROPOSE message.

After a replica R receives a PROPOSE message from P, it checks whether at least 2f other replicas also received the same proposal from P. To perform this check, each replica agrees to support the first k-th proposal it receives from the primary by sending to the primary a SUPPORT message that includes its unique threshold share. The primary P waits for 2f + 1 threshold shares and, on receiving such shares, it combines them into a threshold signature and broadcasts it as a CERTIFY message. When a replica R receives the CERTIFY message, it view-commits to m as the k-th transaction in view v. After R view-commits to m, R schedules speculative execution of m. Consequently, m will be executed by R after all preceding transactions are executed. After execution, R informs the client of the order of execution and of any execution result. A client considers its transaction successfully executed after it receives identical response messages from 2f + 1 distinct replicas.

Aardvark. The design philosophy behind Aardvark is distinct in comparison to existing BFT protocols (Clement et al 2009b). It aims at building a robust BFT protocol that can continue performing under failures. Hence, in the failure-free cases, Aardvark attains lower throughput than a majority of the existing BFT protocols.

In Aardvark, prior to sending its request to the primary, the client signs the request using both digital signatures and message authentication codes. This prevents malicious clients from performing a denial-of-service attack, as it is expensive for a client to sign each message twice. Aardvark also employs a point-to-point network rather than a multicast network for the exchange of messages among clients and replicas. The key intuition behind such a choice is to disallow a faulty client or replica from blocking the complete network.
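The protocols described so far share a client-side completion rule that differs only in its threshold: f + 1 matching responses in PBFT, 3f + 1 on Zyzzyva's fast-path, and 2f + 1 in PoE. A minimal sketch of that rule follows; the function name and the `(replica_id, result)` response shape are assumptions made for illustration, and signature verification of the responses is omitted.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Optional, Tuple


def completed_result(
    responses: Iterable[Tuple[int, Hashable]], threshold: int
) -> Optional[Hashable]:
    """Return a result reported by at least `threshold` distinct replicas, else None."""
    voters = defaultdict(set)  # result -> identifiers of replicas that reported it
    for replica_id, result in responses:
        # A set ignores repeated responses from the same replica.
        voters[result].add(replica_id)
        if len(voters[result]) >= threshold:
            return result
    return None
```

With f = 1, a client that has heard "ok" from two distinct replicas can complete under the f + 1 rule but would still be waiting under the 2f + 1 rule.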
Aardvark also periodically changes the primary replica. Each replica tracks the throughput of the current primary and suggests replacing the primary when there is a decrease in its throughput. To perform such tracking, each replica sets a timer and measures the rate of the primary's responses.

RBFT. The key intuition behind the design of RBFT is to facilitate detection of clever malicious primaries (Aublin et al 2013). RBFT extends Aardvark and aims to detect those malicious primaries which cannot be detected by the simple timers suggested by Aardvark.

In Aardvark, a clever primary can avoid detection by delaying messages just slightly below the timeout threshold. Such a primary can throttle the system throughput without risking eviction. To tackle this challenge, RBFT insists on running f + 1 independent instances of the Aardvark protocol on each replica. One of these instances is designated as the master while the other instances act as backups. Irrespective of the designation of an instance, all the instances order all the requests. However, only the master instance executes the requests.

The key task of the backup instances is to monitor the performance of the master instance. If any backup instance observes a degradation of the system throughput at the master, it broadcasts a message to elect a new primary. Further, to guarantee that at least one of the f + 1 instances is led by a non-faulty replica, RBFT requires each instance to be led by a distinct replica. In comparison to both PBFT and Aardvark, RBFT requires an additional phase, which is used to propagate the client requests across all the replicas.

RCC. Although RBFT successfully utilizes redundancy to detect clever malicious primaries, it also wastes excessive bandwidth by requiring all the instances to order the same set of requests. The Resilient Concurrent Consensus (henceforth referred to as RCC) paradigm resolves this issue by parallelizing the consensus (Gupta et al 2019b, 2021b). Specifically, RCC runs multiple instances of a primary-backup protocol at each replica.

The key challenge with the design of primary-backup protocols is their reliance on the primary. This dependence can severely affect the throughput and scalability of these protocols. The primary replica not only receives all client requests but is also responsible for ensuring consensus is reached on the order for these requests among all other replicas. If the primary fails to ensure consensus, then all remaining replicas need to replace this primary. This replacement process is necessary as, without it, non-faulty replicas may never converge. Unfortunately, primary replacement is not cheap, as it requires pausing consensus on all outstanding requests until the primary is replaced.

RCC aims at making BFT consensus primary-agnostic. To achieve such a property, RCC advocates running z parallel instances at each replica. Further, RCC ensures that each instance is managed by a distinct replica. Using parallelization, RCC ensures that the non-faulty replicas are always accepting and ordering client requests, independent of any malicious behavior or attack.

We now present the design of the RCC paradigm that parallelizes the seminal PBFT consensus protocol. For the sake of explanation, we assume RCC works in rounds. Each round of RCC includes three stages: parallel consensus, unification, and execution. The notion of a
round helps in generating a common order and recovering from instance failures, but it does not prevent individual primaries from working independently.

Prior to any round, RCC requires each replica to prepare to run z instances of the PBFT protocol in parallel. A round r begins when the primary of each instance proposes a client request. Firstly, in the parallel consensus stage, each instance runs PBFT on its client request. Secondly, in the unification stage, the replica waits for all its z instances to complete replication (reach consensus on their respective requests). If every instance successfully replicates a request, then a common order for execution of these requests is determined. If one or more instances are unable to replicate requests, then the primaries for those instances must be faulty and recovery is initiated. Finally, in the execution stage, each replica executes all the client requests in the common order.

GeoBFT. Existing BFT protocols do not distinguish between local and global communication, a distinction that is necessary to enable geo-scale deployment of a blockchain system. To resolve this challenge, the Geo-Scale Byzantine Fault-Tolerant consensus protocol (henceforth referred to as GeoBFT) uses topological information to group all replicas in a single region into a single cluster (Gupta et al 2020b). Likewise, GeoBFT assigns each client to a single cluster. This clustering helps in attaining high throughput and scalability in geo-scale deployments. GeoBFT operates in rounds, and in each round, every cluster is able to propose a single client request for execution. Each round consists of three steps: local replication, global sharing, and ordering and execution, which we detail next.

At the start of each round, each cluster chooses a single transaction of a local client. Next, each cluster locally replicates its chosen transaction in a Byzantine fault-tolerant manner using PBFT. At the end of successful local replication, PBFT guarantees that each non-faulty replica can prove successful local replication via a commit certificate. Next, each cluster shares the locally-replicated transaction along with its commit certificate with all other clusters. To minimize inter-cluster communication, we use a novel optimistic global sharing protocol. Our optimistic global sharing protocol has a global phase in which clusters exchange locally-replicated transactions, followed by a local phase in which clusters distribute any received transactions locally among all local replicas. Finally, after receiving all transactions that are locally-replicated in other clusters, each replica in each cluster can deterministically order all these transactions and proceed with their execution. After execution, the replicas in each cluster inform only local clients of the outcome of the execution of their transactions (e.g., confirm execution or return any execution results).

Permissionless Consensus

Permissionless applications inspired by Nakamoto's Bitcoin (Satoshi 2008) advocate a public blockchain system where any replica can participate in the consensus. Hence, the identity of a participating replica can be protected. This design property requires the underlying consensus protocol used to order the transactions to expend the resources of a participant. Specifically, each participant needs to spend some of its resources if it wants to propose the next block. If such resource consumption is not enforced, then a malicious participant can create multiple pseudonymous identities and subvert the system, which is known as the Sybil attack (Douceur 2002).

Proof-of-Work. Bitcoin relies on the Proof-of-Work (henceforth referred to as PoW) protocol to achieve consensus among a set of replicas. The PoW protocol builds on top of a simple intuition: "What is mathematically hard to compute but easy to verify?" Hence, the PoW protocol requires the computation to be expensive, that is, it should deplete some resources of the prover.

In the PoW protocol, the participating nodes compete among themselves to propose the next block by solving a complex puzzle. Several computationally hard problems exist, such as Diophantine Equation, RSA Factorization, One-way Hash Functions, and so on. Among these hard problems, following Nakamoto's vision, the PoW protocol is associated with the computation of one-way hash functions, such as the computation of a 256-bit SHA3 value. When a node N successfully computes this hash value, it disseminates this solution to all other nodes for verification. Any node can verify this solution to check N's claim.

The main criticism of PoW's design is that it leads to excessive wastage of energy. Permissionless applications that employ PoW consensus have to set large targets to prevent Sybil attacks. Further, PoW's design facilitates unfair practices: the higher the computational capabilities of a node, the higher its chances of solving the complex puzzle. Such a design promotes pooling of resources, where several nodes work together to compute the hash. Moreover, a node has to be given incentives to participate in the PoW consensus. If the incentives are not sufficient, then nodes may decline to create the next block, which in turn can either stall the system or compromise its security.

Another issue with the PoW consensus protocol is that it can lead to tricky situations where it is hard to determine the next block in the chain. For instance, two nodes N1 and N2 may solve the complex puzzle at the same time. In such a case, it is possible that one half of the remaining participants receives the solution from N1 before N2 while the other half receives the solution from N2 before N1. To handle this scenario, some form of resolution mechanism is needed, which would waste the resources of either N1 or N2, as both of their blocks cannot be appended to the chain. Notice that any new block added to the chain includes the hash of the previous block.

Proof-of-Stake. In the PoW protocol, miners have to deplete their computational resources in order to earn the right to create the next block. Each miner who controls a fraction s of the total computational power has a probability nearly equal to s of creating the next block.

Proof-of-Stake (henceforth referred to as PoS) presents a principle that contrasts with the resource usage philosophy of PoW. In a blockchain system employing the PoS protocol, a replica possessing a higher stake than the other replicas gets a chance to create a new block (Bentov et al 2016). Specifically, the probability that a replica possessing a fraction s of the total stake in the system creates the next block is s. The key security rationale behind PoS is that replicas who have some stake involved
in the system are also well-suited to ensure its security.

PPCoin or PeerCoin (King and Nadal 2012) is often regarded as the first implementation of PoS. The key motivation behind PPCoin's design was to implement a crypto-currency that does not require participating replicas to spend their resources on performing large computations. Initial PoS-based designs were based on the notion of coinage. Specifically, a replica's ability to create the next block is determined by its value of coinage. Coinage is calculated on the basis of the number of days a replica has held some coins or stake. To prevent Sybil attacks, PoS-based systems require a node to spend its coinage if it wants to propose the next block.

Initial implementations of the PoS protocol lacked the fairness criterion. This is evident as the replica with the highest stake gets the chance to propose the next block. Although a high-stake replica loses its coinage once it creates the next block, it may create the subsequent block if its stake is much larger in value than that of the other replicas.

To resolve this issue, a chain-based variant of the PoS algorithm has been proposed. The chain-based PoS protocol employs a pseudo-random algorithm to select a validator, which then creates a new block and adds it to the existing chain of blocks. The frequency of selecting the validator is set to some pre-defined time interval. Another variant of the PoS algorithm follows BFT-style consensus. In this design, the replicas participate in a BFT protocol to select the next valid block. Here, validators are given the right to propose the next block at random. The key difference between these algorithms is the synchrony requirement; chain-based PoS algorithms are inherently synchronous, while BFT-style PoS is partially synchronous.

Another key challenge for PoS-based designs is an attack by rational stakeholders. A rational replica would always aim at maximizing its profit, an expected behavior in a democracy in correspondence with the Nash equilibrium (Bentov et al 2016). Rational replicas can affect the security of PoS, as in an attempt to maximize their gains, they may participate in multiple chains.

A rational miner could get blocks from distinct forks of the blockchain. To maximize its returns, a miner would attempt to propose the next block for each such fork. As miners do not lose any actual resources (like computational energy in PoW), they are free to propose blocks on different chains. This could lead to an ever-expanding divergent network.

Proof-of-Authority. A variation of the PoS algorithm to be employed in hybrid blockchain topologies is termed Proof-of-Authority (henceforth referred to as PoA) (Parity Technologies 2018). The key idea is to designate a set of nodes as the authorities or leaders. These authorities are entrusted with the task of creating new blocks and validating the transactions. PoA marks a block as part of the blockchain if it is signed by a majority of the authorized nodes. The incentive model in PoA highlights that it is in the interest of an authority node to maintain its reputation. In case an authority acts maliciously, it can lose its status and periodic incentives. Hence, PoA does not select nodes based on their claimed stakes.

Proof-of-Space. A consensus algorithm orthogonal to the design proposed by PoW is proof-of-space
or proof-of-capacity (henceforth referred to as PoC) (Ateniese et al 2014; Dziembowski et al 2015).

PoC expects nodes to provide a proof that they have sufficient "storage" to solve a computational problem. The PoC algorithm targets computational problems, such as hard-to-pebble graphs (Dziembowski et al 2015), that need a large amount of memory storage to solve. In the PoC algorithm, the verifier first expects a prover to commit to a labeling of the graph, and then it queries the prover for random locations in the committed graph. The key intuition behind this approach is that unless the prover has sufficient storage, it would not pass the verification. SpaceMint (Park et al 2015), a PoC-based cryptocurrency, claims that PoC-based approaches are more resource efficient in comparison to PoW as storage consumes less energy.

Blockchain Systems

We now briefly look at the design of some of the state-of-the-art blockchain applications and fabrics. The key aim of this section is to illustrate the different design practices adopted by existing blockchain systems.

Bitcoin (Satoshi 2008) is regarded as the first ever blockchain application. It is a cryptographically secure digital currency designed with the aim of disrupting the traditional institutionalized monetary exchange. Bitcoin acts as the token of transfer between two parties undergoing a monetary transaction. The underlying blockchain system is a network of nodes (also known as miners) that take a set of client transactions and validate them by demonstrating a proof-of-work, that is, by generating a block. The process of generating the next block is non-trivial and requires large computational resources. Hence, the miners are given incentives (such as Bitcoins) for dedicating their resources and generating the block. Each miner locally maintains an updated copy of the complete blockchain and the associated ledgers for every Bitcoin user.

To ensure that the Bitcoin system remains fair towards all the machines, the difficulty of the proof-of-work challenge is periodically adjusted. Prior works have illustrated that Bitcoin is vulnerable to the 51% attack, which can lead to double spending (Rosenfeld 2014). The intensity of such attacks increases when multiple forks of the longest chain are created. To avoid these attacks, Bitcoin developers suggest that clients wait for their block to be confirmed before they mark the Bitcoins as transferred. This wait ensures that the specific block is a little deep (nearly six blocks) in the longest chain (Rosenfeld 2014). Bitcoin critics also argue that its proof-of-work consumes huge energy⁴ and may not be a viable solution for the future.

⁴ As per some claims, one Bitcoin transaction consumes power equivalent to that required by 1.5 American homes per day.

Ethereum (Wood 2015) is a blockchain framework that permits users to create their own applications (smart-contracts) on top of the Ethereum Virtual Machine (EVM). Ethereum utilizes the notion of smart contracts to facilitate the development of new operations. It also supports a digital cryptocurrency, ether, which is used to incentivize the developers to create correct applications. One of the key advantages of Ethereum is that it
supports a Turing complete language to generate new applications on top of the EVM. At the time of writing this article, Ethereum employs a variant of the PoW protocol to achieve consensus among its miners. Ethereum makes its miners solve challenges that are not only computationally intensive, but also memory intensive. This design prevents the existence of miners who utilize specially designed hardware for compute-intensive applications.

In the future, the Ethereum Foundation aims to switch to a variant of the PoS protocol to reach consensus among its replicas. The modified protocol is referred to as Casper (Buterin and Griffith 2017). Casper introduces the notion of finality, that is, it ensures that one chain becomes permanent in time. It also introduces the notion of accountability, which penalizes any validator that attempts the nothing-at-stake attack. The penalty levied on such a validator is equivalent to negating all of its stake.

Parity (Parity Technologies 2018) is an application designed on top of Ethereum. It provides an interface for its users to interact with the Ethereum blockchain. Parity allows its blockchain community to use either Proof-of-Work or Proof-of-Authority to reach consensus in their applications. Hence, if some users select PoA consensus, then Parity provides mechanisms for setting up the authority nodes.

Ripple (Schwartz et al 2014) is considered the third largest cryptocurrency after Bitcoin and Ethereum in terms of market cap. It employs a consensus algorithm which is a simple variant of existing traditional BFT algorithms. Ripple requires the number of failures f to be bounded as follows: f ≤ (n − 1)/5 + 1, where n represents the total number of nodes. Ripple's consensus algorithm introduces the notion of a Unique Node List (UNL), which is a subset of the network. Each server communicates with the nodes in its UNL to reach consensus. The servers exchange the set of transactions they received from the clients and propose those transactions to their respective UNL for a vote. If a transaction receives 80% of the votes, it is marked permanent. Notice that if the generated UNL groups are cliques, then forks of the longest chain could co-exist. Hence, UNLs are created in a manner that they share some set of nodes. Another noteworthy observation about the Ripple protocol is that each client needs to select a set of validators or unique nodes that it trusts. These validators utilize the Ripple consensus algorithm to verify the transactions.

Hyperledger (Cachin 2016) is a suite of resources aimed at modeling industry standard blockchain applications. It provides a series of Application Programming Interfaces (APIs) for developers to create their own non-public blockchain applications. Hyperledger provides implementations of blockchain systems that use RBFT and other variants of the PBFT consensus algorithm. It also facilitates the use and development of smart contracts. It is important to understand that the design philosophy of Hyperledger leans towards blockchain applications that require non-public networks, and so they do not need a compute-intensive consensus.

ResilientDB (Gupta et al 2020c; Rahnama et al 2020) is a state-of-the-art permissioned blockchain fabric, which is designed with the aim of fostering academic and industry research. ResilientDB also acts as a reliable test-bed to implement and evaluate
enterprise-grade blockchain applications⁵. ResilientDB evolved from the ExpoDB platform (Sadoghi 2017; Gupta and Sadoghi 2018), which is an experimental research platform to design and test emerging database technologies, agreement and concurrency control protocols.

⁵ ResilientDB is open-sourced at https://resilientdb.com and code is available at https://github.com/resilientdb.

In Figure 4, we illustrate the overall architecture of ResilientDB, which lays down an efficient client-server architecture. At the application layer, we allow multiple clients to co-exist, each of which creates its own requests. For this purpose, they can either employ an existing benchmark suite or design a Smart Contract suited to the active application. Next, clients and replicas use the transport layer to exchange messages across the network. ResilientDB also provides a storage layer where all the metadata corresponding to a request and the blockchain is stored. At each replica, there is an execution layer where the underlying consensus protocol is run on the client request, and the request is ordered and executed. During ordering, the secure layer provides support for cryptographic constructs.

Fig. 4: Architecture of ResilientDB. (Labels in the figure: secure layer, signing toolkit, hashing toolkit, execution layer, threads, queues, BFT consensus, network, storage layer, metadata, blockchain.)

ResilientDB is written entirely in C++ and provides a graphical user interface to ease user interaction with the system. Further, it also provides a Dockerized deployment that allows any user to experience and test the ResilientDB fabric (comprising multiple replicas and clients) on their local machine.

Fig. 5: Two permissioned applications employing distinct BFT protocols (80K clients per experiment). (Plot of throughput in KTxns/s against the number of replicas, from 4 to 32, for ResilientDB and Zyzzyva.)

The key motivation behind ResilientDB's design was to show that a system-centric permissioned blockchain fabric can outperform a protocol-centric blockchain fabric even if the former is made to employ a slow consensus
protocol. To prove this claim, we refer to Figure 5, which compares the throughput achieved by two permissioned fabrics. In this figure, ResilientDB employs the slow PBFT protocol while the other fabric adopts the practices suggested in the paper BFTSmart (Bessani et al 2014) and employs the single-phase linear Zyzzyva protocol. Despite this disadvantageous choice of consensus protocol, ResilientDB achieves a throughput of 175K transactions per second, scales up to 32 replicas, and attains up to 79% more throughput.

Future Directions for Research

Although blockchain technology is just a decade old, it gained the majority of its momentum in the last five years. This leaves room to revisit different elements of blockchain systems and achieve higher performance and throughput. Some of the plausible directions to develop efficient blockchain systems are: (i) reducing the communication messages, (ii) defining an efficient block structure, (iii) improving the consensus algorithm, and (iv) designing secure light-weight cryptographic functions.

Statistical and machine learning approaches have presented interesting solutions to automate key processes such as Face Recognition (Zhao et al 2003), Image Classification (Krizhevsky et al 2012), Speech Recognition (Graves et al 2013) and so on. These tools can be leveraged to facilitate easy and efficient consensus. The intuition behind this approach is to allow learning algorithms to select the nodes that are fit to act as block creators and to prune the rest from the list of possible creators. The key observation behind such a design is that the nodes selected by the algorithm are predicted to be non-malicious. Machine learning techniques can play an important role in eliminating human bias and inexperience. To learn which nodes can act as block creators, a feature set that is representative of the nodes needs to be defined. Some interesting features can be: geographical distance, cost of communication, available computational resources, available memory storage and so on. These features would help in generating the dataset used to train and test the underlying machine learning model. This model would be run against new nodes that wish to join the associated blockchain application.

The programming languages and software engineering communities have developed several works that provide semantic guarantees to a language or an application (Wilcox et al 2015; Leroy 2009; Kumar et al 2014). These works have tried to formally verify (Keller 1976; Leroy 2009) the system using the principles of programming languages and techniques such as finite state automata, temporal logic and model checking (Grumberg and Long 1994; Baier and Katoen 2008). We believe similar analysis can be performed in the context of blockchain applications. Theorem provers (such as Z3 (De Moura and Bjørner 2008)) and proof assistants (such as COQ (Bertot 2006)) could prove useful to define a certified blockchain application. A certified blockchain application can help in stating theoretical bounds on the resources required to generate a block. Similarly, some blockchain consensus protocols have been shown to suffer from Denial of Service attacks (Bonneau et al 2015), and a formally verified blockchain application
can help realize such guarantees, if the underlying application provides such a claim.

References

Abraham I, Gueta G, Malkhi D, Alvisi L, Kotla R, Martin J (2017) Revisiting fast practical byzantine fault tolerance. URL https://arxiv.org/abs/1712.01367

Androulaki E, Barger A, Bortnikov V, Cachin C, Christidis K, Caro AD, Enyeart D, Ferris C, Laventman G, Manevich Y, Muralidharan S, Murthy C, Nguyen B, Sethi M, Singh G, Smith K, Sorniotti A, Stathakopoulou C, Vukolic M, Cocco SW, Yellick J (2018) Hyperledger fabric: A distributed operating system for permissioned blockchains. CoRR abs/1801.10228, URL http://arxiv.org/abs/1801.10228

Ateniese G, Bonacina I, Faonio A, Galesi N (2014) Proofs of space: When space is of the essence. In: Abdalla M, De Prisco R (eds) Security and Cryptography for Networks, Springer International Publishing, pp 538–557

Aublin PL, Mokhtar SB, Quéma V (2013) RBFT: Redundant Byzantine Fault Tolerance. In: Proceedings of the 2013 IEEE 33rd International Conference on Distributed Computing Systems, IEEE Computer Society, ICDCS '13, pp 297–306

Baier C, Katoen JP (2008) Principles of Model Checking (Representation and Mind Series). The MIT Press

Bentov I, Gabizon A, Mizrahi A (2016) Cryptocurrencies without proof of work. In: Clark J, Meiklejohn S, Ryan PY, Wallach D, Brenner M, Rohloff K (eds) Financial Cryptography and Data Security, Springer Berlin Heidelberg, Berlin, Heidelberg, pp 142–157

Bertot Y (2006) Coq in a Hurry. CoRR abs/cs/0603118, URL http://arxiv.org/abs/cs/0603118, cs/0603118

Bessani A, Sousa J, Alchieri EEP (2014) State machine replication for the masses with bft-smart. In: DSN

Bonneau J, Miller A, Clark J, Narayanan A, Kroll JA, Felten EW (2015) SoK: Research Perspectives and Challenges for Bitcoin and Cryptocurrencies. In: Proceedings of the 2015 IEEE Symposium on Security and Privacy, IEEE Computer Society, Washington, DC, USA, SP '15, pp 104–121

Buterin V, Griffith V (2017) Casper the Friendly Finality Gadget. CoRR abs/1710.09437, URL http://arxiv.org/abs/1710.09437, 1710.09437

Cachin C (2016) Architecture of the Hyperledger blockchain fabric. In: Workshop on Distributed Cryptocurrencies and Consensus Ledgers, DCCL 2016

Cachin C, Vukolic M (2017) Blockchain Consensus Protocols in the Wild. CoRR abs/1707.01873

Castro M, Liskov B (1999) Practical byzantine fault tolerance. In: Proceedings of the Third Symposium on Operating Systems Design and Implementation, USENIX Association, Berkeley, CA, USA, OSDI '99, pp 173–186

Clement A, Kapritsos M, Lee S, Wang Y, Alvisi L, Dahlin M, Riche T (2009a) Upright cluster services. In: Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, ACM, SOSP, pp 277–290, DOI 10.1145/1629575.1629602

Clement A, Wong E, Alvisi L, Dahlin M, Marchetti M (2009b) Making byzantine fault tolerant systems tolerate byzantine faults. In: Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation, USENIX Association, NSDI, pp 153–168

De Moura L, Bjørner N (2008) Z3: An Efficient SMT Solver, Springer Berlin Heidelberg, Berlin, Heidelberg, pp 337–340

Decker C, Wattenhofer R (2013) Information Propagation in the Bitcoin Network. In: 13th IEEE International Conference on Peer-to-Peer Computing (P2P), Trento, Italy

Douceur JJ (2002) The sybil attack. In: Proceedings of 1st International Workshop on Peer-to-Peer Systems (IPTPS)

Dziembowski S, Faust S, Kolmogorov V, Pietrzak K (2015) Proofs of space. In: Advances in Cryptology – CRYPTO 2015, Springer Berlin Heidelberg, pp 585–605

Garay J, Kiayias A, Leonardos N (2015) The Bitcoin Backbone Protocol: Analysis and Applications, Springer Berlin Heidelberg, Berlin, Heidelberg, pp 281–310

Golan Gueta G, Abraham I, Grossman S, Malkhi D, Pinkas B, Reiter M, Seredinschi D, Tamir O, Tomescu A (2019) Sbft:
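
Illustrative sketch: the Proof-of-Work intuition. The Proof-of-Work discussion rests on a puzzle that is hard to compute but easy to verify. The following minimal Python sketch illustrates that asymmetry with a SHA3-256 hash puzzle. It is not Bitcoin's actual block format or difficulty rule: the function names, the "|"-separated encoding, and the difficulty_bits parameter are all assumptions made for illustration.

```python
import hashlib


def block_digest(prev_hash: str, payload: str, nonce: int) -> int:
    """Hash the previous block hash, the payload and a nonce (assumed encoding)."""
    data = f"{prev_hash}|{payload}|{nonce}".encode("utf-8")
    return int.from_bytes(hashlib.sha3_256(data).digest(), "big")


def verify(prev_hash: str, payload: str, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check: a single hash evaluation and one comparison."""
    target = 1 << (256 - difficulty_bits)  # digest must fall below this value
    return block_digest(prev_hash, payload, nonce) < target


def mine(prev_hash: str, payload: str, difficulty_bits: int) -> int:
    """Expensive search: try nonces in order until one passes verification."""
    nonce = 0
    while not verify(prev_hash, payload, nonce, difficulty_bits):
        nonce += 1
    return nonce


if __name__ == "__main__":
    prev = "00" * 32  # hypothetical hash of the previous block
    nonce = mine(prev, "alice pays bob 5", difficulty_bits=12)
    print("nonce:", nonce, "valid:", verify(prev, "alice pays bob 5", nonce, 12))
```

Each extra difficulty bit roughly doubles the expected number of hash evaluations in mine, while verify always costs one evaluation. Because the previous block's hash is part of the input, changing an earlier block invalidates the work done on every later one.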
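
Illustrative sketch: the unification stage of an RCC round. The RCC description states that a replica waits for all z instances, derives a common order if every instance replicated a request, and otherwise initiates recovery for the instances that did not. The sketch below captures only that decision at a single replica. Ordering requests by instance id is an assumption for illustration (the text only says that a common order is determined), and the names unify_round and execute_round are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


def unify_round(results: Dict[int, Optional[str]], z: int) -> Tuple[List[str], List[int]]:
    """Unification stage of one round at one replica.

    results maps an instance id (0..z-1) to the request that instance
    replicated in this round, or None if it replicated nothing.
    Returns (requests in the common order, ids of instances needing recovery).
    """
    failed = [i for i in range(z) if results.get(i) is None]
    if failed:
        # Some primaries are suspected faulty: nothing is ordered yet and
        # recovery has to be initiated for these instances.
        return [], failed
    # Assumed rule: order by instance id, which every non-faulty replica
    # can compute identically from the same set of replicated requests.
    return [results[i] for i in range(z)], []


def execute_round(results: Dict[int, Optional[str]], z: int,
                  execute: Callable[[str], None]) -> List[int]:
    """Execution stage: run the requests in the common order, if there is one."""
    ordered, failed = unify_round(results, z)
    for request in ordered:
        execute(request)
    return failed


if __name__ == "__main__":
    log: List[str] = []
    print(execute_round({2: "c", 0: "a", 1: "b"}, 3, log.append), log)
    print(unify_round({0: "a", 2: "c"}, 3))
```

The point of the sketch is that the order depends only on the set of replicated requests and not on the arrival order at a replica, so all non-faulty replicas execute the same sequence.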