Towards a Toolchain for Exploiting Smart Contracts on the Ethereum Blockchain

 
                               by
                      Sebastian Kindler
               M.A., University of Bayreuth, 2011

Thesis Submitted in Partial Fulfillment of the Requirements for the
 Degree of Bachelor of Science in the Computer Science Program
                  Faculty of Computer Science

              Supervisor: Prof. Dr. Stefan Traub
          Second Assessor: Prof. Dr. Markus Schäffter
             External Assessor: Dr. Henning Kopp

               Ulm University of Applied Sciences
                       March 22, 2019
Abstract

The present work introduces the reader to the Ethereum blockchain. First, on a
conceptual level, it explains general blockchain concepts and views the Ethereum
blockchain in particular from different perspectives. Second, on a practical level,
it explains the main components that make up the Ethereum blockchain in detail.
In preparation for the objective of the present work, which is the analysis of EVM
bytecode from an attacker’s perspective, smart contracts are introduced on the
level of both EVM bytecode and Solidity source code. In addition, critical
assembly instructions relevant to the exploitation of smart contracts are explained
in detail. Equipped with a definition of what constitutes a vulnerable contract,
further practical and theoretical aspects are discussed: The present work introduces
requirements for a possible smart contract analysis toolchain. The requirements are
viewed individually, and theoretical focus is put on automated bytecode analysis
and symbolic execution, as this is the underlying technique of automated smart
contract analysis tools. The importance of semantics is highlighted with respect
to designing automated tools for smart contract exploitation. At the end, a
minimal toolchain is presented, which allows beginners to efficiently analyze smart
contracts and develop exploits.

Contents

Introduction

1   Preliminaries
    1.1   Blockchain
    1.2   Ethereum
          1.2.1   Ethereum from Different Perspectives
          1.2.2   Ethereum World State σ
          1.2.3   Ethereum Account Types
          1.2.4   Ethereum Transactions
          1.2.5   Ethereum Virtual Machine (EVM)
          1.2.6   Ethereum Peer-to-Peer Network
    1.3   Smart Contracts
          1.3.1   Smart Contracts at EVM Bytecode Level
          1.3.2   Solidity and the Structure of Smart Contracts
    1.4   Vulnerability of Smart Contracts
          1.4.1   Critical Bytecode Instructions in Smart Contracts
          1.4.2   Exploitation of Critical Instructions
          1.4.3   Defining Vulnerable Smart Contracts and Exploits

2   Towards a Smart Contract Exploit Development Toolchain
    2.1   Requirements Analysis
    2.2   Requirement 1: EVM Bytecode Deployment
    2.3   Requirement 2: Manual EVM Bytecode Analysis (Tools)
    2.4   Requirement 3: Automated Analysis (Theory)
          2.4.1   Symbolic Execution
    2.5   Requirement 4: Automated Exploit Development
    2.6   Toolchain for Automated EVM Bytecode Analysis and Exploit Development

Conclusions

Introduction

The concept of smart contracts was introduced in 1994 by cryptographer Nick
Szabo [46], who offers the following definition:

       A smart contract is a computerized transaction protocol that executes
       the terms of a contract. The general objectives of smart contract design
       are to satisfy common contractual conditions (such as payment terms,
       liens, confidentiality, and even enforcement), minimize exceptions
       both malicious and accidental, and minimize the need for trusted
       intermediaries. Related economic goals include lowering fraud loss,
       arbitration and enforcement costs, and other transaction costs.¹

   ¹ The term transaction costs goes back to the article The Problem of Social Cost by Ronald
Coase [10], the theses of which were later summarized as the Coase theorem [12]. Transaction costs
have a negative connotation and refer to the time and effort as well as the resources that are required
to negotiate the exchange of legal entitlements. According to the interpretation [12] of Coase’s
proposal, from the perspective of efficiency, the original allocation of resources is of no concern as
long as transactions of legal entitlements are costless. Reducing transaction costs facilitates the
efficient exchange of legal entitlements and increases cooperation between competing parties.

A smart contract is as binding as a legal contract. The bytecode of a smart contract
constitutes the contractual conditions, to which users subject themselves when
they execute the respective smart contract. However, unlike a legal contract, a
smart contract can neither be circumvented nor fought in court. As programs,
and from the perspective of the end user, smart contracts execute precisely the
way they are designed. In this sense, Ethereum blockchain technology is an
implementation of a decentralized crypto-law system [51]. In contrast to national
legal systems, Ethereum non-contract account owners do not decide by which
law they want to abide, but rather by which law they want to be bound. Once
called via a message call transaction, smart contract execution cannot be stopped,
and the contractual conditions are binding in the absolute sense. However, if
program code is what constitutes the law, then programming errors are part of
the law as well. Hence, by definition, the abuse of programming errors in a
decentralized system such as Ethereum cannot constitute a violation of the system’s
crypto-law. Any condemnation of attacks against error-prone smart contracts
builds on the remnants of thinking in centralized legal systems. A decentralized
system thus eradicates such thinking. Public blockchain implementations such as
Ethereum are transparent but trustless environments. However, the trust people
put in Ethereum does not depend on centralized legal institutions. Rather, people
put their trust in the algorithms [35] on which the security of blockchain technology
is built. Regarding the consensus algorithm, i.e., the proof-of-work, such trust may be
justified. However, complete trust in the correct execution of arbitrary programs
seems ill-placed, especially when these programs manage ’real’ people’s money:
Ethereum smart contracts own Ether worth millions of US dollars, and heists [33] on
the Ethereum blockchain have shown how vulnerable and insecure smart contracts
can be.

To comprehend the severity of smart contract vulnerabilities as well as the
importance of a toolchain for smart contract vulnerability analysis, the present
work serves as a thorough introduction to the Ethereum blockchain. First, on a
conceptual level, it explains general blockchain concepts and views the Ethereum
blockchain in particular from different perspectives. Second, on a practical level,
it explains the main components that make up the Ethereum blockchain in detail. In
preparation for the objective of the present work, which is the analysis of EVM
bytecode from an attacker’s perspective, smart contracts are introduced on the
level of both EVM bytecode and Solidity source code. In addition, critical
assembly instructions relevant to the exploitation of smart contracts are explained
in detail. Equipped with a definition of what constitutes a vulnerable contract,
further practical and theoretical aspects are discussed: The present work introduces
requirements for a possible smart contract analysis toolchain. The requirements are
viewed individually, and theoretical focus is put on automated bytecode analysis
and symbolic execution, as this is the underlying technique of automated smart
contract analysis tools. The importance of semantics is highlighted with respect to
designing automated tools for smart contract exploitation. At the end, a minimal
toolchain is presented, which allows beginners to efficiently analyze smart contracts
and develop exploits.

1     Preliminaries

1.1    Blockchain

Blockchain as an append-only linked list

The term blockchain refers to a data structure, which can be loosely described as
an append-only linked list, whose data content and sequence of data elements are
immutable. In comparison, a standard singly linked list is a linear sequence of
individual data elements, each of which contains some data and a reference to the
next element. Thus, the elements themselves implement the list by referencing the
address of the respective next element as shown in Figure 1. Each of the elements
in the singly linked list can be modified, moved within the sequence, or deleted.
Moreover, new elements can be inserted at any point in the sequence: at the
beginning, in the middle, or at the end. Thus, a singly linked list is neither append-only
nor immutable with regard to data content and sequence of data elements.

      Element            address     data    next
      First element      0x0004      12      0x000A
      Second element     0x000A      35      0x0032
      Last element       0x0032      17      Null

Figure 1: A singly linked list consisting of three elements, each of which stores an
integer value and references the next element by pointing to its address.
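
The mutability of a singly linked list can be illustrated with a minimal Python sketch.
The class name Node and the helper to_list are illustrative assumptions and not part of
any blockchain implementation; the values are those of Figure 1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One element of a singly linked list: a value plus a reference to the next element."""
    data: int
    next: Optional["Node"] = None


def to_list(head: Optional[Node]) -> List[int]:
    """Walk the list from the head and collect the stored values in order."""
    values = []
    while head is not None:
        values.append(head.data)
        head = head.next
    return values


# Build the three-element list from Figure 1: 12 -> 35 -> 17.
last = Node(17)
second = Node(35, last)
first = Node(12, second)
original = to_list(first)

# Nothing protects the list: data can be overwritten, elements can be
# inserted in the middle, and elements can be unlinked again.
second.data = 99                      # modify the second element
first.next = Node(50, second)         # insert a new element after the first
inserted = to_list(first)
first.next = first.next.next.next     # unlink the inserted and the second element
remaining = to_list(first)
```

None of these operations leaves a trace in the remaining elements, which is exactly the
property a blockchain is designed to rule out.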

In contrast, a blockchain is designed as an append-only linked list: the linear
sequence of previously added data elements is immutable, and so is the data stored in
the data elements. The data elements on a blockchain are referred to as blocks.
Each block comprises two sections: (1) a block header that contains various
pieces of information particular to a block on the blockchain, and (2) a data section,
which contains the data, i.e., records of information. Just like the elements added
to a linked list, the blocks on a blockchain contain a reference to a neighboring
block on the blockchain. The reference to the neighboring block is stored in the
block header. However, instead of referencing the next data element, each block
references the previous block in the sequence, i.e., the parent block. Instead of
using a pointer to the address of the next element in the sequence, each block
contains the hash value of its parent block as depicted in Figure 2.

                       Block #0        Block #1             Block #2
       block header:   [...]           [...]                [...]
                                       fHash(Block #0)      fHash(Block #1)
       data section:   dataA           dataB                dataC

Figure 2: In a blockchain, the elements of the linear sequence are linked through
the hash value of the respective precursor. The hash function fHash () takes the
entire block, i.e., block header and data section, as input and produces a hash value
as output. The placeholder [...] in the block header represents additional pieces of
information that are particular to a block.
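
The backward hash reference can be sketched in a few lines of Python. This is a simplified
illustration under stated assumptions: the names Block, block_hash, append_block and
is_intact are hypothetical, the block header is reduced to the parent hash, and SHA-256
from the standard library stands in for fHash().

```python
import hashlib
from typing import List, NamedTuple


class Block(NamedTuple):
    parent_hash: str   # hash of the previous block, stored in the block header
    data: str          # data section


def block_hash(block: Block) -> str:
    """Hash the entire block, i.e., block header and data section."""
    payload = (block.parent_hash + "|" + block.data).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: List[Block], data: str) -> None:
    """Append a block whose header references the hash of the current last block."""
    parent = block_hash(chain[-1]) if chain else "0" * 64
    chain.append(Block(parent, data))


def is_intact(chain: List[Block]) -> bool:
    """Check that every block still references the hash of its actual parent."""
    return all(
        child.parent_hash == block_hash(parent)
        for parent, child in zip(chain, chain[1:])
    )


chain: List[Block] = []
for record in ("dataA", "dataB", "dataC"):
    append_block(chain, record)

intact_before = is_intact(chain)

# Tamper with the data section of Block #0 without touching later blocks.
chain[0] = chain[0]._replace(data="dataX")
intact_after = is_intact(chain)
```

After the manipulation, the hash stored in Block #1 no longer matches the hash of
Block #0, so the check fails.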

Blockchain cryptographically protected by hash values

The properties of hash values allow for data validation and provide a means
for checking the integrity of the data stored on the blockchain: Hash functions
take input data of arbitrary size and map it to a fixed-size output,
e.g., a 256-bit hash value. In addition, good hash functions map data in a way
that exhibits the so-called avalanche effect: If only one bit of the input data is
inverted, each of the bits of the output hash value will change with a probability
of fifty percent [50, p. 524]. This makes the output of hash functions practically
unpredictable. For all practical purposes, only identical blocks of data will produce
the same hash value.
Thus, hash functions create a fingerprint of the respective data, which can be
used to check the data’s integrity. In this way, each block references its precursor
and cryptographically contains the identity of the precursor’s data in the form
of a fingerprint, i.e., the hash value. When a block is added to the end of the
sequence, the hash value of the last block in the blockchain becomes part of the
new block’s header information, and subsequently determines the new block’s
hash value. Modifying, appending or deleting even one character in a block’s data,
therefore, would change that block’s hash value significantly as shown in Figure
3. Hence, linking blocks on the blockchain through their hash values provides
a means for detecting manipulation of the data stored on the blockchain.
Hypothetically, if someone were to manipulate the records of information stored in
a block, this person would have to re-calculate the hash values of all subsequent
blocks to escape detection. Otherwise, the stored hash references would no longer
match, the violation of the data’s integrity would be detected, and the blockchain
would be broken.

   Block of data              fMD5(data)    Resulting hash value
   "dataA 0xe35f47d 1"            →         0x60cbc1a87c2c7bad994784ded812af98
   "dataA 0xe35f47d 2"            →         0xb3ae55f566e756efa3af8ebda65d6332
   "dataA 0xe35f47d 3"            →         0x0cd26af0131478ae2be6caeead727502

Figure 3: Modifying only one character of a piece of data, e.g., a string, changes
that data’s resulting hash value, e.g., 128-bit MD5 hash value, significantly.
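
The avalanche effect can be observed with a short Python sketch. Note the assumptions:
SHA-256 from the standard library is used here instead of the MD5 function of Figure 3,
the input strings are arbitrary examples, and the helper names digest_bits and
differing_bits are hypothetical.

```python
import hashlib


def digest_bits(text: str) -> int:
    """Return the SHA-256 digest of the text as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def differing_bits(a: str, b: str) -> int:
    """Count the bit positions in which the digests of a and b differ."""
    return bin(digest_bits(a) ^ digest_bits(b)).count("1")


# The two inputs differ in a single character.
changed = differing_bits("dataA 1", "dataA 2")
unchanged = differing_bits("dataA 1", "dataA 1")
print(f"{changed} of 256 output bits differ")
```

If each output bit flips with a probability of fifty percent, the printed count should
lie in the vicinity of 128.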

Hashes as proof-of-work

Calculating a hash value with the respective hash function does not require a lot
of computational resources. Anybody with access to the data of the blockchain
described so far could manipulate the records of information in one or several
blocks and re-calculate the hash values of all subsequent blocks in the sequence.
To prevent such data manipulation, blockchain technology employs a mechanism,
which demands that the hash value of a block be below a specified target value. By
introducing such a difficulty level, the mechanism imposes a significant computational
effort on whoever wants to add a block to the blockchain. The desired hash
value can only be found with brute force by continuously changing the block’s
data: For this purpose, a nonce is added to the block header and changed every time
the hash value is calculated. This process is repeated until it produces a hash value
below the target value. The nonce that produced the acceptable hash value
then remains part of the block header as seen in Figure 4.

                       Block #0        Block #1             Block #2
       block header:   [...]           [...]                [...]
                                       fHash(Block #0)      fHash(Block #1)
                                       nonce1               nonce2
       data section:   dataA           dataB                dataC

Figure 4: The nonce is part of the block header. As such it determines the resulting
hash value of the block.
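
The brute-force search for a nonce can be sketched as follows. This is a toy illustration
and not Ethereum's actual mining algorithm: SHA-256 stands in for the hash function, the
header is a plain string, and the names block_hash, mine and TARGET as well as the
chosen target value are assumptions made for this example.

```python
import hashlib


def block_hash(header: str, nonce: int) -> int:
    """Hash header and nonce together and interpret the digest as an integer."""
    payload = f"{header}|{nonce}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def mine(header: str, target: int) -> int:
    """Try the nonces 0, 1, 2, ... until the block hash falls below the target."""
    nonce = 0
    while block_hash(header, nonce) >= target:
        nonce += 1
    return nonce


# A target of 2**244 requires 12 leading zero bits in the 256-bit hash, so
# roughly one in 2**12 = 4096 nonces succeeds. Real networks use far smaller
# targets, which makes the search correspondingly more expensive.
TARGET = 2 ** 244
nonce = mine("fHash(Block #0)|dataB", TARGET)
```

Verifying the result takes a single hash calculation, whereas finding it takes thousands
of attempts even for this generous target.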

Hence, finding a nonce that yields a hash value below the target value poses a
severe computational problem, which can only be solved with brute force. Anybody
who wants to find an acceptable hash value to add a new block to the blockchain
has to commit extensive computational resources to this work. The result of
such computational efforts is called proof-of-work [51, p. 6]: The nonce and the
resulting hash value below the specified target value are proof of someone’s
computational work. The process of solving the proof-of-work is
called mining. Blockchain users who commit computational resources to solve
the proof-of-work for a block are called miners. To put this work in perspective:
From February 2018 until November 2018, the average daily hash rate of the
Ethereum network was around 250,000 gigahashes per second
(250k GH/s).² A powerful GPU, e.g., the GeForce GTX 1080 Ti, is capable of
calculating 31.3 · 10^6 hashes per second (31.3 MH/s or 0.0313 GH/s).³ The block
difficulty approximates the number of hashes that have to be tried on average. A miner
who wants to solve the proof-of-work in 10 seconds, while the block difficulty
is approximately 3.13 · 10^15, therefore has to commit computing equipment capable of
a hash rate of 3.13 · 10^14 hashes per second (313,000,000 MH/s or 313k GH/s). That is
10 million times the hash power of the GTX 1080 Ti. From this follows that the mining
mechanism disproportionately impedes the insertion of a new block in the middle
of the blockchain or the manipulation of the records of information in existing
blocks, as the proof-of-work would have to be solved for each subsequent block.
Moreover, computing power translates into electrical energy consumption, which
puts a real cost burden on miners, and therefore discourages them from committing
their computational resources to data manipulation.
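
The arithmetic behind this comparison can be retraced with a few lines of Python. The
figures are taken from the paragraph above; the variable names are illustrative, and the
calculation assumes that the block difficulty approximates the expected number of hashes
per block.

```python
# Figures taken from the paragraph above.
GPU_HASH_RATE = 31.3e6        # hashes per second of one GeForce GTX 1080 Ti
BLOCK_DIFFICULTY = 3.13e15    # approximates the expected number of hashes per block
SOLVE_TIME = 10               # seconds

required_rate = BLOCK_DIFFICULTY / SOLVE_TIME    # hashes per second
required_gpus = required_rate / GPU_HASH_RATE    # number of such GPUs

print(f"required hash rate: {required_rate / 1e9:,.0f} GH/s")
print(f"equivalent GPUs:    {required_gpus:,.0f}")
```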

Proof-of-work as a consensus mechanism in a decentralized network

The proof-of-work also serves as a consensus mechanism: A blockchain is not
stored on a centralized server like a database within a client-server architecture as
shown in Figure 5.

   ² Cf. etherscan.io/chart/hashrate. (Last visited February 2, 2019.)
   ³ Cf. www.techspot.com/article/1438-ethereum-mining-gpu-benchmark/. (Last visited
February 2, 2019.)

Figure 5: Centralized network based on the client-server architecture as it is used for
databases and web services.

Figure 6: Decentralized peer-to-peer network as it is used for blockchain technology.

Blockchain technology requires a decentralized organization as
shown in Figure 6, i.e., a peer-to-peer system consisting of devices that support
the same blockchain protocol and run the same processes. These processes transform
configured devices into network nodes. Interaction between the nodes is symmetric:
Each node acts simultaneously as a client and a server [47]. The blockchain network
distinguishes between full nodes and light nodes. Full nodes store a complete copy
of the blockchain, while light nodes only download the block headers. The mining
process requires that each miner host a copy of the entire blockchain on their
node, i.e., they have to run a full node. There is no trust between the nodes.
Hence, the consensus mechanism, i.e., the proof-of-work, allows nodes to verify
the blockchain data they receive from other nodes. If a node has validated a newly
created record of information or a change to the blockchain, i.e., the proof-of-work
for a newly added block, the node will then propagate the data further to other
nodes. Each node can decide for itself whether the received data is valid or not.
The final decision on which block is appended to the blockchain is the result of
comparing data with other nodes. The information about verified blocks has to be
spread over the network.

Broadcasting information in a peer-to-peer network

To this end, two types of data have to be broadcast on the blockchain’s
peer-to-peer network: (1) newly created records of information, and (2) the resulting
changes to the blockchain. Full nodes redistribute information about changes to
the blockchain as well as newly created data, i.e., records of information, on the
network, while light nodes may serve as endpoints that only broadcast new records
of information created by their owners.

Miners decide which records of information they want to include in the block they
are about to mine: They collect the new records of information broadcast to the
network into new blocks, which they then try to mine so that they may be added to
the blockchain. Thus, there is some incentive for a rogue miner not to propagate,
i.e., broadcast, newly created records of information, but instead to collect them
in a new block to be mined in secrecy. However, as even endpoint nodes broadcast to more
than one node, such behavior would only work with records of information created
by the rogue miner. In the end, such behavior would be futile if the rogue miner
did not also have the resources to finish the proof-of-work before anybody else
does.

Incentive for sustaining the blockchain network

With the respective blockchain software and adequate hardware, the blockchain
network is accessible to anyone. While light nodes may run on less powerful
devices, miners require powerful computing equipment to solve the proof-of-work
and add new blocks to the blockchain. Hence, miners continuously compete with
each other over the proof-of-work for the next block in the sequence.

The reason why miners commit their computing power in the first place is that
blockchain technology is intrinsically linked to cryptocurrencies, which can be
exchanged for real-world fiat currencies. A blockchain serves as a public
ledger, which stores the transactions between accounts. The transactions are the
records of information, which miners group in blocks. Accounts are identified by
hexadecimal numbers, i.e., the accounts’ addresses. Transactions describe monetary
value transfers from one address to another, i.e., from one account to another
account. The real-world value of such transfers depends on the exchange rate
between the blockchain’s cryptocurrency and real-world fiat currencies, e.g.,
the US Dollar or the Euro. The miner who first solves the proof-of-work for a
new block receives compensation in the blockchain’s respective currency, which
has real-world value. In addition, senders pay a small fee for each transaction they
send to another account, which the miner of the respective block collects. Miners
thus use their hash power to compete against all the other miners on the blockchain
network.

1.2     Ethereum

1.2.1   Ethereum from Different Perspectives

Ethereum can be viewed from three different perspectives: (1) from a theoretical
point of view as a whole, (2) according to its implementation, and (3) its practical
application and meaning.

Blockchain as a transaction-based state machine

The Yellow Paper [51] states that Ethereum as a whole is a transaction-based state
machine. From this perspective, Ethereum is defined through
a so-called world state, formally denoted as σ. As shown in Figure 7, only valid
transactions can cause the world state to transition from one state to another, e.g.,
σ_t → σ_{t+1}.

       T: transaction
       σ: world state

       σ_t  --(T1, T2)-->  σ_{t+1}  --(T3)-->  σ_{t+2}

Figure 7: Ethereum can be considered as a transaction-based state machine [51],
which changes from one state to another. The transitions of the world state σ are
caused by valid transactions.
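
The idea of a transaction-based state machine can be sketched in Python. This is a
deliberately reduced model under stated assumptions: the world state is simplified to a
mapping from addresses to balances, validity is reduced to a balance check, and the names
Transaction, State and apply_transaction are hypothetical.

```python
from typing import Dict, NamedTuple


class Transaction(NamedTuple):
    sender: str
    recipient: str
    value: int   # transferred amount


State = Dict[str, int]   # simplified world state: address -> balance


def apply_transaction(state: State, tx: Transaction) -> State:
    """Return the successor state; an invalid transaction leaves the state unchanged."""
    if tx.value < 0 or state.get(tx.sender, 0) < tx.value:
        return state
    successor = dict(state)
    successor[tx.sender] = successor.get(tx.sender, 0) - tx.value
    successor[tx.recipient] = successor.get(tx.recipient, 0) + tx.value
    return successor


sigma_t = {"0xA": 10, "0xB": 0}
# Valid transaction: causes the transition to a new state.
sigma_t1 = apply_transaction(sigma_t, Transaction("0xA", "0xB", 4))
# Invalid transaction (insufficient balance): no transition takes place.
sigma_t2 = apply_transaction(sigma_t1, Transaction("0xB", "0xA", 9))
```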

Blockchain as a record of state-changing causes

From an implementation perspective, however, transactions as such do not change
the world state. Transactions have to be grouped into a block first. Only valid
transactions of a valid block⁴ cause a change in the world state. From this
perspective, Ethereum is a sequence of blocks chained together through the backward hash
reference as shown in Figure 8. The blocks contain the records of the causes for
change in state.
   ⁴ The blockchain is a sequence of valid blocks. However, there are different types of blocks,
which are part of the Ethereum blockchain [2]: The first block in a blockchain is called the (1)
genesis block. Valid blocks are appended to valid blocks, starting with the genesis block. However,
not all blocks become canonical blocks. Miners must cease working on a block as soon as the
network has validated another block. An unfinished block is called a (2) stale block, i.e., a block
that had to be discarded by the miner because a competing miner found the proof-of-work for
the next block in the blockchain first. In the case that two competing miners finish almost at the
same time, the finished but rejected block becomes an (3) uncle block, also referred to as ommer
block [51]. On the Ethereum blockchain, miners receive rewards for both valid blocks and uncle
blocks. A parallel chain of uncle blocks can grow substantially if parts of the peer-to-peer network
believe it to be the canonical chain. For the same reason, these parallel chains can be split up as
well.

       B: Block
       T: Transaction
       σ: World state

       Block B_b:       header contains fHash(B_{b−1}); transactions T1, T2
       Block B_{b+1}:   header contains fHash(B_b); transaction T3

       σ_t  --(B_b)-->  σ_{t+1}  --(B_{b+1})-->  σ_{t+2}

Figure 8: Transactions are grouped in blocks. On the block level, transitions of the
world state are caused through the addition of finalized blocks to the blockchain
[51]. Each block can contain a series of transactions. Child block B_{b+1} is linked
to its parent block B_b because the child block contains the hash, e.g., fHash(B_b),
of the parent block in its block header. This linkage is depicted by the arrow
connecting the two blocks.

Blockchain as a ledger

In practical terms, Ethereum constitutes a ledger composed of all valid transactions
between accounts grouped in blocks as shown in Figure 9. The blocks can be
viewed as the pages of a ledger, on which the transactions contained in a block are
recorded.

       B: Block

       Blocks B_b, B_{b+1}, B_{b+2} depicted as successive pages of a ledger.

Figure 9: The Ethereum blockchain viewed as a ledger. With each validated block
the ledger’s ’volume’ increases, containing the record of every valid transaction.

1.2.2      Ethereum World State σ

Mapping between addresses and account states

The world state σ [51] contains all Ethereum accounts as objects. An account is
the mapping of an address a to the state of the account, i.e., the account state. Thus,
each state σ_t of the world state maps all Ethereum addresses to their respective
account states. The account state is denoted as the tuple σ[a] and contains four
fields as shown in Figure 10: (1) nonce (σ[a]_n), (2) balance (σ[a]_b), (3) storageRoot
(σ[a]_s), and (4) codeHash (σ[a]_c). The nonce is a scalar value, which counts the
transactions sent from this account’s address. The balance is a scalar value, which
represents the amount of Ether owned by the address. While Ether is
Ethereum’s native currency, the balance is denominated in Wei, the smallest unit of
Ethereum’s currency. One Ether is equivalent to 10^18 Wei. The storageRoot is a
256-bit hash⁵ value of the data stored in the account storage in the Ethereum state
database. The codeHash is the 256-bit hash value of the bytecode that belongs to
the account, and which is stored in the Ethereum state database. Bytecode on the
Ethereum blockchain is immutable. Hence, the value of the codeHash field will
never change.
   ⁵ Hash here refers to the Keccak 256-bit hash used on Ethereum.

       World state σ (state database):

       Address a (160-bit identifier)  →  Account state σ[a]: nonce, balance,
                                           storageRoot, codeHash
       Storage
       Bytecode (6060...)

Figure 10: The world state σ maps an address to an account state. While the
transactions that lead to some world state σt are stored in the respective blocks on
the blockchain, the data structure for the mapping of addresses to account states,
as well as the account’s storage and bytecode, is stored in the Ethereum state
database [51].
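
The mapping from addresses to account states can be modeled with a small Python sketch.
The class AccountState, the helper balance_in_ether, the example address and the empty
hash placeholders are assumptions made for illustration; an actual client stores these
fields in the state database described below.

```python
from dataclasses import dataclass
from typing import Dict

WEI_PER_ETHER = 10 ** 18


@dataclass
class AccountState:
    nonce: int          # number of transactions sent from the address
    balance: int        # balance in Wei
    storage_root: str   # hash identifying the account storage (placeholder here)
    code_hash: str      # hash of the account's EVM bytecode (placeholder here)


def balance_in_ether(account: AccountState) -> float:
    """Convert the Wei balance of an account to Ether."""
    return account.balance / WEI_PER_ETHER


# Simplified world state: 160-bit address (40 hex digits) -> account state.
address = "0x" + "11" * 20
world_state: Dict[str, AccountState] = {
    address: AccountState(
        nonce=3,
        balance=25 * 10 ** 17,   # 2.5 Ether expressed in Wei
        storage_root="",
        code_hash="",
    ),
}
ether = balance_in_ether(world_state[address])
```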

Data structure of the Ethereum state database

The data structure of the Ethereum state database constitutes a Merkle Patricia
tree [13] [29], a combination of (1) a Merkle tree and (2) a radix tree. A Merkle
tree builds up from the leaf nodes at the bottom via intermediate nodes to the root
node at the top: The leaf nodes at the bottom contain the data. The intermediate
nodes contain the hash of their child nodes. Child nodes are either intermediate
nodes or leaf nodes. The root node at the top also contains the hash of its child
nodes, which are intermediate nodes. The hash stored in the root node of the
Merkle tree is called the root hash. The radix tree is built on top of many Merkle
trees. The leaf nodes of the radix tree represent the root nodes of the Merkle trees.
The root node of the radix tree and its intermediate nodes each contain a single
character from these Merkle tree root hashes. Following the path down a radix
tree, while searching for a specific Merkle tree root hash, leads to the root hash of
the corresponding Merkle tree.
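
How a root hash commits to all data at the bottom of a Merkle tree can be shown with a
minimal Python sketch. It covers only a plain binary Merkle tree and not the Merkle
Patricia tree used by Ethereum; SHA-256 stands in for Keccak-256, and the function names
as well as the rule of duplicating the last node on odd levels are assumptions of this
example.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """Hash function used for leaf and intermediate nodes."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Hash the leaves, then hash pairs of nodes level by level up to the root."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"dataA", b"dataB", b"dataC", b"dataD"])
tampered = merkle_root([b"dataA", b"dataB", b"dataX", b"dataD"])
```

Changing a single leaf changes the root hash, so comparing root hashes suffices to detect
a manipulation anywhere in the tree.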

The root hash of a Merkle tree for the state database is contained in each block [51].
In this way, the world state and the blockchain are cryptographically linked to
each other. The data at the bottom of the data structure is thus verifiable for each
individual state because the respective block, which caused the state, also contains
that state’s root hash.

The hash algorithm that is used in Ethereum is the 256-bit Keccak hash (Keccak-256),
which Ethereum also refers to as SHA3 [51], although it differs from the SHA-3 standard
later finalized by NIST. Furthermore, all data in Ethereum is serialized using the data
format RLP (Recursive Length Prefix) [22]. An RLP item is either a string (byte array)
or a list of items, which may themselves be strings or lists.
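
The encoding side of RLP can be sketched in Python as follows. The sketch follows the
published RLP encoding rules as far as they are summarized here (single bytes below 0x80
encode as themselves, short and long strings use the prefix offset 0x80, lists use the
offset 0xC0); the names Item, encode_length and rlp_encode are hypothetical, and decoding
is omitted.

```python
from typing import List, Union

Item = Union[bytes, List["Item"]]


def encode_length(length: int, offset: int) -> bytes:
    """Encode a payload length: short form up to 55 bytes, long form beyond."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item: Item) -> bytes:
    """Encode a byte string or a (possibly nested) list of byte strings."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item                      # a single small byte is its own encoding
        return encode_length(len(item), 0x80) + item
    payload = b"".join(rlp_encode(child) for child in item)
    return encode_length(len(payload), 0xC0) + payload


encoded_string = rlp_encode(b"dog")
encoded_list = rlp_encode([b"cat", b"dog"])
```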

1.2.3   Ethereum Account Types

From this look at the Ethereum account state, it becomes evident that Ethereum,
viewed from its practical aspects, is more than a ledger that records transactions
between accounts. Ethereum supports a decentralized computer: The Ethereum
virtual machine (EVM) allows for the execution of bytecode. Accounts can own
EVM bytecode, which is stored in the Ethereum state database. A transaction
addressed to an account, which owns EVM bytecode, triggers the execution of that
EVM bytecode. To this end, Ethereum supports two types of accounts [17]: (1)
non-contract accounts, and (2) contract accounts. A non-contract account is an
externally owned account (EOA). Access to a non-contract account is controlled by
a private key. Non-contract accounts do not contain EVM bytecode. Therefore, the
fields storageRoot and codeHash remain empty as shown in Figure 11. By contrast,
as shown in Figure 12, a contract account is publicly available and contains EVM
bytecode, which can be executed on the Ethereum virtual machine (EVM). Hence,
a contract account is controlled by its EVM bytecode, while bytecode execution is
triggered by transactions sent to the contract account.

[Figures 11 and 12: an address (160-bit identifier) maps to an account state with
the fields nonce, balance, storageRoot, and codeHash. For the contract account,
the diagram additionally shows the storage and the bytecode (6060...).]

Figure 11: Externally owned account (EOA). An EOA does not own EVM bytecode.

Figure 12: Contract account (CA) with EVM bytecode and storage for contract data.

1.2.4       Ethereum Transactions

Transactions

Only owners of non-contract accounts (EOA) can send transactions from their
account’s address to the address of other accounts, including contract accounts
(CA). Sending a transaction to another account’s address allows for the transfer of
value, i.e., Ether, to the respective account. A transaction sent from a non-contract
account’s address to the address of a contract account causes the execution of the
EVM bytecode that the contract account owns.

Fees (Gas)

The processing of transactions, i.e., the mining of blocks, requires computing
power, which translates into real-world costs for electrical power. Therefore,
senders are charged a fee for their transactions. The fee is credited to the account
of the miner who first delivers the proof-of-work [2]. On Ethereum, fees for
computing power are charged in gas (footnote 6). The minimum fee required for a
transaction to be processed at all is 21,000 gas. The price of gas in Wei for
processing a transaction depends on how much senders are willing to pay
(footnote 7), as well as on the miners’ price preferences. Miners who have more
hashing power also have higher ask prices. Hence, senders can incentivize miners
to process their transactions more rapidly by simply increasing their price offer
for gas [44].

Footnote 6: If Ethereum were an engine, then transaction fees would be the fuel,
i.e., the gas, that powers it [17].

The amount of gas purchased for transactions to addresses of contract accounts
is a more subtle calculation, as the sender of such a transaction has to pay a fee for
code execution as well, and different EVM bytecode instructions consume
varying amounts of gas [17]. Senders who want EVM bytecode to be executed
on the Ethereum blockchain must, therefore, purchase a sufficient amount of gas
with their transaction. Senders can set a limit for how much gas is purchased in
one transaction, and any Ether paid for unused units of gas is refunded.
However, should the amount of purchased gas not be sufficient, the transaction
is reverted, and the gas already used is not refunded. In any case, there is an upper
limit as to how much gas can be purchased in one transaction (footnote 8).

The price (in Wei) that needs to be paid for transactions that result in EVM
bytecode execution depends on the amount of gas each EVM bytecode
instruction consumes. Transaction senders pay the miners in Ether (Wei) for
the gas their transactions consume during EVM bytecode execution. While the
amount of gas to be paid depends on the type of EVM bytecode instruction that is
being executed and is, therefore, fixed, the price for gas is determined by a free
market economy [17]: Miners decide which transactions they want to mine, and at
what price they sell their gas. They can refuse to mine transactions from senders
that offer gas prices below their minimum acceptance level. In Ethereum, the price
for gas determines how fast a transaction is processed [44]. Senders, on
the other hand, decide how much they are willing to pay for gas, i.e., for faster
processing of their transactions.
Footnote 7: At the time of writing, the average minimum price for one gas varied
between 2 · 10^9 Wei (2 Gwei) and 3 · 10^9 Wei (3 Gwei), which, in fiat currency,
amounts to a transaction fee of $0.006 and $0.009, respectively, at slower
processing times [44].

Footnote 8: The upper limit of gas for one transaction ensures that any EVM
bytecode that is executed on the Ethereum virtual machine will terminate, either
on its own or because the transaction did not purchase enough gas, and EVM
bytecode execution ran out of gas.
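The fee arithmetic described above can be sketched as follows. The function settle_fee and its result keys are hypothetical names chosen for illustration. The sketch assumes that the full gas limit is paid up-front, that unused gas is refunded at the same gas price, and that a transaction which runs out of gas forfeits the entire gas limit.

```python
GWEI = 10**9         # 1 Gwei = 10^9 Wei
MIN_TX_GAS = 21_000  # minimum gas for any transaction, as stated in the text


def settle_fee(gas_limit: int, gas_used: int, gas_price_wei: int) -> dict:
    """Return up-front payment, final fee and refund in Wei (illustrative only)."""
    if gas_limit < MIN_TX_GAS:
        raise ValueError("gas limit is below the 21,000 gas minimum")
    out_of_gas = gas_used > gas_limit
    # An out-of-gas transaction is reverted, but all purchased gas is consumed.
    charged_gas = gas_limit if out_of_gas else gas_used
    upfront = gas_limit * gas_price_wei
    fee = charged_gas * gas_price_wei
    return {
        "upfront": upfront,
        "fee": fee,
        "refund": upfront - fee,
        "reverted": out_of_gas,
    }


if __name__ == "__main__":
    # A plain value transfer at a gas price of 2 Gwei.
    print(settle_fee(gas_limit=21_000, gas_used=21_000, gas_price_wei=2 * GWEI))
```

At 2 Gwei per unit of gas, the minimum of 21,000 gas costs 42,000 Gwei, i.e., 0.000042 Ether.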

Types of transactions and transaction fields

According to the Yellow Paper [51], there are two types of transactions: (1)
transactions that result in a message call from a non-contract account (EOA)
to another account, i.e., a contract account (CA) or a non-contract account (EOA),
and (2) transactions that result in the creation of a new contract account, i.e.,
contract creation. As transactions originate outside the blockchain and the world
state (state database), they must be signed with the private key belonging to the
address of the EOA from which they are sent. Figure 13 shows how different
transactions affect the world state: (1) The owner of EOA1 sends a transaction from
the address of EOA1 (address1) to the address of EOA2 (address2). The transaction
is signed with the private key belonging to address1. As a result, when the
transaction is executed, a message call is sent from EOA1 to EOA2. Transactions
between non-contract accounts are used to transfer value, i.e., Ether. (2) Another
signed transaction is sent from the address of EOA1 (address1), this time to the
address of the contract account CA3 (address3). This transaction results in a
message call from EOA1 to CA3. Subsequently, the EVM bytecode owned by CA3 is
executed. In addition, it is possible to transfer value to a contract this way. (3) A
signed transaction that is sent from the address of EOA1 (address1) without
specifying the recipient’s address (null address) results in the creation of the
contract account CA4.

[Figure 13: diagram of a signed transaction T from address1 that is addressed to
(1) Address2, (2) Address3, or (3) the null address, and of the resulting transition
from world state σt to world state σt+1.]

Figure 13: (1) Signed transaction resulting in a message call from the sender’s
account (EOA1) to EOA2, and a value transfer from EOA1 to EOA2. (2) Signed
transaction resulting in a message call from the sender’s account (EOA1) to CA3,
and subsequent EVM bytecode execution, as well as possible value transfer from
EOA1 to CA3. (3) Signed transaction resulting in the subsequent contract creation
of CA4. Note: The world state σt+1 here does not refer to the final state, but to
a substate, in which message calls and contract creation are executed. The final
state only contains the result of these message calls as well as the newly created
contracts.

Transaction fields

Sending a valid transaction requires several fields of information as
shown in Figure 14. Both transactions, those resulting in message calls and those
resulting in contract creation, require six common fields, which are defined as
follows [51]: (1) The nonce is the number of transactions sent from the address
of a non-contract account (EOA). (2) The gasPrice refers to the monetary value
in Wei the sender of the transaction is willing to pay per unit of gas. (3) The
gasLimit refers to the maximum units of gas the sender wants to purchase for the
transaction. The price for gas is paid up-front, and only unused gas is refundable.
(4) The to field requires the account address (160-bit identifier) of the recipient if
the transaction is to result in a message call to either another EOA or a CA. By
contrast, for contract creation, the to field must remain empty. As a result, a new
contract account is created, and its address is returned. (5) The value field contains
the monetary value defined in Wei, which is to be credited to the recipient’s account.
(6) The values v, r, and s are the result of signing the transaction with ECDSA [6],
using the private key of the sender’s account address (footnote 9). Ethereum
transactions do not contain a from field because the sender’s account address is
recoverable from the outputs v, r, and s (footnote 10). Transactions for the purpose
of contract creation further use (7) the init field, which is an unlimited size byte
array containing (a) the contract’s loader code (constructor) as well as (b) the
contract’s EVM runtime bytecode (body). The loader code is executed only once at
contract creation and returns the contract’s EVM bytecode (footnote 11), which is
stored in the Ethereum state database [49]. Each message call sent to this contract
afterwards triggers the execution of the contract’s EVM bytecode. Transactions
that cause such message calls use (8) the data field, an unlimited size byte array.
The payload of the data field consists of function-identifying data and the
respective parameter values, and serves as input data for the contract’s EVM
bytecode when it is executed by the EVM.
Footnote 9: Cf. github.com/ethereum/EIPs/blob/master/EIPS/eip-155.md (last
visited March 10, 2019).

Footnote 10: The recovery of the sender’s account address is described formally in
Appendix F of the Yellow Paper [51].

Footnote 11: This code is often referred to as EVM runtime bytecode.

Transaction T

  No.  Field      Note
  1    nonce
  2    gasPrice
  3    gasLimit
  4    to         Empty in contract creation transactions.
  5    value
  6    v, r, s
  7    init       Used in contract creation transactions.
  8    data       Used in message call transactions.

Figure 14: An Ethereum transaction T consists of six fields common to both
contract creation and message call transactions, and two fields (gray) that are
specific to contract creation and message call transactions, respectively.

Transaction execution

Transactions sent from non-contract accounts change the world state σ. As this
change cannot be reverted, transactions need to be validated first [51]: (1) A
transaction must be well-formed RLP (Recursive Length Prefix) data. (2) Transactions
must have a valid signature, i.e., the address recovered from the ECDSA signature
must be a valid address. (3) The nonce of the transaction must be the same as
that of the sender’s account state. (4) The gasLimit must be at least equal to the
minimum amount of gas required for a transaction to a non-contract account, i.e.,
21,000 gas. (5) The balance of the sender’s account state must hold sufficient
funds to cover at least the up-front payment for purchasing the 21,000 units of gas
necessary for any transaction sent from a non-contract account. After successful
validation, a transaction is resolved according to the contents of its fields either to
a message call or to contract creation. Successfully executed transactions are stored
on the blockchain, while the changes they caused are stored in the state
database (footnote 12).
Footnote 12: Each block on the Ethereum blockchain contains the hash of the root
of the state database, i.e., the world state σt+1 resulting from the execution of all
the transactions grouped in the respective block.
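The validation steps (3) to (5) and the subsequent resolution of a transaction can be sketched as follows. The classes Transaction and AccountState and the functions validate and resolve are hypothetical names. Steps (1) and (2) are omitted because they require RLP decoding and ECDSA public key recovery. For step (5), the sketch assumes the up-front cost gasLimit · gasPrice + value, as defined in the Yellow Paper, which is never lower than the payment for 21,000 units of gas once step (4) holds.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_TX_GAS = 21_000  # minimum gas for any transaction, as stated in the text


@dataclass
class Transaction:
    nonce: int
    gas_price: int        # Wei per unit of gas
    gas_limit: int
    to: Optional[bytes]   # None models the empty to field (contract creation)
    value: int            # Wei
    payload: bytes = b""  # init code or data, depending on the transaction type


@dataclass
class AccountState:
    nonce: int
    balance: int          # Wei


def validate(tx: Transaction, sender: AccountState) -> List[str]:
    """Return a list of validation errors; an empty list means the checks passed."""
    errors = []
    if tx.nonce != sender.nonce:
        errors.append("nonce does not match the sender's account state")
    if tx.gas_limit < MIN_TX_GAS:
        errors.append("gasLimit is below 21,000 gas")
    # Assumed up-front cost: all purchased gas plus the transferred value.
    if sender.balance < tx.gas_limit * tx.gas_price + tx.value:
        errors.append("balance does not cover the up-front payment")
    return errors


def resolve(tx: Transaction) -> str:
    """Resolve a valid transaction by the contents of its to field."""
    return "contract creation" if tx.to is None else "message call"


if __name__ == "__main__":
    sender = AccountState(nonce=0, balance=10**18)
    tx = Transaction(nonce=0, gas_price=2 * 10**9, gas_limit=21_000,
                     to=bytes(20), value=10**17)
    print(validate(tx, sender), resolve(tx))
```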

Message call and contract creation

Valid transactions result either in message calls between account states or in
contract creation as seen in Figure 13, both of which require a set of parameters for
execution. For a message call, execution needs the following parameters [51]: (1)
the sender, i.e., the address of the account from which the message call originates
(footnote 13); (2) the transaction originator, i.e., the address of the non-contract
account (EOA) from which the transaction originated, and which is retrieved from
the transaction’s ECDSA signature; (3) the recipient, i.e., the address of the account
to which the message call is being sent; (4) the address of the contract account
(footnote 14) whose EVM bytecode is to be executed; (5) available gas; (6) the gas
price; (7) the value that is being transferred; (8) input data, which is a byte array of
arbitrary length; (9) the current depth of the stack of message calls and contract
creations; (10) the permissions to modify the world state. A message call results in
value transfer, EVM bytecode execution, or both. Additionally, in the case of EVM
bytecode execution, a message call can result in further message calls or additional
contract creation. Message calls that directly result from a transaction are called
top-level message calls, while message calls resulting from a top-level message call
are called inner message calls [23]. Message calls are not stored in the state database
because they are the deterministic result of executing transactions. The resulting
final world state, however, is stored in the state database.

Footnote 13: The sender of a message call is not always the transaction originator,
e.g., when a message call is sent from a CA to another CA or an EOA.

Footnote 14: In most cases, this address of the contract account is the same as the
recipient.

For contract creation, the following parameters are needed [51]: (1) the sender,
(2) the transaction originator, (3) available gas, (4) gas price, (5) endowment, i.e.,
value, (6) a byte array of arbitrary length with the new contract’s initialization
bytecode (loader code), (7) the current depth of the stack of message calls and
contract creations, and (8) the permissions to modify the world state. Contract
creation determines the address of the new contract account, sets the nonce of the
contract account to one, transfers any value (endowment) to the account’s balance,
sets its storage to empty, and initializes the account by executing the loader code
(constructor), which returns the contract’s EVM bytecode (body) that is stored
in the state database. The hashes corresponding to the contract’s storage data
and EVM bytecode, storageRoot and codeHash respectively, are stored in the
contract’s account state as well. Code execution can result in further message calls
or additional contract creation. Again, as contract creation is the result of executing
a transaction, only the new contract as part of the resulting final world state is
stored in the state database.

1.2.5   Ethereum Virtual Machine (EVM)

The Ethereum virtual machine (EVM) as shown in Figure 15 executes the bytecode
owned by a contract account (CA). The EVM is a simple stack machine, which
consists of (1) a stack, (2) memory (RAM), and (3) a program counter for the
RAM. (4) The account storage, which is located in the state database, contains
generally accessible persistent contract data. The location where the EVM
bytecode resides in the state database serves as (5) a virtual ROM for the EVM,
from which it is loaded into RAM. Bytecode executed in the EVM can only
manipulate the stack, the memory, and the account storage. The word size of stack
and memory is 256 bits (32 bytes). While the stack is limited to 1024 words, memory
is an infinitely expandable word-addressed byte array (footnote 15). The account
storage is a word-addressed word array, which contains key-value pairs. Key and
value each have a word size of 256 bits.

Machine state µ

Similar to the world state σ, which maps addresses to account states, there is a
state that keeps track of the volatile aspects of the EVM: the machine state. The
machine state is defined by the tuple µ, which includes (1) the available gas (g), (2)
the program counter (pc), (3) the contents of RAM (m), (4) the active number of
words in RAM (i), and (5) the contents of the stack (s).

Footnote 15: While in storage a word-sized (256-bit) key addresses a word-sized
value, in memory, a word-sized address points to a single byte. Reading from
memory is word-sized, while it is possible to write either 256-bit or 8-bit values to
memory. However, memory expansion can only be achieved in word-sized steps.
Writing to a higher address in memory will first expand the memory in word-sized
steps until the called address is included. Then, memory is expanded further until
sufficient word-sized space is allocated to write the data to memory. Each step of
word-sized memory allocation, as well as writing to memory, costs gas.

As the execution of EVM bytecode instructions consumes gas, the machine state µ
tracks the change in available gas: g → g′. Code execution stops if the program
counter reaches the end of the EVM bytecode in RAM. Exceptional halting only
occurs if either there is insufficient gas (footnote 16) available, or the program
experiences an exception due to unusual stack behavior. When execution stops, the
machine state µ has transitioned to µ′.
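The interplay of program counter, stack, and available gas can be illustrated with a deliberately tiny interpreter. The sketch is an assumption-laden toy, not an EVM implementation: it supports only the three instructions STOP (0x00), ADD (0x01), and PUSH1 (0x60) with their gas costs of 0, 3, and 3, it omits memory and storage, and the names run and ExceptionalHalt are hypothetical.

```python
WORD_MASK = 2**256 - 1  # stack words are 256 bits wide
STACK_LIMIT = 1024      # maximum number of words on the stack

STOP, ADD, PUSH1 = 0x00, 0x01, 0x60
GAS_COST = {STOP: 0, ADD: 3, PUSH1: 3}


class ExceptionalHalt(Exception):
    """Raised on insufficient gas or unusual stack behavior."""


def run(code: bytes, gas: int):
    """Execute code and return the final stack and the remaining gas."""
    pc, stack = 0, []
    while pc < len(code):
        op = code[pc]
        if op not in GAS_COST:
            raise ExceptionalHalt(f"unsupported opcode 0x{op:02x}")
        if gas < GAS_COST[op]:
            raise ExceptionalHalt("out of gas")
        gas -= GAS_COST[op]
        if op == STOP:
            break
        if op == PUSH1:
            if len(stack) >= STACK_LIMIT:
                raise ExceptionalHalt("stack overflow")
            # The operand is the byte following the opcode (zero if missing).
            stack.append(code[pc + 1] if pc + 1 < len(code) else 0)
            pc += 2
        elif op == ADD:
            if len(stack) < 2:
                raise ExceptionalHalt("stack underflow")
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) & WORD_MASK)  # wrap around at 256 bits
            pc += 1
    return stack, gas


if __name__ == "__main__":
    # PUSH1 0x02, PUSH1 0x03, ADD, STOP
    print(run(bytes.fromhex("600260030100"), gas=100))
```

Running the example consumes 9 gas and leaves the sum 5 on the stack. Supplying less than 9 gas results in an exceptional halt.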

Ethereum execution environment

The EVM’s execution environment computes the transition σt → σt+1 (footnote 17),
which depends on the machine state’s transition µ → µ′, as well as the value of the
remaining gas g′. Hence, in order to compute σt+1, the execution environment has
to know the world state σ, the machine state µ, and the available gas g provided
by the transaction. The execution environment also needs to be provided with
additional information, which is defined by the tuple I (footnote 18).
Footnote 16: Ethereum’s gas limit ensures that code execution always terminates,
i.e., infinite loops are impossible by design [51].

Footnote 17: In the Yellow Paper [51], the resulting world state σt+1 is denoted
as σ′.

Footnote 18: The tuple I consists of the following elements:
  Ia: the address of the account that owns the code to be executed;
  Io: the sender address of the transaction that triggered the code execution;
  Ip: the price of gas, which may vary and determines the gas available for code
      execution;
  Id: the byte array that contains the input data for the code execution;
  Is: the account address that caused the code execution, which may not be the
      same as Io;
  Iv: the monetary value, in Wei, sent with the transaction;
  Ib: the byte array with the bytecode that is to be executed;
  IH: the block header of the current block;
  Ie: a number that states how many contract accounts are being called for
      execution or how many contract account creations are to be executed;
  Iw: the necessary permission to make modifications to the state, e.g., to the
      account storage.

[Figure 15: diagram of the execution environment I, comprising the EVM with its
machine state µ (program counter pc, stack, RAM with i active words, gas) and the
world state σ in the state database, which holds the account storage and the EVM
bytecode (ROM, 6060...).]

Figure 15: Ethereum virtual machine (EVM) [51]. The contract’s persistent storage
and read-only EVM bytecode are physically located in the Ethereum state database.
However, during contract execution, both the storage and the EVM bytecode
constitute logical components of the EVM. Before execution, the EVM bytecode is
loaded into memory (RAM). Input data for the contract’s EVM bytecode is loaded
either onto the stack or into memory. The world state σ, the machine state µ, the
available gas, and the necessary additional information I constitute the execution
environment, where the bytecode execution is performed.

1.2.6    Ethereum Peer-to-Peer Network

The Ethereum peer-to-peer network is built on a set of network protocols
referred to as devp2p [15]. Users run Ethereum implementations (footnote 19) on
their physical machines. The software implements devp2p and turns user machines
into peer-to-peer network nodes. In addition, the software is an implementation of
the formal specifications for the Ethereum protocol (footnote 20) as defined in the
Yellow Paper [51]. Thus, machines running an Ethereum implementation turn into
Ethereum peer-to-peer nodes as shown in Figure 16: (1) Ethereum full nodes host a
copy of the entire blockchain and the Ethereum state database, which represents the
current world state σ. Full nodes verify blocks and transactions, and relay them to
the network [40]. Miners are required to run full nodes. (2) Ethereum light nodes
[21], on the other hand, synchronize with full nodes and download the latest block
headers of the blockchain (footnote 21). As light nodes do not host a copy of the
state database, i.e., the world state, they retrieve data from full nodes on demand.
Additionally, light nodes require full nodes to relay the transactions their users send
to the Ethereum network [40].
Footnote 19: An Ethereum implementation refers to an official implementation of
the Ethereum protocol [18]. Such software allows users to turn their machines into
Ethereum nodes, as well as to connect to their node with an Ethereum client
instance. Ethereum implementations are usually referred to as Ethereum clients.

Footnote 20: By contrast, the present work has explained these specifications in a
descriptive manner.

Footnote 21: Current Ethereum implementations have a list of trusted full node
peers built into their code [40]. Hence, users have to trust the developers who
created the software.

[Figure 16: diagram of a user, a light node, and full nodes holding the world
state σ, linked by P2P connections.]

Figure 16: Ethereum peer-to-peer network comprised of decentralized nodes: (1)
full nodes, which host complete copies of both the Ethereum state database (world
state σ) and the Ethereum blockchain, and (2) light nodes, which depend on full
nodes for downloading blockchain information, e.g., block headers and world state
data. In addition, light nodes depend on full nodes to relay transactions, sent from
light nodes, to other full nodes in the Ethereum peer-to-peer network. Ethereum
nodes communicate with each other over a P2P connection, provided by Ethereum
implementations running on physical machines.

Ethereum implementations

Users can choose from various Ethereum implementations. Some implementations,
e.g., Geth or Parity, allow users to turn their machines into Ethereum nodes, and
provide Ethereum clients, which allow users to connect to their local Ethereum
nodes. Other implementations, such as Mist, Metamask, the Remix IDE, or Truffle,
only provide Ethereum clients. Most clients use the JavaScript Web3 API [19]
to exchange data with their local nodes via the JSON RPC protocol (footnote 22),
i.e., JSON remote procedure calls [20]. Ethereum nodes are part of the Ethereum
peer-to-peer network, where they exchange data via P2P connections as seen in
Figure 17.

Footnote 22: The JavaScript Web3 API (JavaScript API [19]) is a wrapper around
the JSON RPC API [20].
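What a client sends to its local node over JSON RPC can be sketched as follows. The sketch only builds request bodies and decodes a hex-encoded quantity; it does not open a network connection. The helper names build_request and parse_quantity are hypothetical, the address is a placeholder, and the sample result "0x4b7" is an illustrative value, whereas eth_blockNumber and eth_getBalance are methods of the Ethereum JSON RPC API.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON RPC requests carry an id chosen by the client


def build_request(method: str, params: list) -> str:
    """Serialize a JSON RPC 2.0 request body."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    })


def parse_quantity(hex_value: str) -> int:
    """Decode a hex-encoded quantity such as '0x4b7' into an integer."""
    return int(hex_value, 16)


if __name__ == "__main__":
    placeholder_address = "0x" + "00" * 20  # hypothetical account address
    print(build_request("eth_blockNumber", []))
    print(build_request("eth_getBalance", [placeholder_address, "latest"]))
    print(parse_quantity("0x4b7"))
```

A client would typically send such a body in an HTTP POST request to the node's JSON RPC endpoint and decode the result field of the response with parse_quantity.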

[Figure 17: diagram of several user machines in the Ethereum peer-to-peer
network. On each machine, an operating system hosts a client and a node holding
the world state σ; the client reaches the node through the Web3 API over a JSON
RPC connection, and the nodes reach each other through devp2p over P2P
connections.]

Figure 17: Most Ethereum clients connect to a local Ethereum node on a user
machine via the JSON RPC protocol, using either the JSON RPC API [20] or
the JavaScript Web3 API [19]. Ethereum nodes, whether full nodes, hosting
complete copies of both the state database and the Ethereum blockchain, or light
nodes, which only download such data on demand, connect with each other via
a P2P connection. All Ethereum nodes as a whole constitute the decentralized
Ethereum peer-to-peer network.

1.3   Smart Contracts

Contract accounts that own EVM bytecode are also called smart contracts. This
part of the present work highlights the link between source code, EVM bytecode
and assembly code (opcode).

1.3.1    Smart Contracts at EVM Bytecode Level

Solidity source code

Ethereum smart contracts (footnote 23) are programs written in the high-level
programming language Solidity, which must be compiled to EVM bytecode for
deployment and execution. Listing 1 shows example source code of a contract, which
was created with the browser version of the Remix IDE. Solidity’s syntax is similar
to that of ECMAScript, i.e., JavaScript. However, the language has been purposefully
adapted to the 256-bit architecture of the stack-based EVM.

         pragma solidity ^0.5.5;

         contract AddContract{

                uint256 public result;

                function addNumbers(uint256 a, uint b) public {
                    result = a + b;
                }
         }

                           Listing 1: Example of Solidity code.

EVM bytecode

Solidity source code must be compiled to EVM bytecode instructions consisting
of byte-sized hexadecimal values [39] in big endian order, i.e., network byte
order [51]. EVM bytecode is the machine language, which the EVM processes
during execution. The Solidity compiler is called solc, and its installed version
corresponds to the version of Solidity the compiler can compile. The source code
from Listing 1 compiles to the following EVM bytecode instructions shown in
Listing 2:
Footnote 23: Ethereum is not the only blockchain environment that offers smart
contract capabilities. Bitcoin has offered smart contract capabilities from its very
beginnings in 2009 [43]. Unlike Ethereum, Bitcoin offers a scripting system called
Script, which is not Turing-complete [1].

6080604052 34801561001057600080
       fd5b5060c78061001f6000396000f3
       fe
         6080604052 348015600f57600080
       fd5b506004361060325760003560e0
       1c806365372147146037578063ef9f
       c50b146053575b600080fd5b603d60
       88565b604051808281526020019150
       5060405180910390f35b6086600480
       36036040811015606757600080fd5b
       810190808035906020019092919080
       359060200190929190505050608e56
       5b005b60005481565b808201600081
       905550505056fea165
                         627a7a72 3058
       20ab0685d5291ce6b9b1a1ea3ca99b
       6b44a588bd1655f704921a9f64f080
       18d2dd0029

       Listing 2: The source code from Listing 1 compiled to EVM bytecode.

The EVM bytecode from Listing 2, which is a continuous string of hexadecimal
values, consists of three parts [49], separated here for illustration only: (1) The
loader code (constructor), which is used at contract initialization. The loader code
returns (2) the body, i.e., the contract’s EVM runtime bytecode, which is executed
every time the contract receives a message call. The EVM runtime bytecode is stored
in the Ethereum state database and is immutable. (3) The metadata hash, i.e., the
Swarm hash of the contract’s metadata file, which the Solidity compiler appends to
the runtime bytecode. This hash is also called the Swarm hash or bzzhash [24]
because of the hash’s magic number, i.e., 0x627a7a72, which is ASCII for bzzr.
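For bytecode analysis, it is useful to separate the trailing hash from the executable code. The following minimal sketch locates the magic number in the tail of the bytecode from Listing 2 and extracts the 32-byte hash. The function name split_trailer is hypothetical, and the assumed trailer layout (0xa1 0x65, the ASCII string bzzr0, 0x58 0x20, the hash, and a two-byte length) reflects the output of this compiler version only; other versions of solc may append a different trailer.

```python
BZZR_MAGIC = bytes.fromhex("627a7a72")  # ASCII "bzzr"

# Tail of the bytecode in Listing 2, starting at the last runtime byte (0xfe).
LISTING_2_TAIL = bytes.fromhex(
    "fea165627a7a72305820"
    "ab0685d5291ce6b9b1a1ea3ca99b"
    "6b44a588bd1655f704921a9f64f080"
    "18d2dd0029"
)


def split_trailer(bytecode: bytes):
    """Return (code before the trailer, 32-byte hash) for the assumed layout."""
    marker = bytecode.rfind(BZZR_MAGIC)
    if marker < 2:
        raise ValueError("magic number not found")
    # Assumed layout: a1 65 'b' 'z' 'z' 'r' '0' 58 20 <32-byte hash> 00 29
    hash_start = marker + len(BZZR_MAGIC) + 3  # skip '0', 0x58 and 0x20
    digest = bytecode[hash_start:hash_start + 32]
    if len(digest) != 32:
        raise ValueError("trailer is truncated")
    trailer_start = marker - 2                 # 0xa1 0x65 precede the magic
    return bytecode[:trailer_start], digest


if __name__ == "__main__":
    code, digest = split_trailer(LISTING_2_TAIL)
    print(code.hex(), digest.hex())
```

The final two bytes, 0x0029, state the length of the trailer (41 bytes), which offers an alternative way to locate its start from the end of the bytecode.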

Function signatures and semantics

In Ethereum, a function signature refers to the human readable text representation
of a Solidity source code function, e.g., addNumbers(uint256,uint256)
(footnote 24) [16]. However, at bytecode level, only the first four bytes of the
Keccak256 hash (SHA3) of the function signature are used to identify functions.
These four-byte signatures are called function selectors, e.g., 0xEF9FC50B at (2E)
in Listing 3. As the input of a hash function cannot be retrieved from the hash
function’s output, there is no function to map function selectors back to human
readable function signatures (footnote 25). However, the Ethereum Function
Signature Database [16] maps function selectors to human readable function
signatures that have been provided by users. Function signatures provide valuable
semantic insight for EVM bytecode analysis: (1) Most programming best practices
demand that a function’s name should carry the function’s semantics. (2) The types
of a function’s arguments provide some information about what a contract considers
valid input. Although the database is an essential tool for identifying function
signatures, it is far from complete (footnote 26). Moreover, names of functions and
variables may not be in English: The function signatures ergebnis() and result()
carry the same semantic content, albeit in two different human languages, German
and English respectively. However, the corresponding function selectors, 0x529B8E60
and 0x65372147, differ completely.

Footnote 24: The human readable canonical representation of the function signature
only uses argument types [16].
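A lookup of known selectors, in the spirit of the signature database, and the construction of input data for a message call can be sketched as follows. The table KNOWN_SELECTORS and the functions encode_call and lookup are hypothetical names. The selector values are taken from the text, and the assignment of 0xEF9FC50B to addNumbers(uint256,uint256) is inferred from the two selectors that appear in Listing 2. Computing a selector from a signature is omitted because Keccak-256 is not part of the Python standard library.

```python
KNOWN_SELECTORS = {
    bytes.fromhex("ef9fc50b"): "addNumbers(uint256,uint256)",  # inferred from Listing 2
    bytes.fromhex("65372147"): "result()",
    bytes.fromhex("529b8e60"): "ergebnis()",
}


def encode_call(selector: bytes, *args: int) -> bytes:
    """Build input data: four-byte selector followed by 32-byte big-endian words."""
    if len(selector) != 4:
        raise ValueError("a function selector has exactly four bytes")
    return selector + b"".join(arg.to_bytes(32, "big") for arg in args)


def lookup(calldata: bytes) -> str:
    """Map the first four bytes of input data to a known function signature."""
    return KNOWN_SELECTORS.get(calldata[:4], "<unknown selector>")


if __name__ == "__main__":
    calldata = encode_call(bytes.fromhex("ef9fc50b"), 2, 3)
    print(calldata.hex())
    print(lookup(calldata))
```

The resulting input data for addNumbers(2, 3) is 68 bytes long: four bytes for the selector and 32 bytes for each uint256 argument.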

EVM bytecode execution

During execution, the EVM bytecode is loaded from the state database into the
execution environment, where the EVM bytecode constitutes the EVM’s ROM.
The EVM sets the program counter at the beginning of the EVM bytecode and
begins to execute each instruction at the current program counter location. As the
EVM does not possess registers, input arguments and parameters for instructions
are pushed onto the stack. The program counter is incremented after an instruction
has been executed. Jumps can change the program counter to any location within
the contract’s EVM bytecode, moving it to the respective jump destination. The
instruction JUMP [7] moves the program counter to a location in the EVM bytecode.
This location is given by the last element pushed onto the stack. The instruction
JUMPI [7], i.e., conditional jump, only moves the program counter to the jump
destination if some condition is met. The condition is the second to last element
Footnote 25: In any case, a function selector is not a complete hash, as it lacks the
remaining 224 bits of the 256-bit hash.

Footnote 26: Cf. gist.github.com/holiman/563da876c4ce15629f57ffdc4046383b
(last visited March 20, 2019).
