Preventing an $8M attack on Ethereum's bZx deFI platform with property-based testing

(Fri, Aug 7, 2020)
By Mikkel Milo and Bas Spitters
Category : Technical
Tag : Blockchain Testing hack
Notes: This post was originally published on Medium.

On September 12, the bZx decentralized finance (deFI) platform suffered from an attack where the attacker was able to obtain control over $8 million worth of cryptocurrency.

The fault was due a single misplaced line in their iToken smart contract’s transferFrom method, allowing an attacker to duplicate their tokens, consequently allowing them to increase their token balance arbitrarily. The bug remained hidden, though, during an extensive 19 person-week auditing by two security companies, Peckshield and CertiK. In this blog post we will show how the bug could have been discovered, and the attack prevented, using property-based testing.

What is deFI?

deFI platforms are currently one of the biggest applications of blockchains (some approximate its value at 12B$). The key goal of deFI is to provide financial services where the middle man (e.g. a bank) is cut out. In particular, they allow for secure transfers of digital assets between untrusting parties. The particular functionalities are implemented through smart contracts, small programs executed on the blockchain, typically intended to implement real world contracts such as transactions of assets.

Summary of the attack

iTokens are interest-earning assets that the bZx platform users receive and redeem for crypto-cryptocurrencies deposited into lending pools. As such, they are a rewarding mechanism for interacting with the bZx platform.

iTokens are intended to be compliant with Ethereum’s ERC-20 token standard. The iToken bug was hidden in the token balance updates during a token transfer. Consider the following function:

function transferFrom(sender,receiver,nrTokens) {
     ...
1:   uint256 balanceSender = balances[sender]
2:   uint256 balanceReceiver = balances[receiver]
3:   balances[sender] = balanceSender - nrTokens
4:   balances[receiver] = balanceReceiver + nrTokens
     ...
}

In lines 1-2 the current balances of the sender and receiver are stored in termporary variables. Next in lines 3-4 their balances are updated by respectively subtracting or adding the number of tokens to be transferred to their balances. The attacker would call this endpoint with sender = receiver and nrTokens > 0. This unintentionally increases the token balance of sender by nrTokens. Of course, the expected result of a self transfer is that the balance should remain unchanged regardless of the value of nrTokens. To see why, note that balanceReceiver is calculated before the senders balance is updated. The fix is easy: simply move line 2 between lines 3-4.

Naturally, a manual test, or even a simple unit test, could have discovered this bug. For example, a unit test asserting that the balance of some address, say 0xF, is unchanged after a call to transferFrom(0xF,0xF,1) would indeed discover the bug.

However, with large code bases it may not be feasible or economically viable to have 100% unit test coverage. Besides, such a test implies knowledge of the particular edge case where sender = receiver, and clearly transferFrom had not been implemented with this edge case in consideration.

We will show how the bug could have been discovered using property-based testing without particular knowledge about the edge case. We rely only on the fact that iTokens are intended to be ERC-20 compliant. But first we briefly introduce property-based testing.

What is property-based testing?

Property-based testing (PBT) is a powerful, generative software testing technique based on stating logical, executable properties, and executing them on randomly generated data. The latter is an important ingredient, as it allows for automatically executing a test on potentially thousands of arbitrarily generated input. This contrasts with the usual unit-/integration tests, where test data is typically manually constructed, and hence normally only a few test cases are executed. Consequently, the bug discovering capability and assurance provided by a property-based test is much higher than other testing methods.

As an example, suppose we have implemented a remove(x,l) function on lists, that (hopefully) removes the element x from the list l. We now want to test this function. Below is an example of an obvious correctness property to test, written in pseudo code, and specialized to integer elements.

forAll (x : int) (l : list int),
  x in l -> x not in remove(x,l)

Naturally, this property is only testable if we have some way of generating “arbitrary” integers, and lists of integers. PBT libraries provide built-in random generators for most common data types, and provide mechanisms for deriving generators of user deFIned data types.

Note that the generators need not be entirely random. In the example above, it would actually be better to only generate l such that it contains x, as this would lead to no trivially satisfied tests.

Smart contracts are a prime target for PBT for several reasons:

  • Smart contracts often deal with valuable assets (e.g. cryptocurrencies)
  • Once deployed, smart contracts are very difficult to patch
  • Blockchains present a unique adversarial environment, where both the software and the network are open

Hence, ensuring correctness is imperative. Of course, testing can never provide an absolute guarantee of correctness, as famously stated by Edsger Dijkstra:

“Program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence."
~ Edsger Dijkstra, 1970

Only formal verification can achieve this absolute guarantee, but in situations where verification is not feasible (property-based) testing is an excellent, cost-efficient alternative.

There exists several PBT (or PBT-like) frameworks for the Etherum blockchain (e.g. Brownie and Echidna). However, they would likely not have found the bug.

The Concordium Blockchain Research Center at Aarhus University has developed ConCert, a formally verified smart contract execution framework, which also includes a PBT library. The tests in the following section were executed in this framework on a port of (the central parts of) the iToken contract.

Testing the iToken contract

Let us now return to the question of what property to test on the iToken contract. The ERC-20 specification states that any ERC-20 compliant token must implement a totalSupply() function that “returns the amount of tokens in existence”. It is up to the particular implementation whether this amount is constant (i.e. set during token deployment), or if it is dynamic (by providing mint and burn operations for creating and destroying tokens, respectively). Regardless, totalSupply() must always represent the ground truth.

An obvious way of asserting that this property is satisfied is by computing the sum of all balances in the balances map (which is stored in the iToken contract’s state), and asserting that it is equal to totalSupply() for any contract state. More formally:

forall tokenState,
  let balances_sum := ... in
  balances_sum =? tokenState.totalSupply()

Of course, it may be the case that totalSupply() is calculated in the exact same way as balances_sum (although that would be a terrible approach!) if the supply can change over time with mint and burn operations. In this case we would of course never discover the bug, since the equality would be trivially satisfied. We could also state another property: if an incoming call to the contract is not a mint or burn action, then totalSupply() (or alternatively, sum_balances) must remain unchanged before and after the call. More formally we can state this in a hoare-triple style with pre- and post conditions:

{fun msg => not (is_mint msg or is_burn msg)}
iToken.receive
{fun oldState newState => oldState.totalSupply() =? newState.totalSupply()}

This property says that given an arbitrary msg, if it is neither a mint or burn action, then after executing this action, the total supply should not change. Here we assume iToken.receive is some function that takes an incoming msg (like a transferFrom, mint, or burn action), and updates the contract’s internal state.

We claim that testing at least one of these properties would discover the bug. Furthermore, notice how they follow from the ERC-20 specification. In particular the first property is a direct formalisation of the specification of totalSupply(), and the second is a natural specialization, relying on the knowledge that only certain actions (mint and burn) should change the total supply. It is also important to note that neither property is stated in terms of transferFrom, nor uses any knowledge about the edge case where sender and receiver are the same address.

The two properties above were tested in ConCert’s PBT framework on a port of the iToken contract. Both tests failed and provided a counterexample with a self-transfer in transferFrom. The source code is available on Github. To our knowledge, only ConCert’s PBT framework has the expressiveness to test these two properties. Other existing PBT frameworks for smart contracts seem restricted to simpler, assertional properties on specific endpoints.

Closing thoughts

We have seen how to use property-based testing to discover the bug that made the recent $8M bZx attack possible. We emphasize that the tests do not use any posterior knowledge of the attack, since the properties tested follow directly from the ERC-20 specification that the offending contract was supposed to satisfy. In general, whenever there is a specification/standard, property-based testing is an easy, obvious, and cost-efficient choice to potentially discover bugs in the implementation with respect to the specification.