Finding Real-World Bugs in Smart Contract Interactions with Property-Based Testing

(Mon, Dec 14, 2020)

By Mikkel Milo and Bas Spitters

Category : Technical

Tag : Blockchain Testing Formal Methods

Notes: This post was originally published on Medium.

Recently, Nomadic Labs made a blog post detailing their work on formally verifying the Dexter smart contract, a contract that provides decentralized exchange of digital assets.

They verified functional properties of the contract using Mi-Cho-Coq, a Coq framework for proving properties about Michelson contracts in Tezos.

A key limitation of Mi-Cho-Coq is that it does not allow for proving properties about contract-to-contract communications (also referred to as inter-contract communications, or just contract interactions). This limitation has some specific consequences, as they mention in their blog:

the Trail of Bits audit of earlier versions of Dexter revealed exploits related to inter-contract communication (which have been fixed), of a type which could not have been detected with Mi-Cho-Coq in its current form.

In this blog we will show how the ConCert framework, developed at CoBRA allows for testing contract interactions. This opens up for the possibility for discovering bugs at the protocol level. ConCert also supports formally verifying contract interactions, but we won’t cover this here.

As an example we will use ConCert’s testing framework to test a correctness property of an earlier version of Dexter’s token exchange protocol (ported to our testing framework in Coq), and discover a real-world bug which would allow for attackers to obtain unintended profit from these exchanges.

Tezos, Smart Contracts, and Coq

Decentralized finance (deFI) platforms are currently one of the biggest applications of blockchains (some approximate its value at 12B$). The key goal of deFI is to provide financial services where the middle man (e.g. a bank) is cut out. In particular, they allow for secure transfers of digital assets between untrusting parties. The particular functionalities are implemented through smart contracts, small programs executed on the blockchain, typically intended to implement real world contracts such as transactions of assets.

Tezos is one such blockchain. The native cryptocurrency for the Tezos blockchain is tez. Tezos is in part a response to the popular blockchain Ethereum, and in particular its history of security breaches through so-called re-entrancy attacks, such as the infamous $50M hack on “The DAO”, a crowdfunding smart contract. Tezos has a smart contract execution model that intends to mitigate these types of attacks. Tezos has been developed with formal verification in mind. One such tool for formal verification is the Coq proof assistant. Mi-Cho-Coq is a Coq framework for formally verifying functional correctness of Michelson programs, the smart contract language for Tezos.

What are ‘inter-contract communications’?

Smart contracts have a set of public entry-points that can be called by accounts on the blockchain. These define the behavior of the contract. In a sense, this is the public API of the contract. For example, the Dexter contract has, among others, an entry-point named tokensToTez, which exchanges a given number of tokens to some tez, according to a computed exchange rate. However, this exchange is not an atomic operation because Dexter is not responsible for managing tokens - token transfers are delegated to another contract. Hence, a token exchange occurs over multiple steps. Consider the following scenario:

Suppose a user Bob wants to exchange $X$ tokens to $Y$ tez using tokensToTez. Assuming the exchange is successful, the exchange protocol is shown in the figure below.

In the figure, Dexter receives Bob’s exchange request, calculates the exchange rate via an internal function getInputPrice, and finally emits two actions:

transfer $Y$ tez to Bob
call the tokens contract, requesting a transfer of $X$ tokens from Bob’s account to Dexter’s account.

Note: Tezos' execution model ensures that the two emitted actions are executed only in this order.

When the token contract receives the token transfer request, it sends a callback action to the requesting user, notifying them of the token transfer. Finally, the actual token balances are updated in the token contract.

Note: For the sake of the example, we have chosen a particular behavior of the token contract in this example. Token contracts may in general behave differently (e.g. they may omit the callback, or make multiple callbacks to various parties)

Because the exchange occurs over multiple steps, the correctness of the exchange protocol cannot be verified or tested by simply comparing the Dexter contract’s state before and after the call to tokensToTez. Indeed, a semantics for inter-contract executions is necessary for verification, as they mention in the blog post. For testing, we will need to simulate/mock a blockchain state with the necessary contracts deployed, and a fine-grained execution model that allows for inspecting the blockchain state (and hence the contract states) after each transaction. This is essentially what we have done in ConCert’s property-based testing framework.

ConCert’s Smart Contract Execution Model & Testing Framework

ConCert, like Mi-Cho-Coq, is a smart contract verification framework in Coq. However, rather than settling for a specific contract language a-priori (like Michelson in the case of Mi-Cho-Coq), smart contracts in ConCert are written as functions in Coq’s functional language Gallina, and can be verified directly in Coq. ConCert then provides transpilation to existing languages such as Liquidity, CameLIGO, and Rust, where the core parts of the translation is certified. Liquidity and CameLIGO are both smart contract languages, and Rust has a smart contract component that is used by Concordium, which is a partner in BAN.

ConCert provides an executable formalization of smart contract executions on an abstract blockchain, which allows for both proving and testing properties about smart contracts.

In ConCert, a block is internally represented with a record containing information such as

currently deployed contracts and their states
accounts and their balances
a queue of actions to be executed in this block
current chain height
block creator & block reward

Actions can either be

contract deployments
chain asset transfers between accounts
or contract calls

A block is successfully created if none of the queued actions fail when executed. A blockchain in ConCert is then a proof-carrying sequence of successfully created blocks.

Contracts are represented in a functional way with an init and a receive function. init sets the contract’s initial state when the contract is deployed. receive handles calls to the contract and updates the state. More specifically, receive has (roughly) the type:

Msg -> State -> option (Action * State)

meaning it takes the payload data (Msg) of the call, the current State of the contract, and emits a list of new actions to be performed, along with the updated state of the contract. Note that the return type is an option. If receive returns None, the execution is interpreted as a failure, and hence block creation will fail.

With this model of execution, we have built a property-based testing framework where test data generation is based on automatically generating “arbitrary” contract calls containing “arbitrary” Msgs, and populating blocks with these generated calls. The pseudo-code below illustrates how an arbitrary blockchain state is generated, assuming we have access to a function GenArbitraryContractCalls.

function GenChain(current_chain, max_length) :=
  while current_chain.length < max_length {
    calls := GenArbitraryContractCalls(current_chain)
    // add new block to chain with calls in action queue
    // check if successfull
    match add_block(current_chain, calls) with
    | Some new_chain => 
    	current_chain := new_chain 
    | None => break
    end
  } 
  return current_chain
// Example usage:
// generate a blockchain with <= 5 blocks from some initial blockchain
GenChain(initial_chain, 5)

The obvious question is now: how can we generate arbitrary contract calls?

Suppose we wish to test a token contract whose Msg type is

Msg :=
| transfer of (address * address * nat)
| approve of (address * address * nat)

that is, it has two endpoints; one for transferring a number of tokens (as given by the nat) between the two given addresses, and one for approving an address to spend a given number of tokens on behalf of another address. For example, approve(alice, bob, 2) would allow Alice to spend two tokens on behalf of Bob.

Generating an arbitrary instance of Msg then amounts to either generating a transfer or an approve, and populating these with more-or-less random parameters. We can derive such a generator automatically using Coq’s QuickChick library for property-based testing, or we could implement it manually using the generator combinators provided by QuickChick.

If we have such generators present for all the deployed contracts on the blockchain, it is not hard to imagine how one could reasonably implement the GenArbitraryContractCalls function in the pseudo-code above.

With this, we can generate valid (according to the execution model) blockchain traces, populated with reasonably random inter-contract calls.

Testing Dexter for correct exchange rates

We will now show our testing framework in action to discover a bug in (an older version of) Dexter’s exchange protocol, which are only discoverable due to our capability of testing on results of inter-contract communications. The source file for this experiment can be found here.

Dexter exchanges tokens to tez according to the function $$ getInputPrice(Tokens, Token_{reserve}, Tez_{reserve}) = \ \frac{Tokens \cdot 997 \cdot Tez_{reserve}}{Token_{reserve} \cdot 1000 + Tokens \cdot 997} $$ We will not explain why it is defined like this. It suffices to say that, under normal operation, this way of calculating exchange rates should ensure that Dexter defines a so-called *constant-product market*. In layman’s terms, this essentially means that the market (price) cannot be significantly manipulated unless a party is in possession of most of the market’s assets.

More information about constant-product markets, their properties, and necessary conditions, can be found in this paper.

We want to test that all successful exchanges have been priced correctly, according to $getInputPrice$. Given a list of tokens to be exchanged, we can easily calculate the correct exchanges with the function below, which takes a list of token exchanges, and returns the corresponding exchange prices in a list.

function price_of_exchanges (tokens : list N) token_reserve asset_reserve := 
 match tokens with
 | ts::l' =>
   let price := 
     getInputPrice ts token_reserve asset_reserve in
   price :: price_of_exchanges l' (ts + token_reserve)   
   							   (asset_reserve - price) 
 | [] => []
end.

Now we can simply record all exchanges in each block of the generated test blockchains, and assert that the actual prices match the expected prices. There is some boilerplate code involved in stating this property in Coq, which is largely uninteresting for the purpose of this post, so we won’t show the code here. The definition is roughly 20 lines of Coq code (excluding logging calls for debugging) and can be found here.

Using QuickChick we test the property. QuickChick immediately reports a counterexample consisting of two exchanges within one block; an exchange of 27 tokens, followed by an exchange of 11 tokens. In this test, the dexter contract had a token reserve of 60 before these exchanges, and tez reserve of 30. The expected prices, according to price_of_exchanges, and the recorded prices are shown in the table below.

	tokens sold	Expected Price	Recorded price
1st exchange	27	9tez	9tez
2nd exchange	11	2tez	3tez

This means that an adversary is able to artificially create favorable exchange rates by splitting their exchange into multiple exchanges.

So we were able to detect a bug in this (older) implementation of Dexter using our testing framework - Great! But what causes the incorrect calculation of exchange rate?

The issue is a combination of Tezos' breadth-first execution model, and how Dexter maintains its view of the its current token reserve.

Recall that when dexter retrieves an exchange request, it first calculates the price of the exchange, and then sends out two actions: the tez transfer, and the token transfer. However, because of Tezos' breadth-first execution, Dexter will process the second exchange before these two actions have been executed. Hence, the chronological order in which these two exchanges will be processed, from start to finish, is:

Dexter calculates price of 1st exchange, and updates its local view of its token reserve
Dexter calculates price of 2nd exchange, and updates its local view of its token reserve
tez of 1st exchange is transferred to the requesting party
tokens of 1st exchange is transferred to Dexter
tez of 2nd exchange is transferred to the requesting party
tokens of 2nd exchange is transferred to Dexter

From this it is clear that in the second step, Dexter will have an incorrect view of its token reserve, since the actual token transfer is carried out two steps later. As a consequence, the price of the second exchange is calculated incorrectly, and unfortunately in such a way that it is favorable for the requesting party.

Note: this bug has been fixed in the current, live version of Dexter

This example illustrates how our approach to testing smart contracts allows for testing inter-contract communications and, in fact, nearly any (decidable) property on ConCert’s abstract representation of blockchains. This is possible because, rather than testing contract endpoints in isolation (i.e. comparing the contract state before and after the call), we instead generate entire blockchain states, populated with pseudo-randomly generated contract calls. Naturally, this is more computationally expensive, but the benefits in terms of expressibility are clear, as we have demonstrated here.