Memory Pool for Fun and Profit
In 2019, professional trading of crypto assets using on-chain intelligence has become ubiquitous. Signals are sourced primarily from network metrics or transaction movements, such as on-chain inflows (deposits) to and outflows (withdrawals) from exchange hot wallets. The Memory Pool, the 'waiting room' for transactions, holds all sent and relayed transactions well before they are mined in a block. Interestingly, given an estimate of how the network and miners behave, one can predict which transactions will be included in the next few blocks with 90% probability. In this article, we dive into transaction propagation and Memory Pool fundamentals to introduce the idea of Predictive Consensus.
In order to understand what the Memory Pool is, we first have to understand how transactions are propagated through the Bitcoin P2P network. Let's assume you send a transaction from your mobile wallet, e.g. Bitwallet. The wallet sends it to one or more full Bitcoin nodes in the network. These nodes, in turn, relay the transaction to their peers, so that it eventually reaches all nodes in the network. In the case of centralized wallet providers, the mobile wallet first sends the transaction to a centralized server, which then forwards it to Bitcoin nodes in the network.
When a node in the network receives a new transaction (tx) message, it validates the transaction against the protocol rules. The protocol rules wiki page offers an overview of most of the rules; the source of truth, however, is the node's actual code. If the rules are violated, the node does not relay the transaction to its connected peers. Once the transaction has been validated, its initial state is pending/unconfirmed, and each node adds it locally to its own set of pending transactions, called the Memory Pool (Mempool).
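The validate, store and relay flow described above can be sketched in a few lines. This is a toy model, not how any real node is implemented: the `Node` and `Tx` classes, the `min_fee_rate` policy and the `is_valid` check are all hypothetical stand-ins for the much larger set of real protocol rules.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tx:
    txid: str
    fee: int   # satoshis (illustrative)
    size: int  # bytes (illustrative)


@dataclass
class Node:
    name: str
    min_fee_rate: float = 1.0  # hypothetical relay policy, in sat/byte
    peers: list = field(default_factory=list)
    mempool: dict = field(default_factory=dict)

    def is_valid(self, tx: Tx) -> bool:
        # Stand-in for the real protocol rules (signatures, inputs, etc.).
        return tx.size > 0 and tx.fee / tx.size >= self.min_fee_rate

    def receive(self, tx: Tx) -> bool:
        if tx.txid in self.mempool:
            return False  # already known, so do not relay it again
        if not self.is_valid(tx):
            return False  # invalid: neither stored nor relayed
        self.mempool[tx.txid] = tx  # pending/unconfirmed, kept locally
        for peer in self.peers:
            peer.receive(tx)  # relay to connected peers
        return True
```

Wiring three such nodes into a line (a to b to c) and calling `a.receive(tx)` leaves a valid transaction in all three Mempools, while an invalid one stops at the first node.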
At this point, we have to differentiate between two kinds of nodes in the Bitcoin network: a normal full node and a miner node. For a normal full node, the Memory Pool serves only to relay transactions to peers, with miner nodes as the final destination. For a miner node, however, the Memory Pool plays a key role. Miners, as businesses, are incentivized by the block reward and transaction fees to mine new blocks. Solving the proof-of-work (PoW) puzzle is often described as a lottery: the miner who solves it gets to craft the next block in the chain. The block template only has to pass the consensus rules. Beyond that, the miner can freely decide the beneficiary of the block reward in the coinbase transaction (usually the miner's own wallet) and which transactions to confirm.
The miner typically sorts the transactions in the Mempool by "fee rate", highest first, in order to maximize the profit on the mined block. The fee rate is the fee divided by the transaction size. Fee rate is preferred over absolute fee because space in a block is limited. The Mempool grows and shrinks, from 0 MB up to 50 MB (last seen in July 2019). A visualization can be found at mempool.space, and a more detailed historical timeline is available at Johoe's Bitcoin Mempool Statistics. The Mempool size is heavily influenced by hash rate, difficulty (adjustments), block time, price spikes and price drops. We will discuss these observations in detail in upcoming articles.
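To make the fee-rate ordering concrete, here is a minimal sketch of greedy transaction selection. The `Tx` type, the `build_block_template` function and the numbers are made up for illustration; real miner software also has to handle dependencies between unconfirmed transactions, which this sketch ignores.

```python
from typing import NamedTuple


class Tx(NamedTuple):
    txid: str
    fee: int   # satoshis
    size: int  # bytes

    @property
    def fee_rate(self) -> float:
        return self.fee / self.size  # fee divided by size


def build_block_template(mempool, max_block_size):
    """Greedily pick transactions by descending fee rate until the block is full."""
    selected, used = [], 0
    for tx in sorted(mempool, key=lambda t: t.fee_rate, reverse=True):
        if used + tx.size <= max_block_size:
            selected.append(tx)
            used += tx.size
    return selected


mempool = [
    Tx("a", 10_000, 1000),  # highest absolute fee, fee rate 10
    Tx("b", 6_000, 400),    # fee rate 15
    Tx("c", 6_000, 500),    # fee rate 12
    Tx("d", 500, 100),      # fee rate 5
]
template = build_block_template(mempool, max_block_size=1000)
print([tx.txid for tx in template])   # ['b', 'c', 'd']
print(sum(tx.fee for tx in template)) # 12500
```

In this toy Mempool, picking by absolute fee would fill the block with transaction "a" alone and earn 10,000 satoshis, whereas picking by fee rate earns 12,500 from the same block space.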
If we look at the valid transactions in the Mempool and the "fee rate" each one pays, we can, based on an estimate of network behavior, predict which transactions will be mined into the next block, the one after that, and so on. One such model is Feesim, whose goal is to estimate the fee rate that a transaction needs to pay in order to be confirmed within N blocks with a given probability P (currently 90% for the next block). The estimation algorithm models block discovery and transaction arrivals as Poisson processes and assumes that miners select transactions greedily by fee rate, subject to a maximum block size and a minimum fee rate. However, this is just one of many models: P2SH Fee estimation shows ten different fee estimators and their estimates over time.
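The assumptions listed above (Poisson block discovery, Poisson transaction arrivals, greedy selection by fee rate under a size cap and a fee-rate floor) are enough to build a small Monte Carlo estimate. The sketch below is not Feesim's code. It is a toy simulation under those same assumptions, and every function name, parameter and default value in it is hypothetical.

```python
import random


def blocks_until_confirmed(target, mempool, arrivals, tx_per_sec, rng, n_blocks,
                           max_block_size=1_000_000, min_fee_rate=1.0,
                           mean_block_time=600.0):
    """Simulate up to n_blocks and return the 1-based block that confirms
    `target`, or None. Transactions are (fee_rate, size) tuples."""
    pool = [(rate, size, False) for rate, size in mempool]
    pool.append((target[0], target[1], True))
    for height in range(1, n_blocks + 1):
        # Poisson block discovery: exponential waiting time until the next block.
        block_time = rng.expovariate(1.0 / mean_block_time)
        # Poisson arrivals: exponential gaps between new transactions,
        # each drawn from a sample of previously observed (fee_rate, size) pairs.
        t = rng.expovariate(tx_per_sec)
        while t < block_time:
            rate, size = rng.choice(arrivals)
            pool.append((rate, size, False))
            t += rng.expovariate(tx_per_sec)
        # Greedy miner: highest fee rate first, within the size cap and fee floor.
        pool.sort(key=lambda entry: entry[0], reverse=True)
        used, leftover = 0, []
        for entry in pool:
            rate, size, is_target = entry
            if rate >= min_fee_rate and used + size <= max_block_size:
                used += size
                if is_target:
                    return height
            else:
                leftover.append(entry)
        pool = leftover
    return None


def confirmation_probability(target, mempool, arrivals, tx_per_sec, n_blocks,
                             trials=500, seed=42, **kwargs):
    """Fraction of simulated runs in which `target` confirms within n_blocks."""
    rng = random.Random(seed)
    hits = sum(
        blocks_until_confirmed(target, mempool, arrivals, tx_per_sec,
                               rng, n_blocks, **kwargs) is not None
        for _ in range(trials)
    )
    return hits / trials
```

A fee estimator in this spirit would search for the lowest fee rate at which `confirmation_probability` reaches the desired P for a given N. The quality of the result depends entirely on how well the arrival sample and rates match the real network.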
In the context of trading with on-chain (or rather pre-chain) data, being able to classify a high-value inflow/deposit from a long-holding Bitcoin wallet into an exchange address ten minutes (or more) before it is mined in a block yields a strong sell signal. This and other Mempool-based signals depend on a highly resilient, low-latency data pipeline fed by a cluster of Bitcoin nodes all over the world, proprietary datasets and fine-tuned parameters. At TokenAnalyst, we're actively working on combining our low-latency inflow/outflow signals with predictive Mempool models to offer strong signals before they are widely available.
Frontrunning on decentralized exchanges on the Ethereum blockchain is fairly common. Let us digress a little: a decentralized exchange (DEX) is a purely decentralized smart contract that takes bid and ask orders and matches them entirely on-chain. Ethereum, like Bitcoin, has a Mempool of pending transactions. A transaction on Ethereum can be not only a value transfer from one wallet to another, but also a smart contract execution, optionally with parameters such as position and order type. Being able to see orders in the Mempool before they are mined in a block allows a market actor to take advantage by outbidding them on gas (gas is the payment for execution in Ethereum, similar to Bitcoin's transaction fee model) and ultimately frontrunning another transaction or order.
It is important to recall that the winning miner typically picks transactions from the Mempool in descending order of gas price paid. The estimated profit has to outweigh the cost of frontrunning. The paper "Flash Boys 2.0: Frontrunning, Transaction Reordering, and Consensus Instability in Decentralized Exchanges" is a good introduction to the topic; for data, check out FrontrunMe.
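The break-even condition mentioned above is simple arithmetic, sketched here with made-up numbers. The function name and parameters are hypothetical. The sketch assumes the frontrunning transaction only needs to bid slightly more than the observed order's gas price, and it ignores competing bidders and the risk of a failed transaction.

```python
GWEI = 10**9  # 1 gwei expressed in wei


def frontrun_estimate(expected_profit_wei, victim_gas_price_wei, gas_used,
                      bid_increment_wei=GWEI):
    """Return (is_profitable, our_gas_price_wei, cost_wei) for a single attempt."""
    # Bid just above the observed order so the miner sorts us ahead of it.
    our_gas_price = victim_gas_price_wei + bid_increment_wei
    # Cost of the attempt: gas consumed times the gas price we pay.
    cost = our_gas_price * gas_used
    return expected_profit_wei > cost, our_gas_price, cost


# Illustrative numbers: observed order at 20 gwei, our transaction uses 150,000 gas.
ok, price, cost = frontrun_estimate(
    expected_profit_wei=10**16,        # 0.01 ETH
    victim_gas_price_wei=20 * GWEI,
    gas_used=150_000,
)
print(ok, price // GWEI, cost)  # True 21 3150000000000000
```

With the same gas figures, an expected profit of 0.002 ETH would fall below the 0.00315 ETH cost and the attempt would not pay off.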