You've probably heard that Monad has "great tech." Fair enough, but what does that even mean in practical terms? At Pragma, we've been tracking Monad since their testnet launch. The numbers have been consistently impressive, and as of December 2025 the mainnet delivers 5,000-10,000 TPS supported by nearly 200 validators across the globe.
Numbers alone don't tell the full story, however. What matters is whether all this re-engineering was worth the effort, and whether the technical improvements came at the cost of decentralization. After all, anyone can build a fast blockchain and put it on one node for max performance. That's called a database.
After burning the midnight oil studying the codebase and digging into the technical architecture, we think Monad represents a genuine step forward in blockchain performance. To understand why, we need to start with how the Monad team approached the problem in the first place.
The Monad team viewed the EVM architecture as a black box that accepts inputs and generates outputs in a uniform format, defined by the JSON-RPC API specification that all Ethereum clients implement. Unlike other EVM chains built on Geth (Go), Reth (Rust), or Erigon, the Monad team decided to throw the black box out the window and re-implement the EVM from scratch in C++ and Rust. They only kept the interface itself.
"If it walks like a duck and quacks like a duck, it must be a duck." This is duck typing, a principle from dynamic programming languages that's been around for decades. Duck typing postulates that what matters is what something can do, not what it is.
As a side note, and since we're already talking about programming paradigms, a monad is also a core concept in functional programming. In languages like Haskell, monads provide a way to compose operations and manage side effects while keeping code mathematically pure. The Monad team most certainly knew this when they chose the name for their project.
In the EVM world, if you can "quack" you're good. If you respond to eth_call, eth_getBalance, and the other standard methods, clients don't care what's underneath. Geth is written in Go, Monad is written in C++ and Rust, but they both speak JSON-RPC to clients. To a wallet making RPC calls, both just quack.
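To make the duck concrete, here's a minimal Python sketch of the only thing a client ever sees: a JSON-RPC request and a hex-encoded response. The RPC URLs below are placeholders, not real endpoints; point the same function at a Geth node or a Monad node and the calling code doesn't change.

```python
import json
import urllib.request

def get_balance(rpc_url: str, address: str) -> int:
    """Fetch an account balance over standard Ethereum JSON-RPC."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, "latest"],
        "id": 1,
    }
    req = urllib.request.Request(
        rpc_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.loads(resp.read())["result"]
    return int(result, 16)  # the node answers with a hex string, denominated in wei

# The caller never cares what's underneath -- both endpoints just quack:
# get_balance("https://an-ethereum-rpc.example", "0x0000000000000000000000000000000000000000")
# get_balance("https://a-monad-rpc.example", "0x0000000000000000000000000000000000000000")
```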

Beyond the EVM rewrite in faster languages, we identified core innovations that work together to deliver the advertised breakthrough performance. Each one tackles a different bottleneck, and the changes span across the entire stack. To understand the complete picture, let's cover them one by one, starting with consensus.
MonadBFT is Monad's consensus mechanism, delivering 400ms block times and 800ms finality while solving the tail-forking problem that affects other pipelined BFT systems.
Blockchains run on distributed validators scattered around the world, none of which fully trust each other. Yet, they still need to agree on which blocks enter the chain and in what order. Consensus is what makes this possible.
For consensus to work when some validators are offline, laggy, or actively malicious, the algorithm must be Byzantine Fault Tolerant. Byzantine Fault Tolerance keeps the chain moving forward as long as fewer than one-third of validators are faulty, or put differently, as long as more than two-thirds remain honest and online.
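For intuition, here's the standard BFT sizing rule in a few lines of Python: tolerating f Byzantine validators takes n ≥ 3f + 1 validators in total, and committing a block takes a quorum of 2f + 1 votes, just over two-thirds.

```python
# Standard BFT sizing: to tolerate f Byzantine validators you need
# n >= 3f + 1 validators, and a quorum of 2f + 1 votes (a supermajority)
# to commit a block.
def max_faulty(n: int) -> int:
    return (n - 1) // 3

def quorum(n: int) -> int:
    return 2 * max_faulty(n) + 1

for n in (4, 100, 200):
    print(f"{n} validators: tolerates {max_faulty(n)} faulty, quorum = {quorum(n)}")
# 4 validators: tolerates 1 faulty, quorum = 3
# 100 validators: tolerates 33 faulty, quorum = 67
# 200 validators: tolerates 66 faulty, quorum = 133
```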

For Ethereum, consensus currently uses Gasper, which combines Casper FFG for finality and LMD GHOST for fork choice. Every 12 seconds Ethereum produces a new block, with validators voting on whether it becomes part of the canonical chain. These votes accumulate over epochs of 32 slots, each epoch lasting roughly 6.4 minutes.
After 2 epochs, a checkpoint becomes finalized and protocol-level finality guarantees it won't be reverted. In practice, many applications don't wait that long and treat blocks as "final enough" after just 1-2 confirmations, accepting a tiny reorg risk for faster UX.
This 2 epoch / 13-minute gap is why your favorite CEX makes you wait to withdraw funds. While they are happy to display a deposit almost instantly for a smoother user experience, they cannot risk letting you withdraw until the blockchain proves the transaction is irreversible. Exchanges need mathematical certainty before moving their own liquidity.
MonadBFT's 400ms blocks and 800ms finality represent a roughly 30x improvement in block time and a more than 960x improvement in time to finality versus Ethereum. MonadBFT builds on a family of consensus protocols called HotStuff. HotStuff has rotating leaders who propose blocks, and validators vote on each proposal. When enough validators vote (a supermajority), the block is locked in.
HotStuff's breakthrough is achieving two properties that previous BFT protocols can't combine. The first is responsiveness. The protocol operates at actual network speed without waiting for fixed timeouts like Ethereum's 12-second slots.
The second is better message efficiency. In older protocols like PBFT, every validator has to talk to every other validator, so the number of messages grows quadratically and the network chokes under its own communication overhead once you have too many validators. In HotStuff, validators just talk to the leader, who combines their votes into a single threshold signature. Message count grows linearly instead, which lets the protocol scale to far more validators.
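A back-of-envelope sketch makes the difference obvious. The counts below are simplified (they ignore the individual protocol phases), but the growth rates are the point:

```python
# Rough message counts per round: all-to-all voting (PBFT-style) grows
# quadratically, leader-aggregated voting (HotStuff-style) grows linearly.
def all_to_all(n: int) -> int:
    return n * (n - 1)          # every validator messages every other one

def leader_aggregated(n: int) -> int:
    return 2 * n                # leader -> validators, validators -> leader

for n in (10, 100, 200):
    print(n, all_to_all(n), leader_aggregated(n))
# 10   90      20
# 100  9900    200
# 200  39800   400
```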
But even stock HotStuff processes blocks sequentially. A block gets proposed, validators vote through three phases, and only after finalization can the leader start proposing the next block. Chained HotStuff improves on this by employing pipelining, which allows multiple blocks to be in different consensus phases at the same time. While block 100 is collecting its final votes, block 101 is already being validated, and block 102 is being proposed. Think of an assembly line where one car is getting painted while the next is getting its engine installed and the one after that is having its frame welded.
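Here's a toy Python rendering of that assembly line. The three phase names are illustrative, not the protocol's actual phases; what matters is that several blocks are in flight at once, each one stage behind the next.

```python
# A toy view of pipelining: with three consensus phases, three blocks are
# in flight at once, each in a different phase during a given round.
PHASES = ["propose", "vote", "finalize"]

def pipeline(first_block: int, rounds: int):
    for r in range(rounds):
        in_flight = []
        for offset, phase in enumerate(PHASES):
            block = first_block + r - offset
            if block >= first_block:
                in_flight.append(f"block {block}: {phase}")
        print(f"round {r}: " + " | ".join(in_flight))

pipeline(100, 4)
# round 0: block 100: propose
# round 1: block 101: propose | block 100: vote
# round 2: block 102: propose | block 101: vote | block 100: finalize
# round 3: block 103: propose | block 102: vote | block 101: finalize
```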
The result is that validators keep processing continuously instead of sitting idle between blocks, which delivers much higher throughput. Pipelining, however, introduces tail-forking, a problem that most HotStuff-based systems haven't solved yet.
So you're a validator named Alice and you just proposed a block. Other validators vote on it, and boom, you've got a supermajority. Since you got enough votes, your block should be part of the chain, right?
Not quite. In pipelined BFT systems, those votes get sent to the next leader, Bob, who aggregates them into a Quorum Certificate (QC) and includes it in his own proposal, moving Alice's block toward finality. If Bob is offline, malicious, or just decides not to include Alice's QC, her block gets orphaned despite having majority support, with the chain moving on as if that block never existed. When valid blocks get abandoned at the tail of the chain because the next leader refuses to build on them, that's tail-forking.
Alice loses her block rewards despite doing everything right, and users see their transactions confirmed only to watch them disappear. Your trade went through! (Wait, no it didn't.) Without penalties for skipping blocks, validators can collude to drop blocks and manipulate transaction ordering for profit. They can, for instance, reorder trades to front-run users or extract maximum value from each block. Every pipelined BFT system before MonadBFT has been vulnerable to tail-forking.
MonadBFT fixes this with reproposal and No-Endorsement Certificates (NECs).
Reproposal means the next leader can't skip ahead when the current leader fails to include the previous block's QC. Before proposing anything new, they check if the previous block got enough votes. If it did, they repropose it first. So if Bob goes offline for any reason and never includes Alice's QC, Carol, the next leader, has to finalize Alice's block before she can propose her own. Bob failed, but Alice gets her rewards regardless.
But what if Alice's block genuinely wasn't ready? Maybe Alice never broadcast it properly in the first place. Carol needs to figure out what actually happened, so she polls validators to see if anyone received Alice's block. If enough validators confirm they never saw it, Carol collects their responses into a No-Endorsement Certificate. This certificate proves the block was never available to begin with. Carol can now skip Alice's block, and going forward nobody can accuse her of acting maliciously since she has the proof.
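Putting the two rules together, the next leader's decision looks roughly like the sketch below. The function, names, and thresholds are ours for illustration, not MonadBFT's actual code:

```python
# A simplified sketch of the decision the next leader faces when the previous
# leader's proposal is unaccounted for. It only illustrates the choice between
# reproposal and a No-Endorsement Certificate (NEC).
def next_leader_action(prev_qc, never_saw_it_responses: int, nec_threshold: int) -> str:
    if prev_qc is not None:
        # Evidence the previous block earned a supermajority: it must be
        # carried forward (reproposed) before anything new is built on top.
        return "repropose previous block"
    if never_saw_it_responses >= nec_threshold:
        # Enough validators attest they never received the block: bundle
        # their responses into a No-Endorsement Certificate and skip it.
        return "skip with No-Endorsement Certificate"
    # Otherwise the leader cannot justify skipping and keeps waiting for
    # either a QC or enough no-endorsement responses.
    return "wait / re-request"
```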
These rules only activate during failures and normal operation stays fast. Once you see the QC for a block, you can treat it as speculatively confirmed. The block will finalize either way, whether the next leader includes it normally or reproposal kicks in.
Monad targets 400ms block times, but the consensus networking finishes well under that. Ethereum, meanwhile, waits a fixed 12 seconds between blocks. Getting BFT consensus right at this speed is genuinely challenging. Most chains choose between having lots of validators or fast finality because doing both is hard. MonadBFT handles both, but fast consensus alone doesn't make a fast blockchain. Those finalized blocks still need to propagate quickly across the entire validator network, which brings us to how Monad optimizes block propagation.
With consensus running at network speed, block propagation becomes the next bottleneck. When a leader creates a block, they need to distribute it to every validator quickly. A block holding 10,000 transactions runs about 2MB. If the leader has to upload 2MB to each of 100+ validators, that's 200MB+ of outbound traffic per block, which would easily bottleneck at the leader's upload speed.
The naive fix would be to have the leader gossip the newest block to a few peers, who rebroadcast it to others, with every node on the network eventually receiving a copy. Naive gossip doesn't work at Monad speeds, however. Every hop through the network adds latency, and if a block has to bounce through three or four validators before reaching the far side of the network, you've easily eaten up your 400ms budget on propagation alone. Gossip is also wasteful, since nodes end up receiving the same data multiple times with nobody coordinating who already has what. And when up to a third of validators might be Byzantine, routing through intermediate nodes opens up obvious attack vectors. What stops a malicious validator from delaying blocks, withholding them, or just dropping them entirely?
RaptorCast solves this with two tricks. First, erasure coding. Splitting a 2MB block into 100 chunks and sending one to each validator sounds reasonable until a few packets get lost over the Atlantic. Now validators are stuck with 97 chunks out of 100 and no way to fill in the gaps. With erasure coding, instead of just dividing the data, you create redundant chunks that contain enough information about each other that you can rebuild the whole block from any sufficiently large subset of chunks.
Take two numbers, 3 and 7, and also send their sum, 10. Now you have three pieces of data where any two can reconstruct the third. If the 3 goes missing, subtract 7 from 10. If the 7 goes missing, subtract 3 from 10. If the sum goes missing, add 3 and 7.
Erasure coding works the same way but at scale. You start with K original chunks and generate N total chunks, where N is larger than K. Any K of those chunks can reconstruct the original data, regardless of which ones you receive.
Recall our original example, where the validator with 97 chunks could not reconstruct the block. With erasure coding, the leader sends enough redundant chunks that any sufficiently large subset can rebuild the whole block, so those 97 chunks are now more than enough.
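Here's a minimal single-parity version of that idea in Python, generalizing the 3 + 7 = 10 example. It only survives one lost chunk; real fountain codes, such as the Raptor codes behind RaptorCast's name, extend the idea so that any K of N chunks are enough.

```python
# K data chunks plus one XOR parity chunk: any single missing chunk can be rebuilt.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(chunks: list[bytes]) -> list[bytes]:
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor_bytes(parity, c)
    return chunks + [parity]

def recover(received: list[bytes | None]) -> list[bytes]:
    missing = [i for i, c in enumerate(received) if c is None]
    assert len(missing) <= 1, "single-parity only survives one loss"
    if missing:
        # XOR of everything that did arrive reconstructs the missing chunk.
        rebuilt = b"\x00" * len(next(c for c in received if c is not None))
        for c in received:
            if c is not None:
                rebuilt = xor_bytes(rebuilt, c)
        received[missing[0]] = rebuilt
    return received[:-1]  # drop the parity chunk, keep the original data

data = [b"aaaa", b"bbbb", b"cccc"]
sent = encode(data)
sent[1] = None                    # one chunk lost over the Atlantic
assert recover(sent) == data
```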
Second, a two-hop broadcast tree. The leader divides up the chunks among validators based on stake weight. Each validator receives their assigned chunks directly from the leader, then broadcasts those chunks to all other validators. So if there are 100 validators and 5000 chunks, each validator receives about 50 chunks from the leader and broadcasts those 50 chunks to the other 99 validators. Every validator ends up with all the chunks, but the leader only had to upload each chunk once.
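The bandwidth math is simple enough to do inline, using the illustrative numbers from above:

```python
# Back-of-envelope leader upload per block.
block_mb = 2
validators = 100

naive_upload = block_mb * (validators - 1)   # leader sends the full block to everyone
two_hop_upload = block_mb                    # each chunk leaves the leader only once
print(naive_upload, "MB vs", two_hop_upload, "MB")  # 198 MB vs 2 MB

# In practice the erasure-coded chunks add some redundancy, so the leader uploads
# a bit more than 2 MB per block, but it stays a small constant factor rather than ~100x.
```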
Taking a step back, we see that consensus now runs fast with MonadBFT and blocks propagate rapidly with RaptorCast. But when do transactions actually get executed, and how does the process differ from vanilla Ethereum?
Transaction execution is a major bottleneck in traditional EVM chains. In Ethereum, one validator is chosen as the leader to propose the next block. To create that block, the leader first runs all the transactions and calculates the new state root, which is a hash that summarizes the new blockchain state after applying those transactions. Only then can they send the block to other validators for consensus. Once received, those validators re-run every transaction to check that they end up with the same state root before voting. So before a block is finalized, transactions get executed twice. Once by the leader to build the block, and again by every validator before they can approve it.
This is also why Ethereum's gas limit is so conservative. All of this execution takes time, blocking consensus. As such, the gas limit has to be set low enough that worst-case execution completes quickly, leaving enough room for the block to be proposed, distributed, re-executed by validators, and finalized within 12 seconds. The gas limit isn't a hardware limitation but rather a timing constraint imposed by having transaction execution in the critical path.
What if the leader could propose a block without running every transaction first? What if validators could just vote immediately without re-running them? The block would get finalized based on the transaction order alone, and everyone could execute afterwards. Well, Monad does exactly that. Validators first agree on the order of transactions without actually executing them. They still perform essential checks, such as validating signatures and nonces and ensuring the sender can pay for gas. Only after a block is finalized do all validators execute its transactions, and they do this while the consensus process is already working on the next block. With execution out of the critical path, the gas limit constraint disappears. Monad can pack far more transactions into each block because execution no longer needs to complete before consensus proceeds.
Since execution happens after consensus, a newly proposed block doesn't have its own state root yet. Each block includes a delayed state root from three blocks back instead, letting nodes verify they're all computing the same results.
If a validator's local execution produces a different result, this mismatch is detected and the node rolls back to the last known good state, re-executing transactions from that point. In practice this rarely happens since everyone is running the same transactions in the same order.
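A toy sketch of that check, with field names that are ours rather than Monad's:

```python
# Block N carries the state root computed after executing block N-3; each
# validator compares it against what its own execution produced.
STATE_ROOT_DELAY = 3

def verify_delayed_root(block: dict, local_roots: dict[int, str]) -> bool:
    target = block["number"] - STATE_ROOT_DELAY
    if target < 0:
        return True                      # nothing to check near genesis
    return block["delayed_state_root"] == local_roots[target]

# If this returns False, the node knows its execution diverged, rolls back to
# the last height where the roots still matched, and re-executes from there.
```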
Using state from three blocks ago means an account might spend its balance before execution catches up. The Reserve Balance mechanism addresses this by forcing each account to set aside a portion of its balance exclusively for gas fees. The leader tracks pending transactions from each account and only includes new ones if their gas costs fit within this reserve. If a transaction would drop the account balance below this reserve at execution time, it simply reverts, making it impossible to spam transactions you can't actually pay for.
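In sketch form, the leader-side inclusion rule might look like this; the names and structure are illustrative, not Monad's implementation:

```python
# Only include a transaction if the account's reserved-for-gas budget still
# covers it plus the gas already committed by its pending transactions.
def can_include(tx: dict, reserve_balance: int, pending_gas_cost: int) -> bool:
    max_gas_cost = tx["gas_limit"] * tx["max_fee_per_gas"]
    return pending_gas_cost + max_gas_cost <= reserve_balance
```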
Now that execution is decoupled from consensus, how do we actually run transactions as efficiently as possible?
To understand Monad's approach, consider how Ethereum goes about transaction execution. In Ethereum, transactions are processed one after another in a single thread. If transaction #2 doesn't touch any of the same state as transaction #1, it still waits in line.
CryptoKitties was a famous app that caused massive gas cost spikes back in 2017. The game became so popular it accounted for 25% of all Ethereum traffic, and with every transaction waiting in line, the entire network became congested.

In the spirit of squeezing the absolute most out of every component, Monad obviously improves on this baseline. Once transactions are ordered, they run optimistically in parallel across multiple CPU cores. Optimistically means the runtime assumes transactions won't conflict and runs them concurrently. Each execution produces a pending result listing which storage slots were read (inputs) and written (outputs), and these pending results then get merged in the original transaction order. Say transaction #1 and transaction #2 both run in parallel, each spending 100 USDC from Alice's account. Transaction #1 merges first, reads Alice's balance as 1000, and updates it to 900. When transaction #2 tries to merge, the system compares its inputs against the outputs of transaction #1. Transaction #2 assumed Alice had 1000, but she now has 900. The inputs don't match, so transaction #2 re-executes against Alice's new balance of 900, correctly updating it to 800.
Our contrived example shows that in the worst case, the same transaction gets executed twice. Re-execution is cheap because not all operations need to be repeated. Expensive computations like signature recovery don't depend on state and can be reused, and the storage slots accessed are already in memory from the first run. It's effectively a two-pass strategy, where the first pass surfaces storage dependencies in parallel and pulls data from SSD into cache, while the second pass re-executes against correct state using that cached data. Even when every transaction conflicts, the parallel execution wasn't wasted because all that storage data is now readily available in memory. Best case, when transactions touch different storage slots, you get linear speedup with core count.
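Here's a compact sketch of that merge logic in Python, using a flat dict as "state" and simulating the parallel phase sequentially. It reproduces the Alice example above; the real engine obviously works at a very different scale.

```python
# Optimistic parallelism with read/write sets, heavily simplified.
def run(tx, state):
    reads, writes = {}, {}
    def read(slot):
        if slot not in writes and slot not in reads:
            reads[slot] = state.get(slot, 0)      # record the value we depended on
        return writes.get(slot, reads.get(slot, state.get(slot, 0)))
    def write(slot, value):
        writes[slot] = value
    tx(read, write)
    return reads, writes

def execute_block(txs, state):
    # Phase 1: run every transaction against the same pre-block snapshot,
    # as if in parallel, recording what each one read and wrote.
    pending = [run(tx, state) for tx in txs]
    # Phase 2: merge in the original order, re-executing on conflict.
    for tx, (reads, writes) in zip(txs, pending):
        if any(state.get(slot, 0) != value for slot, value in reads.items()):
            reads, writes = run(tx, state)        # inputs changed: re-execute
        state.update(writes)
    return state

def spend_100(read, write):
    write("alice", read("alice") - 100)

print(execute_block([spend_100, spend_100], {"alice": 1000}))
# {'alice': 800} -- the second transfer re-executed against the merged balance of 900
```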
Transaction execution is now parallel, but every transaction still needs data from the same underlying database. If that database can only handle one request at a time, it becomes the bottleneck.
All blockchain state in Ethereum lives in a database. When a contract checks your balance or updates a variable, that's a database read or write. On a busy chain, this happens millions of times per day.
Ethereum stores state in a tree structure called a Merkle Patricia Trie. Account data like balances, nonces, and contract storage live in the leaves of this tree, and the tree’s root hash summarizes the entire state. And in the EVM world, even if your data is a duck living inside a smart contract, it ultimately quacks its way up to the root hash.

Why bother storing information in trees anyway? Because they let you cryptographically prove any piece of data exists without downloading the whole database. Want to prove Alice has 5 ETH? You just need a short chain of hashes, known as a Merkle Proof, which verifies the path from her account data up to the state root. Light clients and bridges rely on this efficiency to verify data securely and quickly.
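Here's what verifying such a proof looks like for a simplified binary Merkle tree using SHA-256. Ethereum's real structure is a hexary Patricia trie hashed with Keccak-256, but the "short chain of hashes" idea is identical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_proof(leaf: bytes, proof: list[tuple[bytes, str]], root: bytes) -> bool:
    node = h(leaf)
    for sibling, side in proof:          # side: is the sibling on the left or right?
        node = h(sibling + node) if side == "left" else h(node + sibling)
    return node == root

# Build a tiny 4-leaf tree so we can produce a valid proof for Alice's entry.
leaves = [b"alice:5", b"bob:2", b"carol:9", b"dave:1"]
l0, l1, l2, l3 = (h(x) for x in leaves)
n01, n23 = h(l0 + l1), h(l2 + l3)
root = h(n01 + n23)

# Proving Alice's balance takes only two hashes, not the whole database.
proof_for_alice = [(l1, "right"), (n23, "right")]
assert verify_proof(b"alice:5", proof_for_alice, root)
```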
This way of storing data in trees isn't exactly optimal, but remember the intro where we said all EVM chains need to "quack" the same way? Since Monad implements Ethereum's APIs, it needs to maintain a similar structure.
Ethereum clients don't implement their own storage layer for the trie. Instead, they typically use generic key-value databases such as LevelDB or RocksDB, which maintain their own tree structures internally like LSM trees or B-trees. The practical consequence is that you end up with a tree inside a tree. Every time you want to read one value, you're navigating two layers of data structures, which is a source of unnecessary overhead. An efficient quacking duck can surely do better!
MonadDb implements the Patricia Trie natively on disk, eliminating the tree-within-a-tree overhead entirely. The underlying data structure is only half the problem with traditional databases, however. Synchronous I/O is the other half. Every time a thread waits on a database request it stalls, which bottlenecks parallel transaction execution: all those concurrent storage reads and writes end up waiting in a single-file line.
To address this shortcoming, MonadDb uses asynchronous I/O through io_uring, a Linux kernel interface that lets threads issue I/O requests without stalling. The database runtime submits a request and immediately becomes available to accept another. While transaction A, for example, waits for a slow write, transactions B, C, and D keep executing their reads and may even return earlier than the first write request (how cool is that!).
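io_uring is a Linux kernel interface, so the asyncio sketch below is only an analogy for the scheduling win: requests are submitted without blocking the caller, and fast reads can complete while a slow write is still in flight.

```python
import asyncio
import random

async def storage_request(name: str) -> str:
    await asyncio.sleep(random.uniform(0.01, 0.05))  # simulated SSD latency
    return name

async def main():
    tasks = [asyncio.create_task(storage_request(f"read slot {i}")) for i in range(8)]
    tasks.append(asyncio.create_task(storage_request("write slot 3")))
    for done in asyncio.as_completed(tasks):
        print("completed:", await done)   # completions arrive out of order

asyncio.run(main())
```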
MonadDb further optimizes I/O performance by bypassing the filesystem entirely to communicate directly with the SSD. Filesystems exist because general-purpose operating systems need to manage block allocation, track data locations, and handle many different applications accessing storage in unpredictable ways. MonadDb only needs to store data in a Patricia Trie representation and knows exactly how to access it. The filesystem abstraction only adds overhead without providing any measurable benefit.
The physics haven't changed, however. A single read still takes 40 to 100 microseconds, but with async I/O and direct SSD access many reads now happen in parallel without blocking each other. The database is no longer bottlenecking the other components.
So now we've parallelized transactions and optimized storage access. But what happens when one contract gets extremely popular? When CryptoKitties was at peak demand, the same contract was being called thousands of times per block, for example.
Smart contracts written in Solidity get compiled into EVM bytecode. The bytecode contains simple instructions like "add these two numbers" (ADD) or "store this value" (SSTORE). This bytecode is what actually lives on chain and runs on every node. The Solidity source code, on the other hand, never touches the chain, even when you verify the contract's source code on Etherscan.
When someone calls your contract, the EVM works through the bytecode one instruction at a time. It fetches an opcode, checks if you have enough gas to run it, executes it, and moves on. You've seen this from the other side every time you set a max gas limit in your wallet. Each operation eats into that budget. If you've ever hit an "out of gas" error, that's the EVM running out of budget partway through your transaction and reverting everything. If you've used Python, this should feel familiar. The interpreter executes your code line by line rather than translating it into machine code first, much like the EVM.
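A toy interpreter loop makes the fetch-check-execute rhythm, and the out-of-gas revert, concrete. The opcodes and gas costs here are illustrative, not the real EVM's.

```python
OUT_OF_GAS = "out of gas"

def interpret(bytecode, gas_limit):
    stack, storage, gas, pc = [], {}, gas_limit, 0
    costs = {"PUSH": 3, "ADD": 3, "SSTORE": 20000}
    while pc < len(bytecode):
        op, arg = bytecode[pc]              # fetch the next opcode
        if gas < costs[op]:                 # check the remaining budget
            return OUT_OF_GAS, {}           # revert: discard all state changes
        gas -= costs[op]
        if op == "PUSH":                    # execute, then move on
            stack.append(arg)
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "SSTORE":
            storage[arg] = stack.pop()
        pc += 1
    return "ok", storage

print(interpret([("PUSH", 3), ("PUSH", 7), ("ADD", None), ("SSTORE", 0)], 30000))
# ('ok', {0: 10})
```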
Instead of interpreting EVM bytecode instruction by instruction, JIT (Just-In-Time) compilation translates it into native machine code, the actual 1s and 0s your CPU is built to execute. EVM bytecode runs on a virtual machine, while native code runs directly on your hardware. Java developers will recognize this from the JVM's HotSpot compiler, which tracks frequently called methods and compiles them to native code.
Monad uses the same idea, where new or rarely-used contracts run on a fast interpreter with minimal delay. Meanwhile, the system actively tracks which contracts consume the most gas, and when a contract gets "hot" (think Uniswap or a future Monad-native CryptoKitties equivalent), it enters a compile queue. The compiler translates EVM bytecode into optimized x86-64 machine code that runs directly on the CPU with no interpreter in between.
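A sketch of that tiered policy, with a made-up gas threshold standing in for whatever heuristics the real system uses:

```python
from collections import Counter

class TieredRuntime:
    """Track cumulative gas per contract; 'compile' the ones that get hot."""
    def __init__(self, hot_gas_threshold: int):
        self.gas_used = Counter()
        self.compiled = set()
        self.hot_gas_threshold = hot_gas_threshold

    def execute(self, contract: str, gas: int) -> str:
        self.gas_used[contract] += gas
        if contract in self.compiled:
            return "run native code"
        if self.gas_used[contract] >= self.hot_gas_threshold:
            self.compiled.add(contract)    # in reality: enqueue for the JIT compiler
        return "run in the interpreter"
```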
Monad's performance gains come from fundamental architectural changes to how blockchains handle data, consensus, and execution, not from flashy marketing or overly favorable benchmarks. The components we've covered aren't independent optimizations; they're designed to work together, each one removing a bottleneck that would otherwise limit the others. MonadBFT produces blocks every 400ms, but fast consensus has little impact if blocks take nearly as long to propagate. RaptorCast ensures blocks reach everyone within one network round-trip. Async execution lets validators vote on ordering without waiting for execution to finish. Parallel execution uses all available cores, but only if the database can handle concurrent reads and writes. MonadDb delivers that with async I/O.
Pull any piece out and the performance collapses. Replace MonadDb with LevelDB and your parallel execution threads block on synchronous disk reads. Skip RaptorCast and the leader's upload bandwidth caps your effective block size to a fraction of what it could be. Go back to 12-second block times, and you slow transaction throughput so much that JIT's execution gains cease to matter.

Most fast chains hit high TPS by centralizing, running fewer validators with stricter hardware requirements clustered in the same data centers. Monad got there by solving the actual engineering problems. The result doesn't resemble EOS with 21 block producers or some proof-of-authority setup with three nodes, but rather a proof-of-stake system with a genuinely distributed validator set.
Sure, nodes need 32GB RAM and PCIe Gen4 NVMe SSDs, but that's still consumer hardware, keeping the network accessible to enthusiasts and independent operators. Years of research and careful engineering went into proving you don't have to choose between speed and decentralization. The Monad team asked what performance looks like when you redesign every layer from the ground up to work together. The answer is running on mainnet right now.
quack.