
Welcome to Cryptolytic—a research channel focused on the infrastructure and theory behind crypto. ContributionDAO researcher house

Reaching consensus on the order of transactions within a distributed system is one of the fundamental problems that blockchains are trying to solve. Tendermint (currently known as CometBFT) is one of the novel solutions used in many blockchains as a consensus engine, especially in the Cosmos ecosystem. Tendermint is a Byzantine Fault Tolerant (BFT) consensus protocol used to order transactions in a distributed network to ensure that all honest nodes agree on the same value, even in the presence of adversarial nodes.
Typically, each blockchain uses a different consensus protocol that suits its purpose. Today, our focus lies in understanding the mechanics of Tendermint, which are relatively easy to understand compared with other protocols. In this article, you will learn how nodes in the network interact to achieve consensus in Tendermint. We hope this article gives all readers an intuition for how blockchains work, as well as a foundation for understanding other consensus protocols, which will be covered in future articles.
Note: This article was originally written in 2023, so some information may not be up to date.
A Byzantine Fault Tolerant (BFT) consensus protocol is designed to withstand malicious or faulty behavior from nodes within a network. These faulty nodes are referred to as Byzantine due to their similarity to the classical Byzantine Generals' Problem.
Tendermint is a BFT consensus algorithm that operates under a partially synchronous communication model. It can tolerate fewer than one-third Byzantine nodes (i.e., the system remains safe if $n > 3f$, where $n$ is the total number of nodes and $f$ is the maximum number of faulty nodes).
In simple terms, imagine a scenario where the network is attacked or unexpectedly suffers an internet or power outage, causing some nodes to become unresponsive or fail to cast votes—this would be considered Byzantine behavior. The partially synchronous model assumes that such disruptions will eventually stop, allowing the network to recover and resume normal operation after the attack.
You may wonder why the $n > 3f$ assumption is necessary. As a bonus, I’ve provided a rough proof at the end of this article.
Before delving into how Tendermint works, let's first introduce the key components that are important in the Tendermint protocol so that we can go through the protocol without losing continuity:
Block: A block stores a batch of transactions. It also contains other metadata, such as the previous block height, Merkle root of transactions, etc., but we’ll ignore that here.
Block height: Block height is a unique, incrementing index specifying each block's position within a blockchain.
Nodes (validator nodes): Nodes are responsible for maintaining a full copy of the historical record of blocks and transactions. They actively engage in voting for block validity and reaching consensus on a block at each height. Each node can be identified by its public key.
Round: Nodes in the Tendermint protocol might be required to participate in multiple rounds before agreeing on a block at a specific height.
Proposer: A proposer, also known as a leader node, is responsible for proposing a block to other nodes. At each height, validators take turns as proposer, one per round. As a result, each round has exactly one proposer, and all nodes know who the current leader is.

At a high level, Tendermint uses a round-based voting mechanism to confirm a block at each height. Nodes in the protocol exchange three types of messages before committing to a block, as follows:
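PROPOSAL: A proposal for a block that a leader, or proposer, proposes at a given height and round, in the hope that the network will eventually commit the block. A PROPOSAL message includes the proposed block $B$, the height $h$, and the round $r$, represented as $\langle \text{PROPOSAL}, B, h, r \rangle$.
PREVOTE: The first stage of voting. A PREVOTE message includes the identity of the voter $i$, the block being voted on $B$, the height $h$, and the round $r$. We represent this as $\langle \text{PREVOTE}, i, B, h, r \rangle$. A node can move to the next voting phase only after receiving PREVOTE messages from at least two-thirds of all nodes.
PRECOMMIT: The second stage of voting. A PRECOMMIT message carries the same fields as a PREVOTE, and we represent it as $\langle \text{PRECOMMIT}, i, B, h, r \rangle$. A node commits a block only after receiving PRECOMMIT messages for it from at least two-thirds of all nodes.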
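To make the message shapes concrete, here is a minimal Python sketch. The class and field names (`Proposal`, `Vote`, `kind`, `voter`) are illustrative assumptions, not Tendermint's actual wire format, and a plain string stands in for real block data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    """A PROPOSAL message: the proposed block, height and round."""
    block: str    # a string stands in for the full block (assumption)
    height: int
    round: int


@dataclass(frozen=True)
class Vote:
    """A PREVOTE or PRECOMMIT message; both carry the same fields."""
    kind: str              # "PREVOTE" or "PRECOMMIT"
    voter: str             # identity of the voting node
    block: Optional[str]   # None plays the role of nil
    height: int
    round: int


proposal = Proposal(block="X", height=10, round=0)
prevote = Vote(kind="PREVOTE", voter="C", block=proposal.block, height=10, round=0)
nil_prevote = Vote(kind="PREVOTE", voter="D", block=None, height=10, round=0)
```

Freezing the dataclasses makes messages hashable, so duplicates from the same voter collapse naturally when they are collected in a set.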
Based on the previously described message exchanges, the protocol can be divided into three stages: propose, prevote, and precommit. Additionally, the protocol relies on three timeouts:
proposeTimeout: This timer activates when a node starts a new round. If the node fails to send a PREVOTE message within the specified time interval dT, the proposeTimeout is triggered.
prevoteTimeout: Once a node sends its initial PREVOTE message, the timer starts. It is triggered if, within the time frame dT, the node doesn't receive supporting PREVOTE messages for a specific block from at least two-thirds of the nodes.
precommitTimeout: Similar to prevoteTimeout, but used with PRECOMMIT messages.
These timeouts allow a node to proceed to the next phase in case it gets stuck or fails to make progress in a given stage.

Assuming we are at the start of height h, each node begins with the following initial variables:
lockedBlock: the block that has received two-thirds of PREVOTE votes (initially nil)
lockedRound: the round associated with lockedBlock
$h$: the current height
The local state of a node can be represented by the tuple $(\text{lockedBlock}, \text{lockedRound}, h)$.
In the first round, $r = 0$, the proposer proposes a block $B$ and broadcasts $\langle \text{PROPOSAL}, B, h, r \rangle$ with its signature to the other nodes. Upon receiving the proposal, a non-leader node performs two checks:
It verifies that all transactions in $B$ are valid.
It confirms that the message is correctly signed by the proposer.
If both checks pass, the node moves to the prevote stage and broadcasts $\langle \text{PREVOTE}, i, B, h, r \rangle$ to the other nodes.
If node $i$ doesn’t receive any valid PROPOSAL message within the time interval dT, the proposeTimeout is triggered. In that case, it broadcasts a $\langle \text{PREVOTE}, i, \text{nil}, h, r \rangle$ message and likewise moves to the prevote stage.
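Once a node receives PREVOTE messages for block $B$ from at least two-thirds of the nodes, it locks on $B$, updates its local state to lockedBlock $= B$ and lockedRound $= r$, and advances to the second stage. It then broadcasts $\langle \text{PRECOMMIT}, i, B, h, r \rangle$ to signal its support for committing the block.
If a node does not receive enough PREVOTE messages for $B$ before the prevoteTimeout expires, it proceeds to the precommit stage and broadcasts $\langle \text{PRECOMMIT}, i, \text{nil}, h, r \rangle$ instead.
Depending on which case applies, the node enters the precommit stage either locked on $B$ (lockedBlock $= B$, lockedRound $= r$) or with its lock unchanged.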
Note that in round r, a node must precommit to the same block it prevoted for. For example, if it prevotes for nil, it must also precommit to nil in the same round.
To achieve consensus and commit the current block $B$, a node waits until it receives PRECOMMIT messages in support of $B$ from at least two-thirds of the nodes. Once the node decides on $B$, it proceeds to the next height $h + 1$ with round $r = 0$ and resets its local variables.
However, if the node fails to reach a decision before the timeout window dT elapses (the countdown starts after the node sends its first PRECOMMIT message), the precommitTimeout is triggered. This causes the node to move to the next round, $r + 1$, and reset its stage to the propose phase.
Note that a node skips directly to the precommit phase if it receives two-thirds of PREVOTE messages for nil. Similarly, it moves to the next round if it receives two-thirds of PRECOMMIT messages for nil.
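The vote counting in the prevote and precommit stages can be sketched as follows. This is a simplified illustration under stated assumptions: every node has equal weight, each node casts one vote per stage, and the threshold is "at least two-thirds" as worded in this article. The helper names `has_quorum`, `quorum_block` and `precommit_choice` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def has_quorum(count: int, n: int) -> bool:
    """True when `count` votes make up at least two-thirds of `n` nodes."""
    return 3 * count >= 2 * n  # integer arithmetic avoids float rounding


def quorum_block(votes: Iterable[Optional[str]], n: int) -> Tuple[bool, Optional[str]]:
    """Return (found, block). A quorum for nil is reported as (True, None)."""
    for block, count in Counter(votes).items():
        if has_quorum(count, n):
            return True, block
    return False, None


def precommit_choice(prevotes: Iterable[Optional[str]], n: int) -> Optional[str]:
    """Block to PRECOMMIT after the prevote stage; None stands for nil.

    A quorum of prevotes for a block yields that block. A quorum for nil,
    or no quorum before prevoteTimeout, yields nil.
    """
    found, block = quorum_block(prevotes, n)
    return block if found else None


print(precommit_choice(["X", "X", "X", None], n=4))   # quorum for "X"
print(precommit_choice(["X", "X", None, None], n=4))  # no quorum, so nil
```

The same `quorum_block` check applies to PRECOMMIT messages: a quorum for a block means commit, while a quorum for nil or a precommitTimeout means moving to the next round.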
The flow above describes how nodes behave in the first round, $r = 0$. However, as shown in the last stage, a node may enter a new round if no block is committed in the current one. Let’s now explore how the protocol behaves in subsequent rounds.
During the propose phase of round $r > 0$, the proposer checks its local state to identify the most recent block that successfully passed the prevote stage. Let's denote this block as $B'$. It is stored in the proposer's state as lockedBlock $= B'$ with lockedRound $= r'$, where $r' < r$.
If such a block exists, the proposer reuses it and broadcasts the message $\langle \text{PROPOSAL}, B', h, r \rangle$.
If there is no such block (i.e., lockedBlock is nil), the proposer is free to propose a new block $B$ and sends $\langle \text{PROPOSAL}, B, h, r \rangle$.
When a node receives a PROPOSAL message for block $B$ during round $r > 0$, it evaluates the proposal under the following conditions:
The node accepts the proposal (i.e., broadcasts $\langle \text{PREVOTE}, i, B, h, r \rangle$ and moves to the prevote phase) if its local lockedBlock is either nil or equal to $B$.
If the node’s lockedBlock does not match the proposed block $B$, it does not support the proposal and instead waits for the proposeTimeout to trigger. When that happens, it broadcasts $\langle \text{PREVOTE}, i, \text{nil}, h, r \rangle$ and proceeds to the prevote phase without voting for any block.
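The acceptance rule above fits in a few lines. This is a minimal sketch of the rule as this article describes it, not CometBFT's actual implementation. The function name `prevote_target` is hypothetical, and `None` stands for nil.

```python
from typing import Optional


def prevote_target(locked_block: Optional[str], proposed_block: str) -> Optional[str]:
    """Decide what a node prevotes for when it receives a proposal.

    The node supports the proposal only if it holds no lock or is locked
    on the same block. Otherwise it prevotes nil (None), which in the
    protocol happens once proposeTimeout fires.
    """
    if locked_block is None or locked_block == proposed_block:
        return proposed_block
    return None


print(prevote_target(None, "Y"))  # unlocked node supports the new proposal
print(prevote_target("X", "X"))   # locked node supports its own locked block
print(prevote_target("X", "Y"))   # locked node refuses a conflicting block
```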
This locking mechanism plays a crucial role in preserving safety in the consensus process. (See [1], Section 3.2.3 for a more detailed explanation.)
Similar to the description for round $r = 0$, in round $r > 0$, nodes follow the same process:
If a node receives two-thirds of the PREVOTE messages supporting $B$, it updates its local state, broadcasts $\langle \text{PRECOMMIT}, i, B, h, r \rangle$, and proceeds to the precommit phase.
If it doesn't receive enough support and the prevoteTimeout expires, it still moves to the precommit phase and votes nil.
In the precommit phase, the node faces two possible outcomes:
It commits to $B$ if it receives at least two-thirds of the PRECOMMIT messages for the block.
Otherwise, if the precommitTimeout is triggered without sufficient support, it advances to the next round, $r + 1$.
Tendermint’s design guarantees that, as long as fewer than one-third of the validators are Byzantine, all honest nodes will eventually agree on the same valid block and move to the next height. This property ensures the safety and liveness of the consensus protocol, even in the presence of faulty or malicious nodes.
The two stages of voting are required to ensure the consistency of the protocol. A single stage of voting would leave the system vulnerable to manipulation by malicious nodes. For example, a Byzantine node could partition the network and cause some honest nodes to commit to one block in round $r$, while others commit to a different block in a later round.
The two-stage mechanism prevents this by introducing a soft commitment in the first stage. When an honest node receives two-thirds of the PREVOTE messages for a block, it becomes locked on that block but does not yet commit to it. This locking mechanism signals that the block may be finalized, allowing nodes to coordinate before moving to the commit phase.
In the second stage, the node confirms its commitment by broadcasting a PRECOMMIT message. However, if a node later sees sufficient evidence—two-thirds of PREVOTE messages for a different block in a higher round—it can update and follow the newer proposal, helping lagging nodes catch up safely.
Let’s now explore how two-stage voting ensures safety with an example. First, consider a one-stage protocol involving four nodes (A, B, C, and D), with A as the proposer and B as the Byzantine node.
Node A proposes block "X".
Nodes C and D immediately vote for block "X" upon receiving A’s proposal.
Byzantine node B manipulates the network so that only A receives votes from C and D, while C and D are prevented from seeing each other’s votes.
Node A, including its own vote and those from C and D, observes a two-thirds majority and commits to block "X".
Meanwhile, because B doesn’t vote and C and D can’t see each other’s responses, they are unable to gather a sufficient majority. As a result, they time out and start a new round.
Node B becomes the new proposer and proposes block "Y".
Without any locking mechanism from a pre-vote phase, nodes C and D are free to vote for block "Y" and eventually commit to it.
As a result, node A commits to "X", while C and D commit to "Y", violating safety and creating a fork in the network.
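The one-stage scenario can be replayed with a small tally. This sketch assumes equal-weight nodes and the "at least two-thirds" threshold used in this article. It also assumes that C and D each see A's vote alongside their own, which the scenario leaves open. All variable names are illustrative.

```python
N = 4  # nodes A, B, C, D; B is Byzantine


def has_quorum(count: int, n: int = N) -> bool:
    """True when `count` votes make up at least two-thirds of `n` nodes."""
    return 3 * count >= 2 * n


committed = {}

# Round 0: voters for "X" that each honest node has seen.
# B hides C's and D's votes from each other, so only A sees three votes.
round0_seen = {"A": {"A", "C", "D"}, "C": {"A", "C"}, "D": {"A", "D"}}
for node, voters in round0_seen.items():
    if has_quorum(len(voters)):
        committed[node] = "X"

# Round 1: B proposes "Y". With no lock, C and D vote for it and see
# all three votes (B, C, D), so both commit "Y".
round1_seen = {"C": {"B", "C", "D"}, "D": {"B", "C", "D"}}
for node, voters in round1_seen.items():
    if node not in committed and has_quorum(len(voters)):
        committed[node] = "Y"

fork = len(set(committed.values())) > 1
print(committed, "fork:", fork)
```

With four nodes the threshold works out to three votes, which is why A alone can commit "X" in round 0 while C and D, holding two votes each, cannot.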
Now consider the same setup with two-stage voting:
Node A proposes block "X".
Nodes C and D immediately pre-vote for "X" upon receiving the proposal.
Byzantine node B disrupts the network, similar to the one-stage voting case, preventing C and D from seeing each other's pre-votes.
Node A receives pre-votes from C and D, locks on block "X", but does not yet commit to it.
After a timeout, nodes B, C, and D start a new round.
Node B proposes a new block "Y".
Since C and D are not locked on "X", they are free to pre-vote for "Y". B, C, and D then pre-vote for block "Y".
Later, node A sees that a majority of nodes have pre-voted for "Y", so it unlocks from "X" and locks onto "Y" instead.
This prevents the inconsistency seen in the one-stage voting scenario, where nodes could commit to different blocks.
Let’s say C and D both receive enough PREVOTE messages (at least two-thirds) for block "X", so they lock on "X" and proceed to the pre-commit phase.
Each of them sends a PRECOMMIT for "X".
At this point, Byzantine node B manipulates the network and prevents C and D from seeing each other’s PRECOMMIT messages.
Because they cannot gather two-thirds of PRECOMMITs, neither C nor D commits to "X".
However, since they are already locked on "X", they will not support any new proposal unless they see two-thirds of PREVOTEs for a different block in a future round.
Node B may propose a new block "Y" in the next round, but C and D will not vote for it. Unlocking "X" requires strong evidence (two-thirds PREVOTEs for a new block), which is impossible unless honest nodes change their votes — and they won’t, due to the locking rule.
This locking rule ensures that once honest nodes soft-commit to a block (with two-thirds of PREVOTE messages), they do not switch to a different one, preserving safety even in the presence of Byzantine behavior.
For a more in-depth explanation, [3] is a recommended resource to explore this topic further.
You’ve probably seen the 33% threshold appear in many blockchain whitepapers, articles, and of course, in this one. It’s closely tied to the partially synchronous model, which introduces this limit to ensure both safety and liveness in the presence of faulty or malicious nodes. Let’s sketch an intuitive proof here as a bonus for readers who aren’t sure where this number comes from. Feel free to skip this section if you’ve already seen it elsewhere.
This sketch is inspired by the works cited in [1] and [2]. You might have seen the threshold expressed as $n > 3f$, where:
$n$ is the total number of nodes in the network
$f$ is the number of Byzantine (i.e., faulty or malicious) nodes the protocol can tolerate
Rearranging this gives us $f < n/3$, meaning the protocol is safe only if fewer than one-third of nodes are Byzantine. We’ll use the form $n > 3f$ here because it’s easier to reason about in this proof.
Suppose a protocol has $n$ nodes and can tolerate up to $f$ Byzantine nodes. Now imagine an honest node has been waiting but hasn't seen enough messages to make progress. This could happen due to network delays or because Byzantine nodes are deliberately withholding or delaying messages.
In such a situation, it is reasonable for the honest node to proceed once it hears from $n - f$ nodes. This is a safe number to wait for because, even in the worst case, the remaining $f$ nodes may simply be unresponsive or malicious.
Now consider what could happen in the worst case:
Suppose $f$ honest nodes are delayed and unable to broadcast their messages in time.
As a result, a node may receive messages from only $n - f$ nodes. Among those, $f$ are from Byzantine nodes.
That means the number of honest votes the node actually receives may be as few as $n - 2f$.
Now, the best course of action for an honest node is to make a decision based on the majority (more than 50%) of the votes it receives from the $n - 2f$ honest nodes and $f$ Byzantine nodes, without being deceived by the Byzantine nodes.
We require:
More than 50% of the total number of votes received must come from honest nodes. We can express this requirement in mathematical form as:
$$\begin{aligned}
\text{50\% of the total number of votes received} &> \text{number of Byzantine nodes} \\
\frac{1}{2} \times (\text{remaining honest nodes} + \text{Byzantine nodes}) &> f \\
\frac{1}{2}(n - 2f + f) &> f \\
\frac{1}{2}(n - f) &> f \\
n - f &> 2f \\
n &> 3f
\end{aligned}$$
Which gives us the result $n > 3f$.
This is the origin of the famous 33% Byzantine fault threshold.
Please note that this is just a sketch, not a rigorous proof. [1] and [2] are the recommended resources for the proof.
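As a quick numeric sanity check of the inequality chain above, the following sketch confirms that the "honest majority among received votes" condition and $n > 3f$ agree for small networks, and computes the largest tolerable $f$ for a given $n$. The function names `max_faulty` and `honest_majority` are hypothetical.

```python
def max_faulty(n: int) -> int:
    """Largest f that satisfies n > 3f."""
    return (n - 1) // 3


def honest_majority(n: int, f: int) -> bool:
    """Half of the n - f received votes must exceed f Byzantine votes.

    Written as n - f > 2f to stay in integer arithmetic.
    """
    return n - f > 2 * f


# Both conditions agree on every (n, f) pair tried.
for n in range(1, 200):
    for f in range(0, n + 1):
        assert honest_majority(n, f) == (n > 3 * f)

for n in (3, 4, 7, 100):
    print(n, max_faulty(n))
```

For example, four nodes tolerate one Byzantine node, while three nodes tolerate none, which is why four is the smallest network size used in the examples above.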
[1] https://atrium.lib.uoguelph.ca/items/5459099e-67aa-4a23-83ae-d3471d8d8336
[2] https://github.com/timroughgarden/fob21/blob/main/l/l6.pdf
[3] https://github.com/timroughgarden/fob21/blob/main/l/l7.pdf
Reaching consensus on the order of transactions within a distributed system is one of the fundamental problems that blockchains are trying to solve. Tendermint (currently known as CometBFT) is one of the novel solutions used in many blockchains as a consensus engine, especially in the Cosmos ecosystem. Tendermint is a Byzantine Fault Tolerant (BFT) consensus protocol used to order transactions in a distributed network to ensure that all honest nodes agree on the same value, even in the presence of adversarial nodes.
Typically, each blockchain uses a different consensus protocol that suits its purpose. Today, our focus lies in understanding the mechanics of Tendermint, which are relatively easy to understand compared with other protocols. In this article, you will learn how nodes in the network interact to achieve consensus in Tendermint. We hope this article gives all readers an intuition for how blockchains work, as well as a foundation for understanding other consensus protocols, which will be covered in future articles.
Note: This article was originally written in 2023, so some information may not be up to date.
A Byzantine Fault Tolerant (BFT) consensus protocol is designed to withstand malicious or faulty behavior from nodes within a network. These faulty nodes are referred to as Byzantine due to their similarity to the classical Byzantine Generals' Problem.
Tendermint is a BFT consensus algorithm that operates under a partially synchronous communication model. It can tolerate up to one-third Byzantine nodes (i.e., the system remains safe if , where n is the total number of nodes and is the maximum number of faulty nodes).
In simple terms, imagine a scenario where the network is attacked or unexpectedly suffers an internet or power outage, causing some nodes to become unresponsive or fail to cast votes—this would be considered Byzantine behavior. The partially synchronous model assumes that such disruptions will eventually stop, allowing the network to recover and resume normal operation after the attack.
You may wonder why the assumption is necessary. As a bonus, I’ve provided a rough proof at the end of this article.
Before delving into how Tendermint works, let's first introduce the key components that are important in the Tendermint protocol so that we can go through the protocol without losing continuity:
Block: A block stores a batch of transactions. It also contains other metadata, such as the previous block height, Merkle root of transactions, etc., but we’ll ignore that here.
Block height: Block height is a unique, incrementing index specifying each block's position within a blockchain.
Nodes (validator nodes): Nodes are responsible for maintaining a full copy of the historical record of blocks and transactions. They actively engage in voting for block validity and reaching consensus on a block at each height. Each node can be identified by its public key.
Round: Nodes in the Tendermint protocol might be required to participate in multiple rounds before agreeing on a block at a specific height.
Proposer: A proposer, also known as a leader node, is responsible for proposing a block to other nodes. At each height, validators take turns proposing new blocks in each round. As a result, there will be only one unique proposer, and all nodes will know who the current leader is.

At a high level, Tendermint uses a round-based voting mechanism to confirm a block at each height. Nodes in the protocol exchange three types of messages before committing to a block, as follows:
PROPOSAL: A proposal for a block that a leader, or proposer, proposes at a given height and round, in the hope that the network will eventually commit the block. A PROPOSAL message includes the proposed block , the height , and the round , represented as .
PREVOTE: The first stage of voting. A PREVOTE message includes the identity of the voter , the block being voted on , the height , and the round . We represent this as . A node can move to the next voting phase only after receiving PREVOTE messages from at least two-thirds of all nodes.
Based on the previously described message exchanges, the protocol can be divided into three stages: propose, prevote, and precommit. Additionally, the protocol relies on three timeouts:
proposeTimeout: This timer activates when a node starts a new round. If the node fails to send a PREVOTE message within the specified time interval dT, the is triggered.
prevoteTimeout: Once a node sends its initial PREVOTE message, the timer starts. It is triggered if, within the time frame dT, the node doesn't receive supporting PREVOTE messages for a specific block from at least two-thirds of the nodes.
precommitTimeout: Similar to , but used with PRECOMMIT messages.
These timeouts allow a node to proceed to the next phase in case it gets stuck or fails to make progress in a given stage.

Assuming we are at the start of height h, each node begins with the following initial variables:
: the block that has received two-thirds of PREVOTE votes (initially nil)
: the round associated with lockedBlock
: the current height
The local state of a node can be represented by the tuple .
In this round, the proposer proposes a block and broadcasts with its signature to the other nodes. Upon receiving , a non-leader node performs two checks:
It verifies that all transactions in are valid.
It confirms that the message is correctly signed by proposer .
If both checks pass, the node updates its local state to and broadcasts to other nodes.
If node i doesn’t receive any valid PROPOSAL message within the time interval dT, the is triggered. In that case, it broadcasts a message and updates its local state to .
Once a node receives PREVOTE messages for block from at least two-thirds of the nodes, it locks on , updates its local state to , and advances to the second stage. It then broadcasts to signal its support for committing the block.
If a node does not receive enough PRECOMMIT messages for before the expires, it proceeds to the precommit stage and broadcasts instead.
Depending on its local state, the node’s state becomes either:
, or
Note that in round r, a node must precommit to the same block it prevoted for. For example, if it prevotes for nil, it must also precommit to nil in the same round.
To achieve consensus and commit the current block , a node waits until it receives at least two-thirds of the PRECOMMIT messages from other nodes in support of . Once the node decides on , it proceeds to the next height with round and resets its local variables.
However, if the node fails to reach a decision before the timeout window dT begins (which starts counting down after it sends its first PRECOMMIT message), the is triggered. This causes the node to move to the next round, , and reset its stage to propose phase.
Note that a node skips directly to the precommit phase if it receives two-thirds of PREVOTE messages for . Similarly, it moves to the next round if it receives two-thirds of PRECOMMIT messages for .
The flow above describes how nodes behave in the first round . However, as shown in the last stage, a node may enter a new round if no block is committed in the current one. Let’s now explore how the protocol behaves in subsequent rounds.
During the propose phase of round , the proposer checks its local state to identify the most recent block that successfully passed the prevote stage. Let's denote this block as . It is stored in the proposer's state as lockedBlock, with the state represented as , where .
If such a block exists, the proposer reuses it and broadcasts the message .
If there is no such block (i.e., ), the proposer is free to propose a new block and sends .
When a node receives a PROPOSAL message for block during round , it evaluates the proposal under the following conditions:
The node accepts the proposal (i.e., broadcasts and moves to the precommit phase) if its local lockedBlock is either nil or equal to .
If the node’s does not match the proposed block , it does not support the proposal and instead waits for the proposeTimeout to trigger. When that happens, it broadcasts and proceeds to the prevote phase without voting for any block.
This locking mechanism plays a crucial role in preserving safety in the consensus process. (See [1], Section 3.2.3 for a more detailed explanation.)
Similar to the description for round , in round , nodes follow the same process:
If a node receives two-thirds of the PREVOTE messages supporting , it updates its local state, broadcasts , and proceeds to the precommit phase.
If it doesn't receive enough support and the expires, it still moves to the precommit phase and votes .
In the precommit phase, the node faces two possible outcomes:
It commits to if it receives at least two-thirds of the PRECOMMIT messages for the block.
Otherwise, if the is triggered without sufficient support, it advances to the next round, .
Tendermint’s design guarantees that, as long as fewer than one-third of the validators are Byzantine, all honest nodes will eventually agree on the same valid block and move to the next height. This property ensures the safety and liveness of the consensus protocol, even in the presence of faulty or malicious nodes.
The two stages of voting are required to ensure the consistency of the protocol. A single stage of voting would leave the system vulnerable to manipulation by malicious nodes. For example, a Byzantine node could partition the network and cause some honest nodes to commit to one block in round , while others commit to a different block in a later round.
The two-stage mechanism prevents this by introducing a soft commitment in the first stage. When an honest node receives two-thirds of the PREVOTE messages for a block, it becomes locked on that block but does not yet commit to it. This locking mechanism signals that the block may be finalized, allowing nodes to coordinate before moving to the commit phase.
In the second stage, the node confirms its commitment by broadcasting a PRECOMMIT message. However, if a node later sees sufficient evidence—two-thirds of PREVOTE messages for a different block in a higher round—it can update and follow the newer proposal, helping lagging nodes catch up safely.
Let’s now explore how this two-stage voting process ensures safety with an example: consider a one-stage protocol involving four nodes—A, B, C, and D—with A as the proposer and B as the Byzantine node.
Node A proposes block "X".
Nodes C and D immediately vote for block "X" upon receiving A’s proposal.
Byzantine node B manipulates the network so that only A receives votes from C and D, while C and D are prevented from seeing each other’s votes.
Node A, including its own vote and those from C and D, observes a two-thirds majority and commits to block "X".
Meanwhile, because B doesn’t vote and C and D can’t see each other’s responses, they are unable to gather a sufficient majority. As a result, they timeout and start a new round.
Node B becomes the new proposer and proposes block "Y".
Without any locking mechanism from a pre-vote phase, nodes C and D are free to vote for block "Y" and eventually commit to it.
As a result, node A commits to "X", while C and D commit to "Y", violating safety and creating a fork in the network.
Node A proposes block "X".
Nodes C and D immediately pre-vote for "X" upon receiving the proposal.
Byzantine node B disrupts the network, similar to the one-stage voting case, preventing C and D from seeing each other's pre-votes.
Node A receives pre-votes from C and D, locks on block "X", but does not yet commit to it.
After a timeout, nodes B, C, and D start a new round.
Node B proposes a new block "Y".
Since C and D are not locked on "X", they are free to pre-vote for "Y". B, C, and D then pre-vote for block "Y".
Later, node A sees that a majority of nodes have pre-voted for "Y", so it unlocks from "X" and locks onto "Y" instead.
This prevents the inconsistency seen in the one-stage voting scenario, where nodes could commit to different blocks.
Let’s say C and D both receive enough PREVOTE messages (at least two-thirds) for block "X", so they lock on "X" and proceed to the pre-commit phase.
Each of them sends a PRECOMMIT for "X".
At this point, Byzantine node B manipulates the network and prevents C and D from seeing each other’s PRECOMMITmessages.
Because they cannot gather two-thirds of PRECOMMITs, neither C nor D commits to "X".
However, since they are already locked on "X", they will not support any new proposal unless they see two-thirds of PREVOTEs for a different block in a future round.
Node B may propose a new block "Y" in the next round, but C and D will not vote for it. Unlocking "X" requires strong evidence (two-thirds PREVOTEs for a new block), which is impossible unless honest nodes change their votes, and the locking rule prevents them from doing so.
This locking rule ensures that once honest nodes soft-commit to a block (with two-thirds of PREVOTE messages), they do not switch to a different one, preserving safety even in the presence of Byzantine behavior.
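To see why B's proposal "Y" gets stuck, here is a minimal Python sketch of the locking rule as described above. The function names `prevote_choice` and `update_lock` are invented for this example, and it assumes the worst case where node A is not locked and simply follows B's proposal.

```python
import math

N = 4
QUORUM = math.ceil(2 * N / 3)  # 3 of 4 nodes


def prevote_choice(locked, proposal):
    """A locked node keeps pre-voting for its locked block; an unlocked node follows the proposal."""
    return locked if locked is not None else proposal


def update_lock(locked, prevote_tally):
    """Move the lock only when some block gathers a quorum of PREVOTEs."""
    for block, count in prevote_tally.items():
        if count >= QUORUM:
            return block
    return locked


# Honest nodes after the interrupted round: C and D are locked on "X".
# Assumption for the worst case: A is not locked.
locks = {"A": None, "C": "X", "D": "X"}

# Next round: Byzantine B proposes "Y" and pre-votes for it.
prevotes = {name: prevote_choice(lock, "Y") for name, lock in locks.items()}
prevotes["B"] = "Y"

tally = {}
for block in prevotes.values():
    tally[block] = tally.get(block, 0) + 1

new_locks = {name: update_lock(lock, tally) for name, lock in locks.items()}
print(tally, new_locks)
```

Even with A's help, "Y" collects only two PREVOTEs, so nobody reaches the quorum needed to unlock and C and D stay on "X".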
For a more in-depth explanation, [3] is a recommended resource to explore this topic further.
You’ve probably seen the 33% threshold appear in many blockchain whitepapers, articles, and of course, in this one. It’s closely tied to the partially synchronous model, which introduces this limit to ensure both safety and liveness in the presence of faulty or malicious nodes. Let’s sketch an intuitive proof here as a bonus for readers who aren’t sure where this number comes from. Feel free to skip this section if you’ve already seen it elsewhere.
This sketch is inspired by the works cited in [1] and [2]. You might have seen the threshold expressed as $n \geq 3f + 1$, where:
$n$ is the total number of nodes in the network
$f$ is the number of Byzantine (i.e., faulty or malicious) nodes the protocol can tolerate
Rearranging this gives us $f < \frac{n}{3}$, meaning the protocol is safe only if fewer than one-third of nodes are Byzantine. We’ll use the form $n \geq 3f + 1$ here because it’s easier to reason about in this proof.
Suppose a protocol has $n$ nodes and can tolerate up to $f$ Byzantine nodes. Now imagine an honest node has been waiting but hasn't seen enough messages to make progress. This could happen due to network delays or because Byzantine nodes are deliberately withholding or delaying messages.
In such a situation, it is reasonable for the honest node to proceed once it hears from $n - f$ nodes. This is a safe number because, even in the worst case, the remaining $f$ nodes may simply be unresponsive or malicious.
Now consider what could happen in the worst case:
Suppose $f$ honest nodes are delayed and unable to broadcast their messages in time.
As a result, a node may receive messages from only $n - f$ nodes. Among those, $f$ are from Byzantine nodes.
That means the number of honest votes the node actually receives may be as low as $n - 2f$.
Now, the best course of action for an honest node is to decide based on the majority (more than 50%) of the votes it receives, which come from $n - 2f$ honest nodes and $f$ Byzantine nodes, without being deceived by the Byzantine nodes.
We require:
More than 50% of the total number of votes received must come from honest nodes. We can express this requirement in mathematical form as:
$$\begin{aligned}
\text{50\% of the total number of votes received} &> \text{number of Byzantine nodes} \\
\frac{1}{2} \times (\text{remaining honest nodes} + \text{Byzantine nodes}) &> f \\
\frac{1}{2}(n - 2f + f) &> f \\
\frac{1}{2}(n - f) &> f \\
n - f &> 2f \\
n &> 3f
\end{aligned}$$
Which gives us the result $n > 3f$, that is, $n \geq 3f + 1$ for whole numbers of nodes.
This is the origin of the famous 33% Byzantine fault threshold.
Please note that this is just a sketch, not a rigorous proof. [1] and [2] are the recommended resources for the proof.
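If you want to play with the numbers, the worst-case condition from the sketch can be checked in a few lines of Python. The function names `honest_majority_guaranteed` and `max_tolerated` are made up for this example; they only encode the inequality $\frac{1}{2}(n - f) > f$ used in the sketch, not any protocol's actual code.

```python
def honest_majority_guaranteed(n, f):
    """Worst case from the sketch: the node hears from n - f nodes, f of them Byzantine."""
    received = n - f
    # Byzantine votes must stay below half of the votes received.
    return received / 2 > f


def max_tolerated(n):
    """Largest f that still satisfies n > 3f."""
    return (n - 1) // 3


for n in (3, 4, 6, 7, 10):
    f = max_tolerated(n)
    print(n, f, honest_majority_guaranteed(n, f), honest_majority_guaranteed(n, f + 1))
```

For every network size in the loop, the condition holds at `max_tolerated(n)` and fails as soon as one more Byzantine node is added, which matches the $n > 3f$ bound.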
[1] https://atrium.lib.uoguelph.ca/items/5459099e-67aa-4a23-83ae-d3471d8d8336
[2] https://github.com/timroughgarden/fob21/blob/main/l/l6.pdf
[3] https://github.com/timroughgarden/fob21/blob/main/l/l7.pdf
PRECOMMIT: The second stage of voting. A PRECOMMIT message includes the node identity , the voted block , the height , and the round . It is represented as . A node can commit to and store block locally, then move on to the next height , only after receiving PRECOMMIT messages for block from at least two-thirds of the nodes.
: the current round
: the current stage