
Lessons learned from Base’s recent block building outage
Base is committed to building in the open, including public retrospectives to share learnings when issues arise.
On 09/21/2024 at 15:14 UTC, Base Mainnet experienced a 17-minute block building outage. The integrity of the chain was not affected, all funds on Base were safe, and block production resumed after we mitigated the incident. This retrospective dives into the root cause, the impact, how we mitigated it, and what we plan to improve moving forward.
The root cause of the block building outage was a misconfiguration on our sequencer cluster. When the current block producer became unhealthy, the cluster was unable to start block building on another instance. The incident was mitigated by manually starting block production on a correctly configured instance.

Block Production
No blocks were produced for 17 minutes, beginning at 15:14 UTC. Blocks 20071146 to 20071691 contain no user transactions, as they were created by the protocol after sequencing resumed.
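As a rough cross-check of the numbers above, the size of the empty block range can be converted into a duration. This is a minimal sketch that assumes Base's 2-second block time, which this post does not state; the function names are illustrative only.

```python
# Block range quoted in this post as containing no user transactions.
FIRST_EMPTY_BLOCK = 20071146
LAST_EMPTY_BLOCK = 20071691

# Assumption (not stated in this post): one block every 2 seconds.
BLOCK_TIME_SECONDS = 2


def blocks_in_range(first: int, last: int) -> int:
    """Number of blocks in an inclusive block range."""
    return last - first + 1


def implied_minutes(block_count: int, block_time_seconds: int) -> float:
    """Wall-clock time covered by block_count blocks, in minutes."""
    return block_count * block_time_seconds / 60


if __name__ == "__main__":
    count = blocks_in_range(FIRST_EMPTY_BLOCK, LAST_EMPTY_BLOCK)
    minutes = implied_minutes(count, BLOCK_TIME_SECONDS)
    print(f"{count} blocks cover about {minutes:.1f} minutes")
```

Under that assumption the range covers 546 blocks, or about 18 minutes, which is roughly in line with the reported 17 minutes. The exact boundaries of the gap are not given here, so the small difference should not be read as significant.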
Transaction Processing
Transactions are submitted to Base through the `eth_sendRawTransaction` RPC call, which places them in the mempool. During the incident, the mempool instances continued to function correctly. However, fewer transactions were submitted in that time frame, which can be seen in the graph below.
There was an immediate drop in both successful and failed `eth_sendRawTransaction` requests after the outage started, followed by a slow rise in failed requests. Our current hypothesis is that fewer transactions were submitted because applications were impaired by the halt in block production.

Once block production resumed, many of the transactions that were submitted during the incident were included in the blocks immediately following 20071691.
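For readers unfamiliar with the call, the sketch below shows the shape of an `eth_sendRawTransaction` JSON-RPC request body. The method name and its single hex-string parameter come from the standard Ethereum JSON-RPC interface; the helper name and the placeholder transaction bytes are illustrative and are not a valid signed transaction.

```python
import json


def build_send_raw_transaction(signed_tx_hex: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request body for eth_sendRawTransaction.

    signed_tx_hex must be the 0x-prefixed hex encoding of an already
    signed transaction; signing is out of scope for this sketch.
    """
    if not signed_tx_hex.startswith("0x"):
        raise ValueError("signed transaction must be 0x-prefixed hex")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_sendRawTransaction",
        "params": [signed_tx_hex],
    }


if __name__ == "__main__":
    # Placeholder bytes only, not a real transaction.
    body = build_send_raw_transaction("0xdeadbeef")
    print(json.dumps(body))
```

A node that accepts the request places the transaction in the mempool and returns its hash. Inclusion in a block is a separate step, which is why submissions could still be accepted while no blocks were being built.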
Background
Over the past year, Base has designed and built op-conductor to improve the reliability of block production. Our goal in building op-conductor is to increase the overall availability of the system, with a target of 99.99% availability. Prior to op-conductor, any failure of the sequencer would result in an outage. op-conductor enables us to operate multiple sequencers and, upon a failure, start block production on a healthy instance.
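To put the 99.99% target in perspective, the sketch below converts an availability target into a downtime budget. It assumes the target is measured over a 365-day year, which this post does not specify; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day measurement window


def downtime_budget_minutes(availability: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Maximum downtime allowed by an availability target over a period."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_minutes


if __name__ == "__main__":
    budget = downtime_budget_minutes(0.9999)
    outage_minutes = 17  # duration reported in this retrospective
    print(f"yearly budget: {budget:.2f} minutes")
    print(f"share used by this outage: {outage_minutes / budget:.0%}")
```

Under that assumption, 99.99% allows roughly 52.6 minutes of downtime per year, so a single 17-minute outage uses about a third of the yearly budget.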

On 9/20/2024, we migrated block production from the single sequencer to the op-conductor cluster. However, the op-conductor instances were in a misconfigured state in which op-node was not submitting new unsafe block payloads to op-conductor.
Trigger
On 9/21/2024 at 15:14 UTC, the currently active sequencer experienced delays in block production. op-conductor correctly detected the issue and began the process to transfer leadership to another instance. As part of the leadership transfer, op-conductor stopped the local op-node from building blocks.
Due to the misconfiguration, the new block producer was unable to start production as the start operation requires the unsafe payload from op-conductor, which the previous leader did not write. This caused the cluster to enter a state in which no instance was able to become an active block producer.
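The failure mode described above can be illustrated with a toy model. This is a sketch under stated assumptions, not op-conductor's actual code or API: the class names, the shared payload store, and the `reports_payloads` flag are all hypothetical and exist only to show why stopping the old leader left no instance able to start.

```python
class SharedPayloadStore:
    """Hypothetical stand-in for the unsafe payload state held by the conductor cluster."""

    def __init__(self):
        self.latest_unsafe_payload = None


class Sequencer:
    """Hypothetical sequencer instance (an op-node paired with a conductor)."""

    def __init__(self, name, store, reports_payloads):
        self.name = name
        self.store = store
        # False models the misconfiguration: built blocks are never reported.
        self.reports_payloads = reports_payloads
        self.active = False
        self.head = 0

    def build_block(self):
        if not self.active:
            raise RuntimeError(f"{self.name} is not the active block producer")
        self.head += 1
        if self.reports_payloads:
            self.store.latest_unsafe_payload = self.head

    def stop(self):
        self.active = False

    def start(self):
        # Starting requires the latest unsafe payload written by the previous leader.
        if self.store.latest_unsafe_payload is None:
            return False
        self.head = self.store.latest_unsafe_payload
        self.active = True
        return True


def transfer_leadership(old_leader, new_leader):
    """Stop the old leader first, then try to start the new one."""
    old_leader.stop()
    return new_leader.start()


def simulate(reports_payloads):
    store = SharedPayloadStore()
    a = Sequencer("a", store, reports_payloads)
    b = Sequencer("b", store, reports_payloads)
    a.active = True  # bootstrap: "a" is the initial block producer
    a.build_block()
    started = transfer_leadership(a, b)
    return started, [s.name for s in (a, b) if s.active]


if __name__ == "__main__":
    print("misconfigured:", simulate(reports_payloads=False))
    print("correct:", simulate(reports_payloads=True))
```

In the misconfigured run, the old leader has already been stopped by the time the new one fails to start, so the model ends with no active block producer. In the correctly configured run, the new instance picks up from the reported payload and takes over.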
Below is a log snippet containing one sample of a failed leadership transfer:

Mitigation
The incident was mitigated by reverting to the single sequencer topology while the op-conductor cluster configuration was fixed.
We implemented a bidirectional handshake between op-node and op-conductor at startup to ensure that communication between them is configured correctly.
We plan to improve our internal configuration management process to prevent and detect misconfigurations.
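This post does not describe how the handshake works, so the following is only a sketch of the general idea: each side confirms at startup that it can reach the other, and the process refuses to start if either direction fails. All names here are hypothetical and do not reflect the real op-node or op-conductor interfaces.

```python
class Endpoint:
    """Hypothetical component with a configured peer (None models missing config)."""

    def __init__(self, name, configured_peer=None):
        self.name = name
        self.configured_peer = configured_peer

    def can_reach(self, other):
        # In a real system this would be a network call; here it is a config check.
        return self.configured_peer == other.name


def bidirectional_handshake(node, conductor):
    """Return a list of problems; an empty list means both directions work."""
    problems = []
    if not node.can_reach(conductor):
        problems.append(f"{node.name} cannot reach {conductor.name}")
    if not conductor.can_reach(node):
        problems.append(f"{conductor.name} cannot reach {node.name}")
    return problems


def start_up(node, conductor):
    """Fail fast at startup instead of discovering the problem during failover."""
    problems = bidirectional_handshake(node, conductor)
    if problems:
        raise RuntimeError("refusing to start: " + "; ".join(problems))
    return "started"


if __name__ == "__main__":
    conductor = Endpoint("op-conductor", configured_peer="op-node")
    broken_node = Endpoint("op-node")  # peer not configured
    try:
        start_up(broken_node, conductor)
    except RuntimeError as err:
        print(err)
    print(start_up(Endpoint("op-node", configured_peer="op-conductor"), conductor))
```

The design point is that a one-directional check would have passed in a situation like this incident, where one side could not deliver data to the other. Checking both directions at startup surfaces that class of misconfiguration before a leadership transfer depends on it.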