zk maxi helping bring research into the mainstream.





Scaling the execution layer via rollups, validiums, and volitions is only one half of a two-part solution. For a truly scalable and decentralized blockchain to exist, the data computed and stored on-chain must reside somewhere without high storage or computing requirements. Ten years from now, when significantly more people, institutions, and daily users interact with the blockchain in some way, far more efficient state and data storage will be required. Separating the monolith requires existing monolithic nodes to fragment into specialized, task-specific nodes.
Right now, an Ethereum full node’s size is growing at an average rate of 78% per year.

There are currently ~185 million unique Ethereum addresses. To onboard the world, the data must be stored in such a way that node overhead is minimal. The lower the compute requirements, the more nodes can be run, reducing centralization risk.

In the current monolithic blockchain spec, data and state reside alongside the execution and consensus layers inside the L1. Recent market events show that simply increasing throughput by increasing node workload is unsustainable at the best of times and ineffective at the worst.
The solution to reach the Endgame and solve the scalability trilemma lies in separating the monolith.

The previous article on zkSNARKs vs zkSTARKs focused on the execution layer and the efforts being made to separate it from the monolith by using zero knowledge proofs to bring execution off-chain. This article covers the other side: data availability and state.
As explained in a recent Twitter thread, Ethereum is a state machine.
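The state-machine view can be sketched in a few lines of Python. This is a hypothetical, heavily simplified model: real Ethereum transactions also involve gas, nonces, signatures, and contract execution, all omitted here.

```python
def apply_transaction(state, tx):
    """Return a new state with tx['value'] moved from sender to recipient."""
    sender, recipient, value = tx["from"], tx["to"], tx["value"]
    if state.get(sender, 0) < value:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # state transitions produce a new state
    new_state[sender] -= value
    new_state[recipient] = new_state.get(recipient, 0) + value
    return new_state

state = {"alice": 100, "bob": 25}
state = apply_transaction(state, {"from": "alice", "to": "bob", "value": 40})
print(state)  # {'alice': 60, 'bob': 65}
```

Each block is, in effect, a batch of such transitions applied to the previous state.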
Ethereum’s state holds all of the account balances, smart contracts, and storage. Without any adjustments, Ethereum’s state would grow forever, constantly increasing the hardware requirements to run a node until it becomes impossible. The fix is statelessness, but in order to understand it fully, we need to understand what an Ethereum block actually is under the hood.
Each part of an Ethereum block is organized as a trie. Tries (commonly pronounced like "try" or "tree") are a highly efficient way to store and look up data. Ethereum’s state trie is stored alongside the transaction trie and receipts trie, each of which every full node is currently required to handle by itself. Ethereum has two types of data: ephemeral data (such as account balances) and permanent data (such as transactions). A diagram of a block’s structure is below.
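As a rough illustration of why tries matter here: a single root hash commits to every leaf, so a block header can commit to the whole data set with one 32-byte value. This sketch uses a plain binary SHA-256 Merkle tree rather than Ethereum's actual hexary Merkle Patricia trie with Keccak-256, so treat it as a simplified model.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Fold a list of leaves up to a single 32-byte root hash."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

txs = [b"tx1", b"tx2", b"tx3"]
root = merkle_root(txs)
# Changing any single transaction changes the root:
assert merkle_root([b"tx1", b"tx2", b"tampered"]) != root
```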

Technically, the state isn’t gone; in Stateless Ethereum, it has been moved from the node to other participants in the network. Instead of handling the full state, the node holds a block witness: a packet of data needed to validate the transactions held in that block. Giving the node only the data it needs, and nothing more, massively reduces its overhead requirements. The diagram below visualizes the block lifecycle in the Stateless Ethereum network.
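A minimal sketch of how a witness is checked, assuming a toy two-leaf binary Merkle tree rather than Ethereum's real Merkle Patricia trie: the node recomputes the state root from the one leaf it needs plus the sibling hashes supplied in the witness, without ever holding the full state.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_witness(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the root from a leaf plus its sibling hashes (the witness).

    `proof` is a list of (sibling_hash, leaf_is_left) pairs, one per tree level.
    """
    node = h(leaf)
    for sibling, leaf_is_left in proof:
        node = h(node + sibling) if leaf_is_left else h(sibling + node)
    return node == root

# Tiny two-leaf "state": the witness for leaf_a is just leaf_b's hash.
leaf_a, leaf_b = b"alice:60", b"bob:65"
root = h(h(leaf_a) + h(leaf_b))
assert verify_witness(leaf_a, [(h(leaf_b), True)], root)
assert not verify_witness(b"alice:999", [(h(leaf_b), True)], root)
```

The witness proves the leaf's inclusion against the root; a forged balance fails the check.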

Much like zero knowledge proofs allow execution to move off-chain, data availability proofs allow data storage to move off-chain in the same way. This was broken down at a high level in a Twitter thread.
The data availability layer, much like the execution layer, requires a mathematical proof showing that even if the data is off-chain, it has not been manipulated, censored, or modified in any way. The magic behind data availability proofs lies in erasure codes. Erasure codes are very common in computer science; non-blockchain implementations include satellite communications, CD-ROMs, and QR codes. The first blockchain-specific paper on data availability and erasure coding was co-authored by Mustafa Al-Bassam, Alberto Sonnino, and Vitalik Buterin in 2018. In short, erasure coding allows a complete data set to be rebuilt from a sufficiently large partial data set plus the redundant code symbols.
This means that even if a block producer wanted to censor just a small fraction of the data, it would have to withhold over half of the entire set, because the erasure code can reassemble anything less. This makes off-chain data storage provably secure, drastically reducing a node’s storage requirements.
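A toy sketch of the idea, using polynomial interpolation over a small prime field. Production erasure codes (e.g. the Reed-Solomon codes used in data availability sampling schemes) use optimized fields and two-dimensional extensions, so the parameters here are illustrative only: 4 data symbols are extended to 8 shares, and any 4 surviving shares recover everything.

```python
P = 65537  # prime modulus for the toy field GF(P)

def interpolate(points, x0):
    """Evaluate, at x0, the unique polynomial through the given (x, y) points."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num = den = 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x0 - xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, -1, P)) % P  # modular inverse
    return total

def encode(data, n):
    """Extend k data symbols (at x = 0..k-1) to n shares (at x = 0..n-1)."""
    points = list(enumerate(data))
    return [(x, interpolate(points, x)) for x in range(n)]

data = [11, 22, 33, 44]   # k = 4 original symbols
shares = encode(data, 8)  # n = 8 shares: tolerates losing any 4
surviving = shares[1::2]  # a producer withheld half the shares...
recovered = [interpolate(surviving, x) for x in range(4)]
assert recovered == data  # ...and the data is still fully recoverable
```

To actually censor a symbol, the producer would have to withhold more than half the shares, which light clients can detect by randomly sampling a few of them.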
According to past research and current work being done by developers, solving the scalability trilemma is possible. Like everything, it’s easier said than done, but within a few years, the technology will become robust, uncensorable, and cheap to use.

Thanks for reading this article! This is the summation of a lot of research to help bridge the knowledge gap. Follow me on Twitter to get notified about future posts and let me know what you think. My content will remain 100% free, forever, and is licensed under CC BY-SA unless otherwise specified.