# Let’s Talk About Data Availability

By [Polygon Village](https://paragraph.com/@polygonvillage) · 2022-11-15

---

First, what’s data availability?
================================

According to [ethereum.org](https://ethereum.org/en/developers/docs/data-availability/), data availability (DA) is “the guarantee that the block proposer published all transaction data for a block and that the transaction data is available to other network participants.”

But how exactly do you guarantee data is available?

For most Layer 1s (L1s), it’s pretty straightforward. L1 nodes know transaction data is available by downloading and executing it themselves. This is how nodes verify blocks and is at the core of how blockchains work.

Layer 2s (L2s) change the paradigm. L2s (specifically [rollups](https://vitalik.ca/general/2021/01/05/rollup.html)) use fancy cryptographic proofs to guarantee blocks are valid without nodes having to execute every transaction. This unlocks massive [benefits](https://ethereum.org/en/developers/docs/scaling/#:~:text=Scaling%20overview,-As%20the%20number&text=The%20main%20goal%20of%20scalability,more%20on%20the%20Ethereum%20vision\).) and [new L1 designs](https://research.thetie.io/danksharding-ethereums-scalability-killer-post-merge/#:~:text=Danksharding%20gets%20its%20name%20from,called%20the%20%E2%80%9CScalability%20Killer%E2%80%9D.)!

But not so fast. [Rollups still need data to be available](https://www.paradigm.xyz/2022/08/das), just for different reasons.

![Rollup data availability](https://storage.googleapis.com/papyrus_images/752cb1d2b6a22086933106fdec5ccca5c88311f0831168654f17458752cd1b60.png)

Rollup data availability

So how do we scale? Seems like we are back to where we started.

DA layers
=========

Introducing DA layers

DA layers specialize in, as you might expect, assuring nodes that data is available. This can take different forms, including:

*   DA blockchains
    
*   DA committees
    
*   DA middleware
    
*   Data sharding
    

We’re only going to discuss the first two, but here are a few resources if you want to learn about [DA middleware](https://www.youtube.com/watch?v=OtUOXTqrSyg) and [data sharding](https://research.thetie.io/danksharding-ethereums-scalability-killer-post-merge/#:~:text=Danksharding%20gets%20its%20name%20from,called%20the%20%E2%80%9CScalability%20Killer%E2%80%9D.).

DA blockchains vs. DA committees
================================

Because it’s still very expensive to post data on Ethereum, most rollup teams are posting their data off-chain. This design technically classifies them as [validiums](https://ethereum.org/en/developers/docs/scaling/validium/).

Ethereum’s [data-sharding](https://notes.ethereum.org/@vbuterin/proto_danksharding_faq) roadmap solves the problem and enables cheap rollup data, but to be safe, let’s assume we’re a year away from the first major upgrade. In the meantime, rollup teams have two major options: DA committees and DA blockchains.

DA committees are selected entities that hold off-chain copies of the transaction data and promise to make it available in case of emergency. These [committees](https://medium.com/starkware/data-availability-e5564c416424) often [have 7-10 members](https://blog.celestia.org/ethereum-off-chain-data-availability-landscape/) and are a slight improvement over fully relying on the rollup operator.

![](https://storage.googleapis.com/papyrus_images/72975f855239864a48470a104765642ca7ec3847040c441dbaa60ba0d8265cef.png)
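To make the committee model concrete, here is a minimal Python sketch of a k-of-n check, assuming a rollup operator only accepts a batch once enough committee members attest that they hold it. The `Attestation` type, the member names, and the threshold of 6 out of 8 are all hypothetical and chosen for illustration. Real committees verify cryptographic signatures, which this sketch leaves out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    member: str      # hypothetical committee member ID
    data_hash: str   # hex digest of the batch the member says it stores


def batch_hash(batch: bytes) -> str:
    """Hash the off-chain batch so members can attest to a short digest."""
    return hashlib.sha256(batch).hexdigest()


def quorum_reached(attestations, committee, expected_hash, threshold):
    """Return True if at least `threshold` distinct committee members
    attested to `expected_hash`. Signature checks are omitted here."""
    signers = {
        a.member
        for a in attestations
        if a.member in committee and a.data_hash == expected_hash
    }
    return len(signers) >= threshold


# Illustrative setup: an 8-member committee (the article cites 7-10 members).
committee = {f"member-{i}" for i in range(8)}
digest = batch_hash(b"rollup batch #1")

attestations = [Attestation(f"member-{i}", digest) for i in range(6)]
attestations.append(Attestation("outsider", digest))                   # not on the committee
attestations.append(Attestation("member-7", batch_hash(b"other")))     # wrong batch

print(quorum_reached(attestations, committee, digest, threshold=6))  # True
print(quorum_reached(attestations, committee, digest, threshold=7))  # False
```

The takeaway is that the guarantee is only as strong as the small, known set of members behind it, which is the weakness DA blockchains try to address.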

DA blockchains take the idea a few steps further by replacing small, permissioned committees with large, permissionless committees that have strong economic incentives to behave.

DA layers vs. data storage layers
=================================

A common mistake is to equate data availability with data storage. They are not the same thing.

An easy way to see the difference is to think in terms of time.

DA layers make sure nodes can access data on a short time horizon. Their main goal is to smoothly progress blockchain state, and they typically do not make assurances about longer time horizons. [As ethereum.org puts it](https://ethereum.org/en/developers/docs/data-availability/), “data availability is relevant when a block is yet to pass consensus.”

In fact, DA layers might even discard the data after a few weeks. In Ethereum’s next major upgrade, this data will be pruned [after ~2 weeks](https://www.eip4844.com/).

Data storage layers make sure data is available on a longer time horizon and are closer to the cloud storage solutions most web2 developers are familiar with. Of course, it’s not hard to imagine web3 developers opting for decentralized versions like [Arweave](https://www.arweave.org/).

[Sanjay Shah ⚡️ (@sanjaypshah)](https://twitter.com/sanjaypshah/status/1580221148494561280), Oct 12, 2022:

“15/ Here's a visual of the full data flow. With this, rollups get strong guarantees they can re-create the rollup state in an emergency.”

![](https://storage.googleapis.com/papyrus_images/c218ed8c7568a5ae7ecfced2c1787ae6bdb506ee48dd8119a6a3d1d362e44992.jpg)

DA layer use cases
==================

There are many things that can be built on top of DA layers. Let’s touch on three:

*   [Validiums](https://blog.polygon.technology/from-rollup-to-validium-with-polygon-avail/)
    
*   [Sovereign rollups](https://blog.celestia.org/sovereign-rollup-chains/)
    
*   [Reliable data feeds](https://youtu.be/Cwbbxb987vE?t=343)
    

As we mentioned earlier, validiums are common today. Even after Ethereum has implemented its own sharded DA layer, it’s likely that rollup teams will still use off-chain data to reduce costs. Developers have always pushed the boundaries of what’s possible.

Sovereign rollups not only use DA layers for data availability but also for consensus. Applications are likely good candidates to become sovereign rollups (rather than [smart contract rollups](https://members.delphidigital.io/reports/the-complete-guide-to-rollups/) or validiums) if they need full control over state transitions yet don’t want to worry about a validator set.

In his [recent talk](https://youtu.be/Cwbbxb987vE?t=152), [Balaji Srinivasan](https://twitter.com/balajis) envisions a future where “fiat information” competes with “crypto information.” He describes “[reliable data feeds](https://youtu.be/Cwbbxb987vE?t=343)” using crypto oracles like [Chainlink](https://chain.link/), where IRL metadata is posted on chain. That data could be posted onto DA layers.

![Source: Creating Sources of Definitive Truth With Blockchain Oracles](https://storage.googleapis.com/papyrus_images/77665c3c9885c49058050bd1bae3136c6e5b07d9e20beb3dd166588b1b0e17a0.png)

Source: Creating Sources of Definitive Truth With Blockchain Oracles

![Source: Creating Sources of Definitive Truth With Blockchain Oracles](https://storage.googleapis.com/papyrus_images/c37a4cf431f6a27c9657dfedf5b97a545b1948a7f0a5f64825ad978b5a2f8416.png)

Source: Creating Sources of Definitive Truth With Blockchain Oracles

DA layer endgames
=================

It’s early days for DA layers. [Polygon Avail](https://polygon.technology/blog/polygon-avails-ability-to-scale-the-way-forward), [EigenDA](https://www.layrlabs.com/products), and [Celestia](https://blog.celestia.org/july-engineering-update/) are all still in testnet, and Ethereum data sharding is 1-3 years away, depending on the upgrade in question.

However, there’s plenty to look forward to. Let’s highlight what seems to be a common endgame across the board. Most teams envision something like this:

Progressively increasing block sizes and sharding them across the network

[Logarithmic Rex (@LogarithmicRex)](https://twitter.com/LogarithmicRex/status/1585441338794745856), Oct 26, 2022:

“(18/30) The transition from Proto-Danksharding to Danksharding involves two important changes:  
\- available blobs per block will increase from 1 to 64 (as of now)  
\- blob data will be distributed across the network, so that no single node needs to download them all”

Relieving nodes of downloading full blocks using [KZG commitments](https://twitter.com/SalomonCrypto/status/1583705993300492288)

[polynya (@apolynya)](https://twitter.com/apolynya/status/1512133442401554434), Apr 7, 2022:

“Polygon Avail is going to be the first data availability sampling network that uses KZG proofs. Similar technology will power danksharding in the future, enabling rollups to hit limitless scale in tandem with increasing decentralization or security.”

Quoting [Polygon (@0xPolygon)](https://twitter.com/0xPolygon/status/1511709595060031491):

“[#Polygon](https://twitter.com/hashtag/Polygon) is building a modular suite of scaling solutions to empower chains and dApps of any size. And today, we’re sharing our vision for @0xPolygonAvail, a new data availability blockchain that improves scalability across the board. Learn more: [bit.ly/Polygon-Avail](https://t.co/XMBtmI4W1C)”

![](https://pbs.twimg.com/media/FPqr9qJaIAEVWe-.jpg)

Maintaining low verification costs with [data availability sampling](https://hackmd.io/@vbuterin/sharding_proposal#ELI5-data-availability-sampling).

[Logarithmic Rex (@LogarithmicRex)](https://twitter.com/LogarithmicRex/status/1584559577307103232), Oct 24, 2022:

“(11/21) Yes, no single node will download the entire block, but if we are careful about how we break up our blob and ensure our sampling is random enough, we can be confident the entire block is available.”

![](https://storage.googleapis.com/papyrus_images/617a8e63131e4fa8d0cde10e4e8f648d29fe2cca57d10982e4c018ec803ba174.jpg)
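The sampling argument above can be made concrete with a little arithmetic. The Python sketch below is a back-of-the-envelope model, not any network's actual parameters. It assumes the block is erasure-coded so that an attacker must withhold at least half of the chunks to make it unrecoverable, and that a light client's samples are independent and uniformly random. Under those assumptions, each sample of an unrecoverable block succeeds with probability at most 1/2, so the chance of being fooled shrinks exponentially with the number of samples. The function names and the default fraction are illustrative.

```python
import math


def fooled_probability(samples: int, min_withheld_fraction: float = 0.5) -> float:
    """Upper bound on the chance that every sample succeeds even though
    the block is unrecoverable (assumes independent, uniform samples)."""
    if samples < 0:
        raise ValueError("samples must be non-negative")
    if not 0.0 < min_withheld_fraction < 1.0:
        raise ValueError("min_withheld_fraction must be between 0 and 1")
    return (1.0 - min_withheld_fraction) ** samples


def samples_needed(target: float, min_withheld_fraction: float = 0.5) -> int:
    """Smallest number of samples that pushes the bound down to `target`."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be between 0 and 1")
    return math.ceil(math.log(target) / math.log(1.0 - min_withheld_fraction))


print(fooled_probability(10))   # bound after 10 samples
print(samples_needed(1e-9))     # 30
```

In this model, about 30 small samples are enough to push the failure bound below one in a billion, regardless of how large the block is. That is why verification costs can stay low while block sizes grow.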

Eventually, we get to a place where DA layers enable [high throughput](https://twitter.com/SalomonCrypto/status/1559402384526258176) applications while trust-minimized light clients [verify on mobile devices](https://twitter.com/musalbas/status/1480901457633239048?s=46&t=M0oqH6RTgCxC6ESgFxGrqw).

That’s right - performance and decentralization!

Wrapping up
===========

Hopefully, this article helped you gain more familiarity with data availability. The goal was to offer a broad overview and address common misperceptions about the topic.

There are many deep dives into how it works, so if you want to jump down the rabbit hole, here are some resources:

*   [Polygon blog](https://blog.polygon.technology/category/polygon-solutions/polygon-avail/)
    
*   [Haym Salomon threads](https://salomoncrypto.notion.site/Ethereum-9dfdf1b2cd334bd8b713b8f8a1f5f26b)
    
*   [Paradigm blog](https://www.paradigm.xyz/2022/08/das)
    

As always, this article is based on a snapshot in time, and web3 moves _very_ quickly. The technology and timelines mentioned might change.

To keep up with the latest, I recommend following along with sources like the [Polygon website](https://blog.polygon.technology/), the [Polygon DAO blog](https://mirror.xyz/polygonvillage.eth), and [The Village Times newsletter](https://polygonvillage.substack.com/). And to get involved, come join us at [Polygon DAO](https://discord.gg/zp7KMqvN).

---

*Originally published on [Polygon Village](https://paragraph.com/@polygonvillage/let-s-talk-about-data-availability)*
