EIP-7928 (Block-Level Access Lists) is the headliner1 of the upcoming Glamsterdam upgrade, expected to activate mid-year. The EIP website summarizes it as a feature that unlocks “parallel transaction execution on Ethereum”. In this article we’ll see what that means, how the EIP works, and why it’s designed the way it is.
This will be a roundabout explanation: we’ll intentionally go through a false start to better understand the actual solution. If you only want a quick overview, I recommend watching this short video instead.
In Ethereum, each new block is proposed by a randomly chosen validator and propagated to the rest of the network, where other nodes receive it and check that it’s valid. The most time-consuming part of this validation is re-executing all the transactions in the block to verify that the resulting state matches the expected one.

Transactions within a block have an order determined by the block builder. This order has to be respected when a block is verified; if it’s not, the resulting state could be different. For example, imagine a block with only two transactions, both calling the same contract:

Here we need to execute transactions in order to get the expected state. But this is not always the case. Take this other example:

In this case, the order in which we execute the transactions doesn’t matter because the resulting state is the same. Re-executing these transactions sequentially would be wasteful: we could run both of them at the same time without affecting the result.
These examples illustrate that sometimes transactions can be parallelized and sometimes not. Can we somehow exploit this to re-execute blocks faster?
We can describe Example 1 in the previous section by saying that tx 2 depends on tx 1. But what do we mean exactly by “depends”? For now let’s use this definition:
Given two transactions A and B, we say that B depends on A if:
A precedes B in the block.
A writes some state that B reads.
In Example 1, tx 2 reads a storage slot (x) that was previously written by tx 1, causing a dependency. Keep in mind that "state" can mean a contract's storage, as in that example, but it can also mean other parts of the world state, like the balance of an account or the code at some address.
Now let’s ask, what would happen if we could know in advance all the dependencies in a block? Ignore for now how we would figure that out; just assume we do. If we had that information, we could use it to have some degree of parallelization during execution. For example, given a block with these transactions:

Then we could execute the first three transactions in parallel, and then2 execute the last transaction:

At this point you might be wondering how realistic this is for real blocks. Here's an actual dependency graph, taken from dependency.pics:

In this case, the whole block could be re-executed in just four batches, which seems pretty good for a block with 161 transactions. But we might not always be that lucky:

Here we have a long dependency chain, significantly reducing the speedup we could get from parallelization.
In summary, if we (somehow) had the dependency graph of a set of transactions, we could re-execute them with some degree of parallelization, but how much benefit we'd obtain from that would depend on the structure of the graph. It's easy to imagine a pathological case where each transaction depends on the previous one, making sequential re-execution inevitable.
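To make this concrete, here's a minimal sketch (my own, not from the EIP) of how a block could be split into parallel execution batches, assuming the dependency graph is given as a dict mapping each transaction index to the set of indices it depends on:

```python
def execution_batches(deps):
    """Group transactions into batches: all transactions within a
    batch are mutually independent and can be executed in parallel.
    `deps` maps each tx index to the set of tx indices it depends on."""
    remaining = set(deps)
    done = set()
    batches = []
    while remaining:
        # A transaction is ready once all of its dependencies have run.
        batch = {tx for tx in remaining if deps[tx] <= done}
        batches.append(sorted(batch))
        done |= batch
        remaining -= batch
    return batches

# The four-transaction example: txs 1-3 are independent, tx 4 depends on tx 2.
print(execution_batches({1: set(), 2: set(), 3: set(), 4: {2}}))
# [[1, 2, 3], [4]]

# Pathological case: a chain forces fully sequential execution.
print(execution_batches({1: set(), 2: {1}, 3: {2}}))
# [[1], [2], [3]]
```

This is just a level-order traversal of the DAG; a real client could start each transaction as soon as its own dependencies finish instead of waiting for a whole batch.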
So far we’ve assumed that we have a dependency graph we can work with, but where does that dependency graph come from? How can we know which transactions depend on which?
We said that there is a dependency between two transactions if one of them writes some state that the other one has to read later. But since the EVM is Turing complete, there’s no way to know this in advance without actually executing them. Maybe we could take a conservative approach and say something like “if two transactions call the same contract, they have a dependency”, but it’s easy to show that this doesn’t work. In Example 2 we saw two transactions that call the same contract but can be executed independently:

Worse still, we could have two transactions that call different contracts but still have a dependency between them:

We can’t escape the fact that we need to execute the transactions to build the dependency graph. But we want the dependency graph to speed up the execution of a block’s transactions…
This seems like an unsolvable problem, but let’s remember how block production works: a single validator proposes a block and the rest of the nodes in the network verify it. This means that the block proposer3 could execute the transactions sequentially, build the dependency graph in the process, and propagate the graph along with the block. The other validators can then use this data to parallelize re-execution.
To build the dependency graph, we just need to run the transactions sequentially and keep track of which state is read or written by which transactions.

In this example, tx i writes to state[x], which is later read by tx j, so we would add a dependency from j to i. If we did this for all transactions we'd get a list of dependencies for each one, representing the dependency graph of the block.
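A sketch of that bookkeeping, where `run` is a hypothetical callback standing in for actual EVM execution (here I'm assuming it returns the sets of state keys a transaction read and wrote):

```python
def build_dependency_graph(txs, run):
    """Execute txs sequentially, tracking the last writer of each state
    key; a read of a previously written key creates a dependency."""
    last_writer = {}  # state key -> index of the last tx that wrote it
    deps = {i: set() for i in range(len(txs))}
    for i, tx in enumerate(txs):
        reads, writes = run(tx)  # hypothetical: (keys read, keys written)
        for key in reads:
            if key in last_writer:
                deps[i].add(last_writer[key])
        for key in writes:
            last_writer[key] = i
    return deps

# Toy block: tx 0 writes x, tx 1 later reads x, so tx 1 depends on tx 0.
txs = [{"reads": set(), "writes": {"x"}},
       {"reads": {"x"}, "writes": set()}]
print(build_dependency_graph(txs, lambda tx: (tx["reads"], tx["writes"])))
# {0: set(), 1: {0}}
```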
This approach seems to work. But we don't get full parallelization and the worst cases are still sequential. Can we do better?
Let's change our strategy and ask instead: what would we need to be able to run all transactions in parallel, no matter what?
Let's look at the previous example again. Transaction j has a dependency because it reads state[x], which was modified by transaction i. But if j knew in advance what the value of state[x] would be when it needs it, then there's no reason why it couldn't be executed independently. And we are already keeping track of state changes to build the dependency graph. Why not propagate those instead?
In other words, for each transaction we can keep track of the state it writes and the final value of that state. For the example above, the resulting state diff would be:
{
  [i]: {
    [x]: 200
  }
}

With this information, we can execute all transactions in parallel using a simple algorithm for reading from the state:
def read(j, x):
    last = find_last_writer(j, x)  # last transaction before j that modified x
    if last:
        return state_diff[last][x]
    else:
        return state[x]

And that's it. That's what Block-Level Access Lists (BALs) do. Almost.
In reality, BALs include both the state changes and the list of reads. The previous example would be something like:
{
  [i]: {
    writes: {
      [x]: 200
    },
    reads: []
  },
  [j]: {
    writes: {},
    reads: [x]
  }
}

The reason for this is I/O. When we say "read state" we are talking about disk access, a relatively slow operation. If we know all the writes and reads in advance, then we know all the state that will be needed by all transactions, allowing us to prefetch the necessary data in one go before the start of the execution. This means we can execute all transactions in parallel and we don't need to perform any disk I/O while we do it.4
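Sketching the key-gathering step (assuming a BAL shaped like the toy dict above, which is my simplification), the prefetch boils down to one pass over the list:

```python
def keys_to_prefetch(bal):
    """Collect every state key any transaction touches, so all of it can
    be loaded from disk in a single pass before execution starts."""
    keys = set()
    for entry in bal.values():
        keys |= set(entry["writes"])  # keys whose final values are in the diff
        keys |= set(entry["reads"])   # keys read but never written
    return keys

bal = {
    "i": {"writes": {"x": 200}, "reads": []},
    "j": {"writes": {}, "reads": ["x"]},
}
print(keys_to_prefetch(bal))  # {'x'}
```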
The actual BAL format is way more complex than what I shared. It looks more like this:
BlockAccessList = List[AccountChanges]
AccountChanges = [
    Address,
    List[SlotChanges],
    List[StorageKey],
    List[BalanceChange],
    List[NonceChange],
    List[CodeChange]
]
SlotChanges = [StorageKey, List[StorageChange]]
# etc.

This is because, as we mentioned before, "state" can mean several things: storage slots, balances, nonces, code, etc. You can see the full definition of a BAL in the EIP.
While the speedup we get from parallel execution + I/O prefetch seems great, there are downsides. We now have to share both the block and its BAL. This increases propagation times, which could, in principle, hurt more than the performance wins from the BAL help. The argument is that this trade-off is worth it; see for example Worst-case analysis for BALs.5
There are several other aspects of Block-Level Access Lists that are important (or at least interesting) but that weren’t essential to the explanation above.
Outside the inner workings of the protocol, BALs don’t change things too much. As far as I know, the two main changes are in the block header and in the JSON-RPC layer.
The block header gets a new field, blockAccessListHash, with the hash of the BAL. This lets you check that the BAL of a given block is correct without having to re-execute the block, which can be useful in contexts like execution-less validation.
There is a new eth_getBlockAccessList method that can be used to get the BAL of a given block. BALs are not available forever though: clients are only required to keep them around for 3553 epochs (~2 weeks), after which they can be pruned. BAL hashes, being part of the block header, are of course kept forever.
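As a rough sketch, a request for a block's BAL might be built like this. The single block-number parameter is my assumption (mirroring eth_getBlockByNumber); the final parameter shape could differ.

```python
import json

def bal_request(block_number, request_id=1):
    """Build a JSON-RPC payload for eth_getBlockAccessList. Taking one
    hex-encoded block number is an assumption, not the final spec."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getBlockAccessList",
        "params": [hex(block_number)],
        "id": request_id,
    })

print(bal_request(19_000_000))
```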
During our discussion we have taken for granted that faster re-execution is good, so good that it's even worth using a headliner for it. But why is that the case? To be honest, this isn’t mentioned in the EIP. The only explanation I’ve seen is in the headliner proposal:
The community has expressed a clear desire: Ethereum L1 must scale to meet the needs of users and developers. BALs unlock performance gains critical for higher throughput and/or shorter slot times. They also pave the way for zkEVM-based light nodes (executionless + stateless), full nodes (executionless + stateful) and partially stateless execution.
Which seems fair enough.
The term “access list” already exists in Ethereum: transactions can include an optional access list6, which is a very different concept:
They work at the transaction level, not at the whole block level.
They are only about accesses and don’t include state diffs.
They don’t have to be complete. A transaction could read some state and not include it in its access list.
They don’t even have to be correct. A transaction’s access list can include items that are never actually accessed, whereas a block is rejected if its BAL is not exactly the one it should be.
And, as it says on the tin, they are optional. You can send transactions without access lists.
It’s unclear to me how transaction-level access lists will be affected once BALs are added. I’ve seen some people implying they will no longer make sense long-term, but I don’t fully understand why.
I find it very interesting to look at the earliest proposal7:

As you can see, the original idea only included reads and it was all about I/O prefetching. Writes were added later to enable parallel execution. In that sense, the way BALs developed is the opposite of how I explained them here. But since the “marketing” of the feature has focused a lot on parallel execution, I thought it made sense to start there. I also think it’s a better way to understand them.
This early iteration also explains why BALs have that name, which seems like a misnomer to me. It’s not 100% wrong, since a write is also an access, but it feels slightly off. In any case, it’s way too late to change it now.
Thanks for reading! In the next article, we’ll explore EIP-8024 (Backward compatible SWAPN, DUPN, EXCHANGE), another EIP that will (probably) be included in Glamsterdam, and the one that, fingers crossed, will kill Solidity’s infamous “stack too deep” error for good.
If you'd like to be notified when new articles are published, you can subscribe to the blog.
Headliners are the flagship features included in a network upgrade. To learn more about how they are decided, check this Ethereum Magicians post.
We could also execute tx 4 immediately after tx 2 finishes. For simplicity we are using execution batches here.
I’m playing loose with the terminology here. The block builder and the block proposer could be different entities, as is in fact usually the case. I’m using them as synonyms here because it simplifies the explanation.
I don’t know which of these benefits has the bigger impact. It doesn’t seem obvious to me that parallel execution results in a bigger speedup than I/O prefetch. Hopefully we’ll have actual numbers on this at some point.
The first time I was learning about BALs I was surprised no one mentioned state bloat. Surely that was another downside? But BALs don’t become a permanent part of the blockchain state and, as explained in the User-facing changes section, they can be pruned after ~2 weeks.
Optional access lists were introduced in EIP-2930 mainly to allow “unbricking” contracts that could become unusable after certain gas repricings included in the same upgrade. The explanation is complex, but see
Or at least the earliest linked document in the EIP website. Apparently, the idea had been toyed with before, for example in this 2021 post.
↩Franco Victorio