
Analysis of Ethereum2 Consensus Clients
Ethereum is moving towards a major upgrade that aims to make the network more sustainable, with the transition from Proof-of-Work (PoW) to Proof-of-Stake (PoS), and more scalable, with the introduction of data sharding. This process started with the deployment of the Beacon Chain in December 2020, and the next step, called the Merge, is expected to happen later this year. In this article we look at how far the Ethereum 2 ecosystem has progressed in this transition and how ready it is to move to th...

Validators or value-takers?
Diving into the pools and dark forests of PoS Ethereum. “The panda will never fulfill his destiny, nor you yours, until you let go of the illusion of control.” - Master Oogway. It is not often that fate provides us blockchain analysts with an event as pivotal and rich in data as the Ethereum merge. For this reason, we wasted no time merging (pun intended) minds from Metrika and Miga Labs to assemble a crack team of analysts and engineers ready to delve into this fount of data. Our int...

CL Client Rewards Analysis
When it comes to running a validator in the Ethereum ecosystem, especially after The Merge, it is important to measure its performance, as this directly impacts how many rewards it obtains. Therefore, we have analyzed the rewards validators obtain, in order to get some hints about their performance in the network. From a hardware perspective, running a validator in the Ethereum ecosystem nowadays requires two different clients. The execution layer (EL) client is in charge o...
Miga Labs is a research group specialized in next-generation Blockchain technology, focused on consensus protocols and p2p networks.




We are more than happy to finally release our Ethereum2 crawler! A tool that monitors and scans the Ethereum2 main network.
In this post we will explain the different challenges we have faced and the strategies we have developed while building this crawler.
During these months we have gained knowledge about the network and the performance of the peers participating in it. Our main goal is to share the gathered data with the community, so everyone can use it to understand the ecosystem better.
“A tool that monitors and scans the Ethereum2 main network.”
Our crawler is a tool that builds a peer status database by attempting to connect to all peers in the Ethereum2 mainnet, which is a peer-to-peer distributed network. Each peer is composed of a beacon node and, optionally, one or more validator nodes.

New peers are obtained from the Discovery v5 (discv5) service, a component that continuously discovers new peers in the network. After a peer is discovered, a connection is attempted and its metadata (e.g., which client software it runs) is requested.
Keep in mind that, as explained, Ethereum2 is operated on top of a distributed network, so there is no single source of truth where to get this data from. Instead, we sample the network by connecting to each of the peers in it.
This could mean that some segment of the network is not scanned, due to peers refusing our connections, firewall rules, or even peer discovery issues.
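The discover-connect-record loop described above can be sketched roughly as follows. This is a simplified illustration, not the crawler's actual code: `try_connect` and the in-memory peer store are hypothetical stand-ins for the real discv5 and libp2p machinery.

```python
def crawl_step(discovered_peers, peer_db, try_connect):
    """One crawl iteration: attempt a connection to each newly
    discovered peer and record either its metadata or the error."""
    for peer_id in discovered_peers:
        peer_db.setdefault(peer_id, {"metadata": None, "last_error": None})
        try:
            # try_connect dials the peer and requests its metadata
            # (e.g. the client software it advertises).
            peer_db[peer_id]["metadata"] = try_connect(peer_id)
            peer_db[peer_id]["last_error"] = None
        except ConnectionError as err:
            # Refused connections, firewalls, timeouts: the peer stays
            # in the database, but this sample of the network misses it.
            peer_db[peer_id]["last_error"] = str(err)
    return peer_db

# Simulated run: one reachable peer, one refusing connections.
def fake_connect(peer_id):
    if peer_id == "peer-a":
        return {"client": "Lighthouse"}
    raise ConnectionError("connection refused")

db = crawl_step(["peer-a", "peer-b"], {}, fake_connect)
```

Note how an unreachable peer is still kept in the database with its last error recorded: this is what lets us reason later about which network segments we may be missing.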
With this, our intention is to build a consolidated database which can serve others to analyze and extract useful information for the ecosystem. We understand this is a changing environment so we try our best to understand what we do, in order to adapt and offer the most accurate data possible.
Our goal is to show information about the Ethereum2 mainnet.
Within the Ethereum ecosystem there are several networks (the mainnet and a number of testnets). However, we only show metrics from the Ethereum2 mainnet, as this is the network in production and the one the whole ecosystem uses as its primary chain.
One of the most important metrics we would like to expose to the community is client diversity, the ability to measure which percentage of the network is covered by each client.
As of now, there are six main Ethereum consensus clients: Prysm, Lighthouse, Teku, Nimbus, Lodestar and Grandine. Each client has taken a different approach to implementing the Ethereum2 specifications.
The current distribution shows a strong dominance of the Prysm client:

To measure this, we attempt to connect to each of the discovered peers (each of which runs one client) and then request this information. A successful request is what we call identifying the client.
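Once peers are identified, the client diversity metric is just the share of identified peers per client. A minimal sketch (the peer-to-client sample below is illustrative, not our measured figures):

```python
from collections import Counter

def client_distribution(identified_peers):
    """Percentage of identified peers running each consensus client.

    identified_peers maps peer ID -> client name, as obtained from a
    successful metadata request."""
    counts = Counter(identified_peers.values())
    total = sum(counts.values())
    return {client: round(100 * n / total, 1) for client, n in counts.items()}

# Illustrative sample, not real measurements.
sample = {
    "peer-1": "Prysm", "peer-2": "Prysm", "peer-3": "Prysm",
    "peer-4": "Lighthouse", "peer-5": "Teku",
}
print(client_distribution(sample))
# {'Prysm': 60.0, 'Lighthouse': 20.0, 'Teku': 20.0}
```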
Our main goal is to get all the gathered data and share it with the community, so everyone is able to use it to understand the ecosystem better.
Until now, our main goal was to identify new peers and store them in our database. However, there is an important thing to take into account: software updates.
All six client teams regularly release software updates, which means a node needs to be stopped, updated and then restarted. When this happens, the peer's ID is often refreshed, as the private key is usually generated again.
Also, it could happen that a node is temporarily down for external reasons.
This means that, even though we have identified a peer, we need to keep track of both its status and its ID. This is why we have implemented a pruning strategy.
This strategy keeps connecting to the different peers, whether identified, not identified or newly discovered, at a certain frequency. However, if a peer goes a certain period of time (e.g., one day) without being identified, we no longer consider it in our dashboard metrics, as we cannot tell whether it is still participating in the network (it may be alive but not answering) or has gone offline.
We call this state “deprecated”: a peer is deprecated when it has gone more than 1024 minutes (about 17 hours) without being successfully identified.
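The deprecation rule is a simple time-window check on each peer's last successful identification. A sketch of how it could be applied:

```python
from datetime import datetime, timedelta

# A peer is "deprecated" after 1024 minutes (~17 hours) without
# a successful identification, and is then excluded from dashboard metrics.
DEPRECATION_WINDOW = timedelta(minutes=1024)

def is_deprecated(last_identified, now):
    return (now - last_identified) > DEPRECATION_WINDOW

now = datetime(2021, 10, 1, 20, 0)
assert is_deprecated(datetime(2021, 10, 1, 1, 0), now)       # 19 h ago -> deprecated
assert not is_deprecated(datetime(2021, 10, 1, 10, 0), now)  # 10 h ago -> still counted
```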
Currently, the percentage of deprecated peers is:

This means that, according to our crawler, 70% of discovered peers have not been successfully identified for more than 17 hours. These are most likely old peers that renewed their ID after a client software update; others have been down for a while, and some were never identified at all.
When attempting to connect to a peer, we have identified different error responses, which we have categorized as follows:

Each of the above errors gives us a hint about the target peer: some errors suggest the peer is no longer active, others that it is temporarily down, and others that a transient error occurred.
We have observed both behaviors: some peers always respond with the same error, while others return a different error on every connection attempt.
We have observed that “i/o timeout” accounts for around 40–50% of our peer database (roughly 6,000 of 14,000 peers). Currently, only 800 of those 6,000 have ever been identified.
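The error-to-category mapping described above could be sketched as a simple substring lookup on the dial error. The category names and error strings below are illustrative stand-ins for the ones in our actual table:

```python
# Hypothetical mapping from raw dial errors to categories; the exact
# strings and category names here are illustrative, not our real table.
ERROR_CATEGORIES = {
    "connection refused": "probably gone",    # peer likely no longer active
    "no route to host": "probably gone",
    "i/o timeout": "possibly down",           # peer may be temporarily down
    "connection reset by peer": "transient",  # one-off error, retry soon
}

def categorize(error_msg):
    msg = error_msg.lower()
    for pattern, category in ERROR_CATEGORIES.items():
        if pattern in msg:
            return category
    return "unknown"

assert categorize("dial tcp 1.2.3.4:9000: i/o timeout") == "possibly down"
assert categorize("connect: connection refused") == "probably gone"
```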
In order to build a consolidated client distribution metric, we continuously explore the Ethereum2 network looking for new peers (through the Discovery5 component). But, we also monitor already found peers, as explained before.
As a first approach, one could try to connect to all of them continuously. However, we realized this is not necessary: checking their status regularly is more than enough. This approach also avoids flooding the network unnecessarily.
So, we apply a different delay to each case based on the response. The main idea is to avoid overwhelming any peer with our requests while still monitoring it regularly. This way, each peer is only contacted again after a certain delay (its next connection time).
Therefore, for each error category we assign a different delay, depending on whether we think the peer could be alive or is probably down or malfunctioning.

We always connect to new peers first (the blue category) and, after that, place them in one of the other categories depending on the response.
Every time we connect to a peer, we set a delay representing the next time we will try to connect to it. After consecutive negative responses, the delay grows each time, as the probability of establishing a connection decreases (exponential backoff).
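The scheduling idea amounts to a per-category base delay that doubles with each consecutive failure. The base delays and cap below are made-up numbers for illustration, not the values the crawler actually uses:

```python
# Illustrative per-category base delays in minutes (not real config).
BASE_DELAYS_MIN = {
    "new": 0,              # new peers are connected to first
    "transient": 30,
    "possibly down": 180,
    "probably gone": 360,
}

def next_connection_delay(category, consecutive_errors):
    """Minutes to wait before the next attempt: the category's base
    delay, doubled per consecutive failure, capped at one day."""
    base = BASE_DELAYS_MIN.get(category, 60)
    return min(base * (2 ** consecutive_errors), 24 * 60)

assert next_connection_delay("possibly down", 0) == 180
assert next_connection_delay("possibly down", 2) == 720   # 180 * 2^2
assert next_connection_delay("probably gone", 5) == 1440  # hit the cap
```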
The average error category distribution is the following:

70% of discovered peers have gone more than 17 hours without being connected to or identified.
To deploy a local instance of our crawler, please refer to our GitHub!
“Our goal is to show information about the Ethereum2 mainnet.”
We have deployed two API endpoints:
Crawler List: This API shows our currently running crawlers.
Client Distribution: This API shows the client distribution detected by each of the running crawlers.
Please feel free to visit our website for more information about how to use them!
You are also welcome to use our Live Dashboard, where you will find useful information about the Ethereum2 mainnet. We currently expose statistics on geographical node distribution, discovered peers, client distribution, and more.
We are glad to release this tool to the community and we welcome everyone to test it themselves.
Beyond releasing a new version of the Armiarma tool, we have gathered a lot of knowledge about the performance and interaction of the different peers participating in the Ethereum 2 network.
Our philosophy is that the more information and the better understanding we have, the more the ecosystem will improve and grow.
With this post, we aim to describe our experience, the challenges we have faced and, overall, offer more data to the community about our research.
We would like to strongly thank the Ethereum Foundation (@ethereum) for their great collaboration and guidance in this first release, which has been an enormous help during this journey.
Also thanks to @alrevuelta for his amazing support and contribution to the project.
We hope this post is useful for you and thank you for your attention!
Follow us for more information and updates: