Data Aggregation and its role in blockchain oracles

This issue is dedicated to data aggregation and its role in blockchain oracles. It is not an easy read, so if you are not an experienced miner, I advise you to read the previous issues first.

1 Data

Data is recorded information; a representation of facts, concepts, or instructions in a form suitable for communication, interpretation, or processing by humans or automated means.

When talking about blockchain oracles, by data we mean prices of various assets, tokens, and coins delivered from the outside world directly to the blockchain.

Let's look at the data received by blockchain users using the RedStone Oracle as an example.

The price feeds provided to RedStone clients come from a variety of sources. These include exchanges such as Binance and Coinbase, decentralized exchanges (DEXs) such as Uniswap and Sushiswap, and price aggregators such as CoinMarketCap and CoinGecko.

RedStone currently has over 150 integrated sources. The data is aggregated by independent nodes operated by data providers using a variety of methodologies.

These methods include median, TWAP (time-weighted average price), and LWAP (liquidity-weighted average price). They are designed to produce the most accurate price based on factors such as the amount of liquidity available and the average price over a given time frame.

In addition, RedStone has implemented data quality measures such as unexpected value detection (outlier detection) to ensure the correctness of the data.

2 Data aggregation

Data aggregation is the process of collecting different values (usually from different sources) and combining them (usually into a single value). A simple example is collecting ETH/USD price data from multiple exchanges and calculating an average.

Data aggregation is one of the main ways to improve the quality of oracle services. The quality of the data provided by an oracle service depends on two main criteria:

Data availability - oracle data must always be available to end users (or smart contracts) and must be updated at the promised frequency.
Data correctness - this can be defined in different ways and usually depends on the type of data. For example, the correctness of objective data (e.g. the result of a particular football match) can be easily verified, but for less objective data (e.g. the price of ETH expressed in US dollars) correctness can be much more difficult to determine.

3 Methods of Data Aggregation

3.1 Average price

The first aggregation algorithm that comes to mind is the average. It is very simple and may look quite “fair”, but in reality it has a significant flaw, as it is not robust to manipulation by even a small subset of corrupt sources.

For example, let’s say you want to get the ETH/USD value from 5 different exchanges, where 4 of them claim that the current price is around $2000, but one of them insists that it is only $1. Then the average is ~$1600, which is too far off to be considered correct.
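The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the list of quotes is made up to match the example and does not come from real exchanges.

```python
from statistics import mean

# Hypothetical quotes: four honest sources at $2000 and one corrupt source at $1.
eth_usd_quotes = [2000.0, 2000.0, 2000.0, 2000.0, 1.0]

# The plain average gives every source the same influence,
# so a single bad value drags the result far from the honest price.
average_price = mean(eth_usd_quotes)
print(round(average_price, 2))  # 1600.2
```

One corrupt source out of five is enough to move the result by about $400.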

3.2 Median price value

There is another approach that uses the median value calculation. It is much better than the average and is definitely more resistant to manipulation by corrupt sources. However, even this method is not a perfect way to calculate the value.

For example, suppose you take the same ETH/USD value from one large crypto exchange (daily ETH/USD trading volume of $100 million) and four small ones (daily ETH/USD trading volume of ~$10k each). The large exchange reports $2000, and all the small ones report less than $1900. The median is then below $1900, because every source counts equally regardless of its volume, even though the "real" market value is almost certainly much closer to $2000.
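A short sketch of this scenario, with made-up source names and prices for the four small exchanges (the text only says they are below $1900):

```python
from statistics import median

# Hypothetical quotes: (source, price_usd, daily_volume_usd).
median_quotes = [
    ("large_exchange", 2000.0, 100_000_000),
    ("small_a", 1890.0, 10_000),
    ("small_b", 1885.0, 10_000),
    ("small_c", 1880.0, 10_000),
    ("small_d", 1895.0, 10_000),
]

# The median looks only at the prices; trading volume is ignored entirely.
median_price = median(price for _, price, _ in median_quotes)
print(median_price)  # 1890.0
```

The large exchange handles almost all of the volume, yet it has no more say in the result than any of the small ones.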

3.3 Volume Weighted Average Price (VWAP)

The next method, and one of the best, is the volume-weighted average price. As the name suggests, it determines the price from trades and takes into account the different trading volumes of different sources: each source's price is multiplied by its trading volume, and the sum of these products is divided by the total volume. The higher the trading volume of a source, the greater the weight of its price value.
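A minimal sketch of the calculation, assuming each source reports a single price and a daily volume. The function name vwap and the sample quotes are hypothetical, and this is not necessarily how RedStone nodes compute it.

```python
def vwap(quotes):
    """Volume-weighted average price.

    quotes: iterable of (price_usd, volume_usd) pairs.
    """
    quotes = list(quotes)
    total_volume = sum(volume for _, volume in quotes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    # Each price contributes in proportion to the volume traded at its source.
    return sum(price * volume for price, volume in quotes) / total_volume


# Hypothetical quotes: one large exchange and four small ones.
vwap_quotes = [
    (2000.0, 100_000_000),
    (1890.0, 10_000),
    (1885.0, 10_000),
    (1880.0, 10_000),
    (1895.0, 10_000),
]

vwap_price = vwap(vwap_quotes)
print(round(vwap_price, 2))  # roughly 1999.96
```

With these numbers the result stays within a few cents of the large exchange's $2000, because the small exchanges account for only a tiny share of the total volume.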

3.4 Time-Weighted Average Price (TWAP)

Another common price aggregation method is the time-weighted average price, in which each observed price is weighted by how long it was in effect within a given time window. This is especially useful for calculating price values based only on decentralized exchanges. Many DEXs even offer their own TWAP-based oracle solutions (example: Uniswap TWAP oracle).

But beyond DEX-based oracles, this method can be used to make market manipulation more difficult when there are limited data sources. RedStone uses TWAP to make price data for low-liquidity assets more stable and reliable.
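The idea can be sketched as follows, assuming that each observed price stays in effect until the next observation. The function name twap, the timestamps, and the prices are hypothetical; real implementations such as the Uniswap oracles or RedStone nodes may differ in the details.

```python
def twap(observations, end_time):
    """Time-weighted average price over [first timestamp, end_time].

    observations: list of (timestamp_seconds, price) sorted by timestamp.
    Each price is assumed to hold until the next observation.
    """
    if not observations:
        raise ValueError("need at least one observation")
    start_time = observations[0][0]
    if end_time <= start_time:
        raise ValueError("end_time must be after the first observation")
    # Pair every observation with the time at which the next one replaces it.
    next_times = [t for t, _ in observations[1:]] + [end_time]
    weighted_sum = 0.0
    for (t, price), t_next in zip(observations, next_times):
        weighted_sum += price * (t_next - t)
    return weighted_sum / (end_time - start_time)


# Hypothetical 5-minute window: the price is briefly pushed to $1000 for 10 seconds.
twap_observations = [(0, 2000.0), (170, 1000.0), (180, 2000.0)]
twap_price = twap(twap_observations, end_time=300)
print(round(twap_price, 2))  # 1966.67
```

A manipulated price that lasts 10 seconds out of 300 moves the result by only about $33, which illustrates why a time window makes short-lived manipulation more expensive.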

4 The Role of Data Aggregation in the Operation of Blockchain Oracles

As many of you have probably already realized, the role of data aggregation is incredibly important.

The ideal price value depends on the requested order (amount, buy/sell type) and must take into account the order books of each available exchange with all the associated fees. This is quite difficult to calculate. Fortunately, a price value does not have to be perfect to be good enough, and some combination of the aggregation algorithms described above works well for most use cases in the DeFi space.

Ultimately, we can say that data aggregation plays a key role in the operation of blockchain oracles, namely in the following aspects:

Increasing data reliability.
Ensuring decentralization.
Reducing volatility and eliminating anomalies.
Optimizing performance.
Creating composite data.
Supporting specialized solutions.

Website: http://redstone.finance

Blog: http://blog.redstone.finance

Twitter: https://x.com/redstone_defi

Discord: https://discord.com/invite/redstonedefi

Docs: https://docs.redstone.finance

Won Chong