# Advanced SQL Query Optimization for Superchain Analysis in Tokenflow

By [Wolfcito](https://paragraph.com/@wolfcito-2) · 2024-10-16

---

### **Introduction**

Recently, I delved into the world of [**Optimism**](https://www.optimism.io/), specifically focusing on the [**Superchain**](https://www.superchain.eco/), and explored the advantages of tools like [**TokenFlow**](https://tokenflow.live/), which facilitate the visualization of metrics and complex analyses. Leveraging my knowledge in SQL, I decided to create small widgets and graphs that simplify the visualization of the **four pillars** of Superchain analysis by displaying transactions, active addresses, deployed contracts, and fees paid.

During this process, I compared one of my queries with others already in existence and found significant differences in performance. This led me to reflect on the importance of properly structuring a widget for TokenFlow from the start, ensuring optimization that better utilizes the platform’s resources. This would not only improve overall performance but also directly impact visualization times. Below, I describe the use cases, the results obtained, and the optimizations applied.

* * *

### **Explanation and Justification of the Queries**

While analyzing and constructing SQL queries for the Superchain blockchains, I implemented several optimizations to improve performance without compromising accuracy. Here I detail the key optimizations:

1.  **Selecting specific columns**:
    
    *   Instead of using `SELECT *`, I selected only the necessary columns for each pillar, such as `block_date`, `gas_used`, `effective_gas_price`, among others. This reduces the amount of data processed and lightens the system load.
        
2.  **Optimized date filtering**:
    
    *   I applied the filter `block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())` directly in each subquery. This way, each dataset is filtered before performing joins or combinations, avoiding unnecessary processing.
        
3.  **Using SUM with CASE**:
    
    *   To aggregate data by blockchain, I used `SUM(CASE WHEN chain = 'Zora' THEN daily_transactions ELSE 0 END)`. This allows aggregations efficiently within the same query without resorting to operations like `PIVOT`, improving performance.
        
4.  **Using UNION ALL instead of JOIN**:
    
    *   By using `UNION ALL` to combine data from different blockchains, we avoid duplicating rows and maintain the simplicity of the query. This is especially useful in environments with large data volumes where `JOIN` operations can be costly.
        

* * *

### **Query Optimization: Use Cases**

#### **Use Case 1: Daily Transactions**

**Unoptimized query**:The query combined data from all blockchains into a single set before counting daily transactions, which resulted in less efficient performance.

    WITH superchain_transactions AS (
        SELECT *, 'Zora' AS chain FROM zora.core.TRANSACTIONS
        UNION ALL
        SELECT *, 'OP Mainnet' AS chain FROM optimism.core.transactions
        UNION ALL 
        SELECT *, 'Mint' AS chain FROM mint.core.transactions
        UNION ALL
        SELECT *, 'Mode' AS chain FROM mode.core.transactions
    )
    
    SELECT * FROM (
        SELECT chain,
               block_date AS date,
               COUNT(*) AS "Daily Transactions"
        FROM superchain_transactions
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY chain, block_date
        ORDER BY block_date, chain
    ) 
    PIVOT (
        MAX("Daily Transactions") FOR chain IN ('Zora', 'OP Mainnet', 'Mint', 'Mode')
    ) AS p(date, zora, "OP Mainnet", mint, mode)
    ORDER BY date;
    

**Result**:

*   **94 rows** / **2.39 KB** in **5,879 ms**
    

**Optimized query**:In this version, each set of transactions is aggregated separately and then combined, avoiding unnecessary processing.

    WITH superchain_transactions AS (
        SELECT block_date, COUNT(*) AS daily_transactions, 'Zora' AS chain
        FROM zora.core.transactions
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(*) AS daily_transactions, 'OP Mainnet' AS chain
        FROM optimism.core.transactions
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(*) AS daily_transactions, 'Mint' AS chain
        FROM mint.core.transactions
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(*) AS daily_transactions, 'Mode' AS chain
        FROM mode.core.transactions
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    )
    
    SELECT 
        block_date AS date, 
        SUM(CASE WHEN chain = 'Zora' THEN daily_transactions ELSE 0 END) AS zora_transactions,
        SUM(CASE WHEN chain = 'OP Mainnet' THEN daily_transactions ELSE 0 END) AS op_mainnet_transactions,
        SUM(CASE WHEN chain = 'Mint' THEN daily_transactions ELSE 0 END) AS mint_transactions,
        SUM(CASE WHEN chain = 'Mode' THEN daily_transactions ELSE 0 END) AS mode_transactions
    FROM 
        superchain_transactions
    GROUP BY 
        block_date
    ORDER BY 
        block_date;
    

**Result**:

*   **94 rows** / **2.39 KB** in **1,038 ms**
    

**Recommended Chart**: Stacked bars

*   **X-axis**: `date`
    
*   **Y-axis**: `daily_transactions`
    
*   **Differentiation**: By blockchain (`Zora`, `OP Mainnet`, `Mint`, `Mode`)
    

![Daily Transactions](https://storage.googleapis.com/papyrus_images/7cfc0c24fbf5ef4e4b757621ed3ffc17d6c9027127d22a39b5a02336bebe5688.png)

Daily Transactions

[https://app.tokenflow.live/studio/editor/67060f50ed220e424ffd20ea](https://app.tokenflow.live/studio/editor/67060f50ed220e424ffd20ea)

#### **Use Case 2: Active Addresses**

**Unoptimized query**:

    WITH superchain_calls AS (
        SELECT *, 'Zora' AS chain FROM zora.core.calls
        UNION ALL 
        SELECT *, 'OP Mainnet' AS chain FROM optimism.core.calls
        UNION ALL
        SELECT *, 'Mint' AS chain FROM mint.core.calls
        UNION ALL
        SELECT *, 'Mode' AS chain FROM mode.core.calls
    )
    
    SELECT * FROM (
        SELECT chain,
               block_date AS date,
               COUNT(DISTINCT from_address) AS "Active Addresses"
        FROM superchain_calls
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY chain, block_date
    ) 
    PIVOT (
        MAX("Active Addresses") FOR chain IN ('Zora', 'OP Mainnet', 'Mint', 'Mode')
    ) AS p(date, zora, "OP Mainnet", mint, mode)
    ORDER BY date;
    

**Result**:

*   **94 rows** / **2.39 KB** in **21,207 ms**
    

**Optimized query**:

    WITH superchain_calls AS (
        SELECT block_date, COUNT(DISTINCT from_address) AS active_addresses, 'Zora' AS chain
        FROM zora.core.calls
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(DISTINCT from_address) AS active_addresses, 'OP Mainnet' AS chain
        FROM optimism.core.calls
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(DISTINCT from_address) AS active_addresses, 'Mint' AS chain
        FROM mint.core.calls
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(DISTINCT from_address) AS active_addresses, 'Mode' AS chain
        FROM mode.core.calls
        WHERE block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    )
    
    SELECT 
        block_date AS date, 
        SUM(CASE WHEN chain = 'Zora' THEN active_addresses ELSE 0 END) AS zora_active_addresses,
        SUM(CASE WHEN chain = 'OP Mainnet' THEN active_addresses ELSE 0 END) AS op_mainnet_active_addresses,
        SUM(CASE WHEN chain = 'Mint' THEN active_addresses ELSE 0 END) AS mint_active_addresses,
        SUM(CASE WHEN chain = 'Mode' THEN active_addresses ELSE 0 END) AS mode_active_addresses
    FROM 
        superchain_calls
    GROUP BY 
        block_date
    ORDER BY 
        block_date;
    

**Result**:

*   **94 rows** / **8.63 KB** in **11,231 ms**
    

**Recommended Chart**: Stacked bars

*   **X-axis**: `date`
    
*   **Y-axis**: `active_addresses`
    
*   **Differentiation**: By blockchain (`Zora`, `OP Mainnet`, `Mint`, `Mode`)
    

![Active addresses](https://storage.googleapis.com/papyrus_images/ebf8d678d7ea9314963ed693f709b89f0a0bbd95eec3e93db031e29147236b56.png)

Active addresses

[https://app.tokenflow.live/studio/editor/670df0cb6ae7711da7ef68ef](https://app.tokenflow.live/studio/editor/670df0cb6ae7711da7ef68ef)

* * *

### **Early Aggregation Approach**

In the optimized queries, I implemented the \*\*early

aggregation\*\* approach, performing aggregation operations such as counts and sums directly in the subqueries for each blockchain before combining the results. This contrasts with the unoptimized approach, where the data from all blockchains was first combined and then the aggregations were performed. This change offers several benefits:

1.  **Lower processing load**: By counting transactions or active addresses in each subquery, the size of the dataset processed afterward is reduced, minimizing memory and processing load.
    
2.  **Efficient use of indexes**: Filtering and aggregating at the source allows databases to make better use of their indexes, speeding up aggregation operations.
    
3.  **Reduction in combined data size**: Instead of combining large volumes of raw data, the optimized queries combine only the aggregated results, significantly reducing the size of the final dataset.
    

### **General Benefits of the Applied Optimizations**

The optimizations not only improved the performance of each query but also made the code cleaner, scalable, and easier to understand. Key benefits include:

1.  **Scalability**: The queries are designed to be scalable, grouping and filtering only the necessary data, which improves performance in large datasets.
    
2.  **Clarity in visualization**: The columns differentiated by blockchain make it easy to generate stacked charts and comparative visualizations of metrics such as transactions, active addresses, deployed contracts, and fees paid.
    
3.  **Reduction in execution time**: As observed in the use cases, execution time was drastically reduced in the optimized queries, enabling more complex analyses in less time and freeing up resources for other processes.
    

* * *

### **Use Case 3: Optimization Exception**

In this case, I analyzed the **deployed contracts** in the various Superchain blockchains. When applying the same optimizations, the improvement in performance was not as significant, making it an interesting exception.

#### **Unoptimized query**:

    WITH superchain_calls AS (
        SELECT *, 'Zora' AS chain FROM zora.core.calls WHERE call_type IN ('CREATE', 'CREATE2')
        UNION ALL
        SELECT *, 'OP Mainnet' AS chain FROM optimism.core.calls WHERE call_type IN ('CREATE', 'CREATE2')
        UNION ALL
        SELECT *, 'Mint' AS chain FROM mint.core.calls WHERE call_type IN ('CREATE', 'CREATE2')
        UNION ALL
        SELECT *, 'Mode' AS chain FROM mode.core.calls WHERE call_type IN ('CREATE', 'CREATE2')
    )
    
    SELECT * FROM (
        SELECT chain,
               block_date AS date,
               COUNT(*) AS "Contracts Deployed"
        FROM superchain_calls
        WHERE date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY chain, block_date
    ) 
    PIVOT (
        MAX("Contracts Deployed") FOR chain IN ('Zora', 'OP Mainnet', 'Mint', 'Mode')
    ) AS p(date, zora, "OP Mainnet", mint, mode)
    ORDER BY date;
    

**Result**:

*   **94 rows** / **2.39 KB** in **30,429 ms**
    

#### **Optimized query**:

    WITH superchain_calls AS (
        SELECT block_date, COUNT(*) AS contracts_deployed, 'Zora' AS chain
        FROM zora.core.calls
        WHERE call_type IN ('CREATE', 'CREATE2')
        AND block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(*) AS contracts_deployed, 'OP Mainnet' AS chain
        FROM optimism.core.calls
        WHERE call_type IN ('CREATE', 'CREATE2')
        AND block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(*) AS contracts_deployed, 'Mint' AS chain
        FROM mint.core.calls
        WHERE call_type IN ('CREATE', 'CREATE2')
        AND block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    
        UNION ALL
    
        SELECT block_date, COUNT(*) AS contracts_deployed, 'Mode' AS chain
        FROM mode.core.calls
        WHERE call_type IN ('CREATE', 'CREATE2')
        AND block_date >= DATEADD('month', -{{n_months}}, CURRENT_DATE())
        GROUP BY block_date
    )
    
    SELECT 
        block_date AS date, 
        SUM(CASE WHEN chain = 'Zora' THEN contracts_deployed ELSE 0 END) AS zora_contracts_deployed,
        SUM(CASE WHEN chain = 'OP Mainnet' THEN contracts_deployed ELSE 0 END) AS op_mainnet_contracts_deployed,
        SUM(CASE WHEN chain = 'Mint' THEN contracts_deployed ELSE 0 END) AS mint_contracts_deployed,
        SUM(CASE WHEN chain = 'Mode' THEN contracts_deployed ELSE 0 END) AS mode_contracts_deployed
    FROM 
        superchain_calls
    GROUP BY 
        block_date
    ORDER BY 
        block_date;
    

**Result**:

*   **94 rows** / **9.36 KB** in **29,781 ms**
    

![Smart Contracts Deployed](https://storage.googleapis.com/papyrus_images/48abe78c973001a727f7bc351609341fc2db17f9feb13e3d69d31179f7a504a7.png)

Smart Contracts Deployed

[https://app.tokenflow.live/studio/editor/670dec7c6ae7711da7ef68ee](https://app.tokenflow.live/studio/editor/670dec7c6ae7711da7ef68ee)

#### **Exception Analysis**

Despite applying the same optimization techniques, the improvement was minimal (from **30,429 ms** to **29,781 ms**). Possible reasons:

1.  **Data size**: The volume of data was significantly smaller, so the optimization did not yield a noticeable improvement in execution time.
    
2.  **Calculation complexity**: The counting of deployed contracts (`COUNT(*)`) was already handled efficiently, and avoiding the use of `PIVOT` did not provide significant improvement.
    
3.  **Processing transaction types**: The filter by `call_type IN ('CREATE', 'CREATE2')` is straightforward and doesn’t introduce much overhead, limiting the optimization potential.
    

* * *

### **Final Analysis Conclusion**

In this article, I explored the optimizations applied to SQL queries for analyzing key metrics in Superchain blockchains, focusing on **daily transactions**, **active addresses**, and **deployed contracts**. In most cases, the **early aggregation** approach and other techniques significantly optimized performance.

However, the analysis of **deployed contracts** presented an exception, reminding us that optimizations won’t always have the same impact across all contexts and types of queries.

### **Summary of Pillars and Results**

*   **Daily Transactions**: Execution time reduced from **5,879 ms** to **1,038 ms** (84% improvement).
    
*   **Active Addresses**: Execution time reduced from **21,207 ms** to **11,231 ms** (47% improvement).
    
*   **Deployed Contracts** (Exception): Minimal reduction from **30,429 ms** to **29,781 ms** (2% improvement).
    

### **Summary of Applied Optimizations**

1.  **Selecting specific columns**.
    
2.  **Optimized date filtering**.
    
3.  **Using** `SUM` with `CASE` to avoid `PIVOT`.
    
4.  **Efficient data combination with** `UNION ALL`.
    

### **Overall Conclusion**

The applied optimizations, such as **early aggregation** and careful column selection, proved effective in most cases. These improvements not only reduce execution time but also make the queries more readable and maintainable. While there will always be cases where the impact is smaller, this analysis shows how small adjustments can lead to significant performance improvements in environments with large datasets like blockchain analysis.

---

*Originally published on [Wolfcito](https://paragraph.com/@wolfcito-2/advanced-sql-query-optimization-for-superchain-analysis-in-tokenflow)*
