It’s evident that the AI and Crypto communities are almost entirely separate ecosystems. Haochen, a peer of Vitalik’s and the creator of zkLLM, released an open letter to the Web3 community (https://cs.uwaterloo.ca/~h299sun/publication/ccs2024-zkllm/web3.html) after open-sourcing the zkLLM implementation, in which he emphasized:
I am not a member of the Web3/Blockchain community, nor am I familiar with it. Please do not expect me to behave like a member of your community.
This divergence has resulted in significant information asymmetry between the Crypto and AI communities. On one hand, AI frameworks like LangChain have become so mature—arguably to the point of being overly complex—that numerous frameworks claiming to be “Simple RAG” have emerged. On the other hand, the Crypto community has produced a multitude of redundant AI Agent frameworks, such as Eliza and Swarms, which have gained relatively high valuations within the community (compared to LangChain). Thus, we observe substantial value asymmetry and opacity here.
The primary goal of this article is to outline what I believe a Crypto-Friendly or even Crypto-Native AI Agent Framework should look like.
What Is an AI Agent?
1. Base LLM Model
Examples include GPT-4o, locally hosted Llama 3.2, or uncensored Llama 3.2. The base model can essentially be seen as a massive dataset.
2. Loader/Runtime for the Base Model
We now see various loaders, such as WASM-based loaders that allow models to run directly in the browser with GPU support, or HVM-based loaders that are compatible with multiple backends like CUDA and WebGPU. Most loaders, however, remain traditionally CUDA- or OpenCL-based.
3. Checkpointer
This serves as the memory system for conversation context. When workflows involving large models become sufficiently complex, we need a checkpointer to store intermediate states. This can be simplified as a database, such as Postgres or Redis.
4. RAG (Retrieval-Augmented Generation)
RAG essentially functions as a database, but the difference is that it vectorizes data using an embedding model (e.g., the popular Snowflake Arctic Embed) during storage and uses this as an index. Queries are also processed based on vectorized input text for data matching.
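The store-time-embed-then-match flow can be sketched in a few lines. The bag-of-words `toy_embed` below is a deliberately crude stand-in for a real embedding model such as Snowflake Arctic Embed, and `ToyRAG` is a hypothetical name; the point is only the shape of the pipeline: vectorize on write, vectorize the query, rank by cosine similarity.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    # Stand-in for a real embedding model (e.g. Snowflake Arctic
    # Embed): here just a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyRAG:
    def __init__(self):
        self.index = []  # (embedding, original text)

    def add(self, doc: str) -> None:
        # Vectorize at storage time and keep the vector as the index.
        self.index.append((toy_embed(doc), doc))

    def query(self, text: str, k: int = 1) -> list[str]:
        q = toy_embed(text)
        ranked = sorted(self.index, key=lambda e: cosine(e[0], q), reverse=True)
        return [doc for _, doc in ranked[:k]]


rag = ToyRAG()
rag.add("ethereum smart contracts run on the evm")
rag.add("llama models can run in the browser via wasm")
print(rag.query("how do wasm models run"))
```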
5. Vector Database
A vector database is essential for efficient RAG. The challenge lies in the significant CPU or GPU resources required for approximate vector matching. However, some methodologies suggest that a sufficiently large memory system could eliminate the need for a specialized database.
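The "large memory instead of a specialized database" idea amounts to keeping every embedding resident in RAM and doing an exact scan. A sketch with synthetic data (all sizes and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# 10k embeddings held entirely in RAM: at this scale a single
# dense matrix product yields exact nearest neighbours with no
# approximate (ANN) index at all.
dim = 64
corpus = rng.standard_normal((10_000, dim)).astype(np.float32)
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)


def exact_search(query: np.ndarray, k: int = 5) -> np.ndarray:
    q = query / np.linalg.norm(query)
    scores = corpus @ q             # one mat-vec: the whole "database scan"
    return np.argsort(-scores)[:k]  # indices of the top-k matches


hits = exact_search(corpus[42])
print(hits[0])  # a query vector is its own best match -> 42
```

This is exactly the trade-off the paragraph describes: brute force is trivially correct but the mat-vec cost grows linearly with corpus size, which is why large collections push you toward approximate indexes and dedicated vector databases.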
6. Tool Calling
This is the simplest part and currently the focus of many large model frameworks in the Crypto community. Essentially, it enables the LLM to call external functions, thereby interacting with social networks or blockchains.
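A minimal sketch of the dispatch side of tool calling, assuming the model emits its call as JSON (the format is modelled loosely on OpenAI-style function calling; `get_balance` and the registry are hypothetical):

```python
import json

# Registry of callables the model is allowed to invoke.
TOOLS = {}


def tool(fn):
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_balance(address: str) -> str:
    # Hypothetical chain lookup; a real tool would query an RPC node.
    return f"balance of {address}: 1.5 ETH"


def dispatch(model_output: str) -> str:
    """Execute a tool call the LLM emitted as JSON."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])


# What the LLM might emit when asked "what's the balance of 0xabc?"
print(dispatch('{"name": "get_balance", "arguments": {"address": "0xabc"}}'))
```

As the paragraph says, this is the simplest layer: all the framework really does is maintain the registry and route structured output to a function.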
How Can It Be Decentralized?
After identifying the six components of an AI Agent, we can easily pinpoint areas that could be decentralized or made verifiable:
1. Decentralization of Base LLM Models
AO has made some progress here by hosting Llama 2-8b and a WASM loader on Arweave, allowing users to directly access them.
2. Decentralization of the Loader/Runtime
While AO has contributed in this area, significant issues remain, with the biggest being the lack of verifiability. This is where zkLLM or zkWASM may come into play:
• zkLLM can generate layered proofs for LLM computation (currently taking ~15 minutes).
• zkWASM fully supports zero-knowledge computation for WASM instruction sets.
zkLLM’s potential optimizations include:
• Using the Fiat-Shamir Heuristic to convert interactive proofs into NIZK (non-interactive zero-knowledge proofs).
• zkLLM computes and proves each layer independently, leaving significant room for decentralization optimizations.
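The Fiat-Shamir step in the first bullet can be illustrated with a toy Schnorr-style proof of knowledge: the verifier's random challenge is replaced by a hash of the transcript, which is what makes the proof non-interactive. The group parameters below are tiny and deliberately insecure; this only demonstrates the transformation, not zkLLM's actual proof system.

```python
import hashlib

# Toy group: G = 4 generates the subgroup of order Q = 509 in Z_P^*.
# These parameters are far too small to be secure.
P = 1019
Q = 509
G = 4


def fs_challenge(*parts: int) -> int:
    # Fiat-Shamir: derive the challenge by hashing the transcript
    # instead of receiving it from an interactive verifier.
    h = hashlib.sha256("|".join(map(str, parts)).encode()).digest()
    return int.from_bytes(h, "big") % Q


def prove(x: int, k: int):
    """Prove knowledge of x such that y = g^x, using nonce k."""
    y = pow(G, x, P)
    r = pow(G, k, P)        # commitment
    c = fs_challenge(y, r)  # hash replaces the verifier's challenge
    s = (k + c * x) % Q     # response
    return y, r, s


def verify(y: int, r: int, s: int) -> bool:
    c = fs_challenge(y, r)
    # Check g^s == r * y^c, i.e. g^(k + c*x) == g^k * (g^x)^c.
    return pow(G, s, P) == (r * pow(y, c, P)) % P


y, r, s = prove(x=123, k=77)
print(verify(y, r, s))  # True
```

Because the challenge is now deterministic from the transcript, anyone can check the proof offline, which is the property a decentralized verifier network needs.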
3. Decentralized Checkpointers
Essentially, this boils down to decentralized databases. There are already excellent implementations in the community, such as Filecoin, Arweave, or pure IPFS.
4. Decentralized RAG and Vector Databases
Currently, no usable solutions exist in the Crypto community. One project claims to have an implementation, but when I tested it a few days ago it was non-functional. It appears they merely integrated some third-party APIs, and I could not identify any effective approach to the vector computation problem in their design.
5. Tool Calling
This is currently the most trending aspect in the Crypto community. While third-party tools undoubtedly improve framework usability, the moat here is quite narrow. Migrating a framework (e.g., Swarms) to integrate with another tool (e.g., Eliza) typically requires fewer than 100 lines of code.
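To make the "fewer than 100 lines" claim tangible, here is what such a migration typically reduces to. Both interfaces below are hypothetical stand-ins (neither is Swarms' or Eliza's real API): one framework expects a tool object with `.name` and `.run(args)`, the other registers plain keyword-argument functions, and the entire bridge is a thin adapter.

```python
class FrameworkATool:
    """Hypothetical tool shape 'framework A' expects:
    an object with .name and .run(args: dict)."""

    def __init__(self, name, run):
        self.name = name
        self.run = run


def adapt_b_to_a(fn) -> FrameworkATool:
    # Wrap a 'framework B'-style plain function (keyword arguments)
    # so framework A can invoke it through its own interface.
    return FrameworkATool(name=fn.__name__, run=lambda args: fn(**args))


# A framework-B style tool:
def post_message(channel: str, text: str) -> str:
    return f"[{channel}] {text}"


tool = adapt_b_to_a(post_message)
print(tool.name, "->", tool.run({"channel": "general", "text": "gm"}))
```

Since the whole integration is a shim like this, third-party tool catalogs offer usability but little defensibility, which is the narrow-moat point above.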
Based on the above, our expectations for a Crypto-Native AI Agent Framework include:
1. The LLM model itself should support decentralized hosting and WASM-based loading.
2. The LLM model’s input and output should be verifiable using techniques such as LLM watermarking or zkLLM (an ecosystem and implementations are currently lacking).
3. Checkpointers should adopt decentralized database solutions wherever possible.
4. We eagerly await the emergence of a truly usable decentralized vector database.