This paper was written in collaboration with Shlok Khemani from Decentralised.
While the euphoria around this sector has died down in recent weeks, the impact of AI on technology and society is unlike anything we've seen since the internet. If crypto is to be the financial rails of the future, as we predict it will be, then its intertwining with AI is going to be a recurring theme, rather than a one-off.
One of the more interesting categories of projects that emerged from this wave was crypto-native AI agent frameworks. They are a fascinating experiment in bringing blockchain's core tenets—permissionless value transfer, transparency, and aligned incentives—to AI development. Their open-source nature gives us a rare chance to peek under the hood and analyze not just what they promise, but how they actually work.
In this piece, we start by unpacking what agent frameworks actually are and why they matter. We then tackle the elephant in the room: why do we need crypto-native frameworks when established options like LangChain exist? To this end, we analyze the leading crypto-native frameworks and their strengths and limitations across different use cases. Finally, if you're building an AI agent, we'll help you decide which framework might suit your needs, or whether you should be building with frameworks at all.
Let’s dive in.
"Civilization advances by extending the number of important operations which we can perform without thinking about them." - Alfred North Whitehead
Think about how our ancestors lived. Every family had to grow their own food, make their own clothes, and build their own shelter. They spent endless hours on basic survival tasks, leaving little time for anything else. Even two centuries ago, nearly 90% of humans worked in agriculture. Today, we get our food from supermarkets, live in homes built by specialists, and wear clothes manufactured in distant factories. Tasks that once consumed generations of human effort have become simple transactions. Now, only 27% of the global population works in agriculture (dropping to under 5% in developed nations).
When we start mastering a new technology, familiar patterns emerge. We begin by understanding the fundamentals—what works, what doesn't, and which patterns keep appearing. Once these patterns become clear, we package them into abstractions that prove easier, faster, and more reliable. These abstractions free up time and resources to tackle more diverse and meaningful challenges. The same can be said about building software.
Take web development, for instance. In its early days, developers wrote everything from scratch—handling HTTP requests, managing state, and creating UIs—tasks that were both complex and time-intensive. Then frameworks like React emerged, dramatically simplifying these challenges by providing useful abstractions. Mobile development followed a similar path: developers initially needed deep, platform-specific knowledge until tools like React Native and Flutter arrived, letting them write code once and deploy it everywhere.
The same pattern of increasing abstraction is playing out in machine learning. In the early 2000s, researchers spotted the potential of GPUs for ML workloads. At first, developers had to wrestle with graphics primitives and languages like OpenGL's GLSL—tools that weren't built for general computation. Everything changed in 2006 when NVIDIA launched CUDA, making GPU programming more approachable and bringing ML training to a wider developer base.
As ML development gained momentum, specialized frameworks emerged to abstract the complexities of GPU programming. TensorFlow and PyTorch let developers focus on model architecture rather than getting bogged down in low-level GPU code or implementation details. This led to faster iterations in model architectures and the rapid progress we’ve seen in AI/ML over the past few years.
We're now seeing a similar evolution unfold with AI agents—software programs that can make decisions and take actions to achieve a goal, much like a human assistant or an employee would. An agent uses a large language model as its "brain" and can tap into different tools, like searching the web, making API calls, or accessing databases, to get things done.
To build an agent from scratch, developers would have to write complex code to handle every aspect: how the agent thinks through problems, how it decides what tool to use and when, how to interface with these tools, how it remembers context from earlier interactions, and how it breaks down big tasks into manageable steps. Each of these patterns would have to be figured out individually, leading to duplicated effort and inconsistent results.
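To make this concrete, here is a minimal sketch of the loop a from-scratch agent needs. Everything here is illustrative: `callLLM` stands in for a real model call, and the tools are stubs.

```typescript
// A from-scratch agent loop: decide, act, observe, remember, repeat.
type Tool = { name: string; run: (input: string) => Promise<string> };

const tools: Tool[] = [
  { name: "web_search", run: async (q) => `results for "${q}"` }, // stub
  { name: "database", run: async (q) => `rows matching "${q}"` }, // stub
];

async function callLLM(prompt: string): Promise<string> {
  // Stand-in: replace with a real model call (OpenAI, Llama, etc.).
  return `ANSWER summary based on ${prompt.length} chars of context`;
}

async function runAgent(goal: string, maxSteps = 5): Promise<string> {
  let history = `Goal: ${goal}`;
  for (let step = 0; step < maxSteps; step++) {
    // Each turn, the model decides: call a tool, or answer directly.
    const reply = await callLLM(
      `${history}\nTools: ${tools.map((t) => t.name).join(", ")}\n` +
        `Respond with "TOOL <name> <input>" or "ANSWER <text>".`
    );
    if (reply.startsWith("ANSWER")) return reply.slice("ANSWER".length).trim();
    const [, name, ...rest] = reply.split(" ");
    const tool = tools.find((t) => t.name === name);
    const observation = tool ? await tool.run(rest.join(" ")) : "unknown tool";
    history += `\n${reply}\nObservation: ${observation}`; // working memory
  }
  return "Gave up after too many steps.";
}

runAgent("Summarize today's ETH funding rates").then(console.log);
```

Every piece of this—the decision prompt, the tool routing, the accumulated history—is boilerplate a developer would otherwise rewrite for each new agent.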
That's where AI agent frameworks come in. Just as React simplified web development by handling the tricky bits of UI updates and state management, these frameworks tackle the common challenges in building AI agents. They provide ready-made components for the patterns we've discovered work well, like how to structure an agent's decision-making process, integrate different tools, and maintain context across multiple interactions.
With a framework, developers can focus on what makes their agent unique—its specific capabilities and use cases—rather than rebuilding these fundamental pieces themselves. They can create sophisticated AI agents in days or weeks instead of months, experiment with different approaches more easily, and build on best practices discovered by other developers and communities.
To understand the importance of frameworks better, consider a developer building an agent that helps doctors review medical reports. Without a framework, they'd need to write code from scratch for everything: handling email attachments, extracting text from PDFs, feeding that text to an LLM in the right format, managing conversation history to track what's been discussed, and ensuring the agent responds appropriately. That's a lot of complex code for tasks that aren't unique to their specific use case.
With an agent framework, many of these building blocks come ready to use. The framework handles reading emails and PDFs, provides patterns for structuring the medical knowledge prompt, manages the conversation flow, and even helps track important details across multiple exchanges. The developer can focus on what makes their agent special, like fine-tuning the medical analysis prompts or adding specific safety checks for diagnoses, rather than reinventing common patterns. What might have taken months to build from scratch can now be prototyped in days.
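By contrast, here is roughly what the same agent might look like on top of a framework. The `createAgent` API below is hypothetical, invented for illustration; the point is how much surface area the framework absorbs.

```typescript
// Hypothetical framework-style agent definition. None of these names belong
// to a real library; compare the footprint with the from-scratch loop above.
interface AgentConfig {
  systemPrompt: string;
  tools: string[];                         // framework-provided integrations
  memory: "conversation";                  // framework-managed history
  safetyCheck?: (draft: string) => string; // custom, use-case-specific logic
}

function createAgent(config: AgentConfig) {
  return {
    async handle(message: string): Promise<string> {
      // A real framework would route attachments to its email/PDF tools,
      // inject the extracted text into the prompt, and persist the exchange.
      let draft = `[reply to: ${message}]`;
      if (config.safetyCheck) draft = config.safetyCheck(draft);
      return draft;
    },
  };
}

const reportReviewer = createAgent({
  systemPrompt: "You help doctors review medical reports. Cite source pages.",
  tools: ["email_inbox", "pdf_text_extraction"], // ready-made building blocks
  memory: "conversation",
  safetyCheck: (draft) => draft + "\n(For clinician review; not medical advice.)",
});

reportReviewer.handle("Summarize the attached blood panel.").then(console.log);
```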
LangChain has emerged as the Swiss Army knife of AI development, offering a flexible toolkit for building LLM-based applications. While not strictly an agent framework, it provides the essential building blocks that most agent frameworks are built upon, from chains for sequencing LLM calls to memory systems for maintaining context. Its broad ecosystem of integrations and extensive documentation have made it the go-to starting point for developers looking to build practical AI applications.
Then there are multi-agent frameworks like CrewAI and AutoGen, which enable developers to build systems where multiple AI agents collaborate, each with its unique role and capabilities. Rather than merely executing tasks in sequence, these frameworks emphasize agent collaboration through dialogue to solve problems collectively.
For example, when assigned a research report, one agent might outline its structure, another might gather pertinent information, and a third might critique and refine the final draft. It’s akin to assembling a virtual team where AI agents can discuss, debate, and collectively improve solutions. Multi-agent systems working together in this manner to achieve a high-level goal are often referred to as “swarms” of AI agents.
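A minimal sketch of such a swarm, with a stubbed `callLLM` in place of real model calls and the outline-research-draft-critique roles described above hard-coded:

```typescript
// Research-report swarm: roles collaborate through a shared transcript.
async function callLLM(role: string, prompt: string): Promise<string> {
  return `[${role}]: output for "${prompt.slice(0, 40)}..."`; // stub
}

async function swarm(task: string): Promise<string> {
  const outline = await callLLM("planner", `Outline a report on: ${task}`);
  const research = await callLLM("researcher", `Gather facts for: ${outline}`);
  let draft = await callLLM("writer", `Draft using: ${research}`);
  // The critique loop is what separates collaboration from a fixed pipeline:
  // agents revise each other's work instead of just passing it along.
  for (let round = 0; round < 2; round++) {
    const critique = await callLLM("critic", `Critique: ${draft}`);
    draft = await callLLM("writer", `Revise "${draft}" per: ${critique}`);
  }
  return draft;
}

swarm("the state of on-chain AI agents").then(console.log);
```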
AutoGPT, while not a traditional framework, pioneered the concept of autonomous AI agents. It demonstrated how an AI could take a high-level goal, break it down into subtasks, and work through them independently with minimal human input. Though it has limitations, AutoGPT sparked a wave of innovation in autonomous agents and influenced the design of the more structured frameworks that followed.
All of that context finally brings us to the rise of crypto-native AI agent frameworks. At this point, you might be wondering: when we have relatively mature frameworks like LangChain and CrewAI in Web2, why does Web3 even need its own frameworks? Surely developers can use these incumbents to build whatever agents they want? Given the industry's love for forcing Web3 into any and all narratives, such skepticism is justified.
We believe that there are three solid reasons for the existence of Web3-specific agent frameworks.
First, we believe that the majority of financial transactions will occur on blockchain rails in the future. This accelerates the need for a class of AI agents that can parse on-chain data, execute blockchain transactions, and manage digital assets across multiple protocols and networks. From automated trading bots that can detect arbitrage opportunities, to portfolio managers executing yield strategies, these agents depend on the deep integration of blockchain functionality in their core workflows. (Decentralised recently wrote about the intersection of DeFi and AI.)
Traditional Web2 frameworks don't offer native components for these tasks. You'd have to cobble together third-party libraries for interacting with smart contracts, parsing raw on-chain events, and handling private key management—introducing layers of complexity and potential vulnerabilities. Conversely, a dedicated Web3 framework would handle these capabilities out of the box, enabling developers to focus on the logic and strategy of their agents rather than wrestling with low-level blockchain plumbing.
Second, blockchains are not just about digital currencies. They offer a global, trust-minimized system of record with built-in financial instruments that can supercharge multi-agent coordination. Instead of relying on off-chain reputation or siloed databases, developers can use on-chain primitives, like staking, escrow, and incentive pools, to align the interests of multiple AI agents.
Imagine a swarm of agents collaborating to fulfil a complex task (e.g., data labelling for training a new model). Each agent’s performance could be tracked on-chain, with rewards automatically distributed based on contributions. The transparency and immutability of blockchain-based systems allow for fair compensation, more robust reputation tracking, and incentive schemes that evolve in real-time.
Crypto-native frameworks can explicitly bake these functionalities in, letting developers design incentive structures using smart contracts without reinventing the wheel each time they need an agent to trust or pay another agent.
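As an illustration, here is the kind of pro-rata payout logic such a contract might encode, sketched in TypeScript for readability. In production this would live in a smart contract; all names and numbers are ours.

```typescript
// Escrow-and-split sketch: a reward pool is divided among agents in
// proportion to their verified contributions (e.g., labels accepted
// by a validator agent).
interface Contribution { agent: string; unitsCompleted: number }

function distributeRewards(poolWei: bigint, work: Contribution[]): Map<string, bigint> {
  const total = work.reduce((sum, w) => sum + BigInt(w.unitsCompleted), 0n);
  const payouts = new Map<string, bigint>();
  for (const w of work) {
    // Pro-rata split of the escrowed pool.
    payouts.set(w.agent, (poolWei * BigInt(w.unitsCompleted)) / total);
  }
  return payouts;
}

const payouts = distributeRewards(10n ** 18n, [
  { agent: "0xLabeler1", unitsCompleted: 600 },
  { agent: "0xLabeler2", unitsCompleted: 300 },
  { agent: "0xValidator", unitsCompleted: 100 },
]);
console.log(payouts); // 0.6, 0.3, and 0.1 ETH respectively, in wei
```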
Third, while frameworks like LangChain already have mindshare and network effects, the AI agent space is still in its infancy. It's unclear what the final state of these systems will look like, and no single approach has locked down the market.
Cryptoeconomic incentives open up new possibilities for how frameworks are built, governed, and monetized—possibilities that don't map neatly onto legacy SaaS or Web2 economics. Experimentation in this early phase can unlock novel monetization strategies for the frameworks themselves, not just the agents built on top of them.
Associated with the popular project AI16Z, ElizaOS is a TypeScript-based framework for creating, deploying, and managing AI agents. It is designed as a Web3-friendly AI agent operating system that lets developers build agents with unique personalities, use flexible tooling for blockchain interactions, and scale easily to multi-agent systems.
Rig is an open-source AI agent framework developed by Playgrounds Analytics Inc., built using the Rust programming language to create modular and scalable AI agents. It is associated with the project AI Rig Complex (ARC).
Daydreams is a generative agent framework originally built to create autonomous agents for on-chain games, later extended to executing tasks on-chain more generally.
Pippin is an AI agent framework developed by Yohei Nakajima, creator of BabyAGI, designed to help developers create modular and autonomous digital assistants. Yohei started by building a standalone agent before extending it to a generalized framework.
ZerePy is an open-source Python framework designed to deploy autonomous agents across multiple platforms and blockchains, with a focus on creative AI and social media integration. Like Pippin, ZerePy started as the standalone agent Zerebro before extending into a framework.
To evaluate the strength of each of these frameworks, we put ourselves in the shoes of a developer looking to build an AI agent. What would they care about? We think it’s useful to break down the assessment into three main categories: the core, the functionality, and the developer experience.
You can think of the core of a framework as the foundation upon which all agents are built. If the core is weak, slow, or not constantly evolving, the agents created using the framework will suffer from the same limitations. The core can be evaluated based on:
Core Reasoning Loops: The brain of any agent framework; how it approaches problem-solving. Strong frameworks support everything from basic input-output flows to sophisticated patterns like chain-of-thought reasoning. Without robust reasoning, agents can't effectively break down complex tasks or evaluate multiple options, reducing them to glorified chatbots.
Memory Mechanisms: Agents need both short-term memory for ongoing conversations and long-term storage for persistent knowledge. Good frameworks don't just remember—they understand relationships between different pieces of information and can prioritize what's worth retaining versus forgetting.
Embedding and RAG Support: Modern agents must work with external knowledge like documentation and market data. Strong frameworks make it easy to embed this information and retrieve it contextually through RAG, grounding responses in specific knowledge rather than relying solely on base model training. (A retrieval sketch follows this list.)
Personality Configuration: The ability to shape how agents communicate—their tone, formality and character—is crucial for user engagement. Good frameworks make it simple to configure these traits, recognising that an agent's personality significantly impacts user trust.
Multi-Agent Coordination: Strong frameworks provide built-in patterns for agent collaboration, whether that's through structured dialogues, task delegation, or shared memory systems. This enables the creation of specialist teams where each agent brings unique capabilities to solve problems collectively.
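Here is the retrieval sketch promised above: a minimal embedding-plus-RAG step, with a stand-in `embed` function where a framework would call a real embedding model.

```typescript
// Toy embedding: real frameworks call an embedding model API here.
async function embed(text: string): Promise<number[]> {
  return Array.from({ length: 8 }, (_, i) => (text.charCodeAt(i % text.length) % 13) / 13);
}

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b) || 1);
}

// Retrieve the top-k documents most similar to the query, then ground the
// prompt in them instead of relying on the base model's training alone.
async function retrieve(query: string, docs: string[], k = 2): Promise<string[]> {
  const q = await embed(query);
  const scored = await Promise.all(
    docs.map(async (d) => ({ d, score: cosine(q, await embed(d)) }))
  );
  return scored.sort((x, y) => y.score - x.score).slice(0, k).map((s) => s.d);
}

retrieve("What is the staking APR?", [
  "Staking docs: current APR is set by governance.",
  "Bridge docs: transfers take ~15 minutes.",
  "Token docs: total supply is fixed.",
]).then((ctx) => console.log("Context for prompt:", ctx));
```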
Beyond the core capabilities, a framework's practical utility depends heavily on its features and integrations. Tools dramatically expand what agents can actually do. An agent with access to just an LLM can engage in conversation, but give it access to a web browser, and it can retrieve real-time information. Connect it to your calendar API, and it can actually schedule meetings. Each new tool exponentially increases an agent's capabilities. From a developer’s perspective, the higher the number of tools, the greater the optionality and surface for experimentation.
We evaluate the functionality of crypto-native frameworks across three dimensions (a configuration sketch follows the list):
AI Model Support and Features: Strong frameworks offer native integration with multiple language models—from OpenAI's GPT series to open-source alternatives like Llama and Mistral. But it's not just about LLMs. Support for other AI features like text-to-speech, browser use, image generation, and local model inference can dramatically expand an agent's capabilities. Robust model support is becoming table stakes for many of these frameworks.
Web3 Infrastructure Support: Building agents for crypto requires deep integration with blockchain infrastructure. This means supporting essential Web3 components like wallets for transaction signing, RPCs for chain communication, and indexers for data access. A strong framework should integrate with fundamental tools and services across the ecosystem, from NFT marketplaces and DeFi protocols to identity solutions and data availability layers.
Chain Coverage: While Web3 infrastructure support determines what agents can do, chain coverage determines where they can do it. The crypto ecosystem is growing into a fragmented multi-chain behemoth, highlighting the importance of extensive chain coverage.
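Taken together, these three dimensions tend to surface to the developer as configuration. Here is the sketch referenced above; the schema, provider names, and RPC URLs are all illustrative, not any framework's actual API.

```typescript
// Hypothetical runtime config showing the three dimensions: model support,
// Web3 infrastructure (wallet/RPC), and chain coverage.
type ModelProvider = "openai" | "anthropic" | "llama-local";
type Chain = { name: string; rpcUrl: string; chainId: number };

interface AgentRuntimeConfig {
  model: { provider: ModelProvider; name: string };
  speech?: boolean;              // e.g., text-to-speech support
  wallet: { signerEnv: string }; // key supplied via env var, never hardcoded
  chains: Chain[];               // coverage: where the agent can act
}

const config: AgentRuntimeConfig = {
  model: { provider: "llama-local", name: "llama-3-8b" },
  speech: true,
  wallet: { signerEnv: "AGENT_PRIVATE_KEY" },
  chains: [
    { name: "ethereum", rpcUrl: "https://eth.example-rpc.com", chainId: 1 },
    { name: "base", rpcUrl: "https://base.example-rpc.com", chainId: 8453 },
    { name: "solana", rpcUrl: "https://sol.example-rpc.com", chainId: 0 }, // no EVM chainId; illustrative
  ],
};

console.log(`Agent can act on: ${config.chains.map((c) => c.name).join(", ")}`);
```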
Finally, even the most powerful framework is only as good as its developer experience. A framework can have stellar capabilities, but if developers struggle to use it effectively, it will never gain widespread adoption.
The language a framework is built in directly impacts who can build with it. Python dominates both AI and data science, making it a natural choice for AI frameworks. Frameworks written in niche languages might offer unique advantages but risk isolating themselves from the broader developer ecosystem. JavaScript's ubiquity in web development makes it another strong contender, especially for frameworks targeting web integration.
Clear, comprehensive documentation is the lifeline for developers adopting a new framework. This goes beyond just API references, though those are crucial. Strong documentation includes conceptual overviews explaining core principles, step-by-step tutorials, well-commented example code, troubleshooting guides, and established design patterns.
The table below is a summary of how each framework fares across the parameters we just defined (ranked 1-5).
While discussing the reasoning behind every single data point is beyond the scope of this article, here are some things that stood out to us about each framework.
Eliza is by far the most mature framework on this list. What helps it stand out is the sheer number of features and integrations it supports, a consequence of the Eliza framework becoming the Schelling point for the crypto ecosystem's exposure to AI during the recent agentic wave.
Every blockchain and devtool rushed to integrate themselves into the framework because of the mindshare it generated (it currently has close to 100 integrations!). Concurrently, Eliza also attracted more developer activity than most frameworks. Eliza benefitted, at least in the short term, from some very clear network effects. The framework being written in TypeScript, a mature language used by both beginners and seasoned developers, only fueled this.
Eliza also stands out for the rich body of educational content and tutorials available for developers to build using the framework.
We’ve seen a range of agents that use the Eliza framework, including Spore, Eliza (the agent), and Pillzumi. A new version of the Eliza framework is due to be released in the coming weeks.
Rig’s approach is fundamentally different from Eliza’s. It stands out for having a strong, lightweight, and performant core. It supports various reasoning patterns including prompt chaining (sequentially applying prompts), orchestration (coordinating multiple agents), conditional logic, and parallelism (performing operations concurrently).
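Rig expresses these patterns in Rust; the sketch below translates two of them, chaining and parallelism, into generic TypeScript to show the control flow. It makes no claim about Rig's actual API.

```typescript
// Generic reasoning-pattern sketch (not Rig's Rust API).
async function callLLM(prompt: string): Promise<string> {
  return `out(${prompt.slice(0, 24)})`; // stub for a real model call
}

// Prompt chaining: each step's output feeds the next prompt.
async function chain(input: string, steps: string[]): Promise<string> {
  let acc = input;
  for (const step of steps) acc = await callLLM(`${step}: ${acc}`);
  return acc;
}

// Parallelism: independent prompts run concurrently, then get merged.
async function parallel(prompts: string[]): Promise<string> {
  const outs = await Promise.all(prompts.map(callLLM));
  return callLLM(`Merge these findings: ${outs.join(" | ")}`);
}

chain("raw tx log", ["classify", "summarize"]).then(console.log);
parallel(["check CEX price", "check DEX price"]).then(console.log);
```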
However, Rig isn't as integration-rich by itself. Instead, it takes a different approach, something the team calls the "Arc Handshake": the Arc team partners with high-quality teams across both Web2 and Web3 to extend Rig's functionality. Some of these partnerships include working with Soulgraph for agent personality, and with Listen and Solana Agent Kit for blockchain functionality.
That said, Rig has two drawbacks. First, it’s written in Rust, which, while extremely performant, is familiar to relatively few developers. Second, we’ve only seen a limited number of Rig-powered agents in the wild (AskJimmy being the exception), making it difficult to evaluate genuine developer adoption.
Before starting Daydreams, founder lordOfAFew was a top contributor to the Eliza framework. This gave him exposure to the framework's growth and, more importantly, to some of its weaknesses. Daydreams differs from other frameworks with its focus on chain-of-thought reasoning to help agents achieve long-term goals. When given a high-level, complex objective, agents engage in multi-step reasoning to propose actions, accept or discard them based on whether they advance the goal, and repeat this process to make progress. This makes agents created with Daydreams truly autonomous.
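A minimal sketch of that propose-evaluate-act loop, with `callLLM` stubbed and the scoring prompt invented for illustration:

```typescript
// Goal-directed loop: propose an action, score it against the long-term
// goal, discard low scorers, execute and remember the rest.
async function callLLM(prompt: string): Promise<string> {
  return prompt.includes("Score") ? "0.7" : "candidate action"; // stub
}

async function pursueGoal(goal: string, maxSteps = 10): Promise<void> {
  const history: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    // Propose via chain-of-thought over the goal and history so far.
    const action = await callLLM(
      `Goal: ${goal}\nSo far: ${history.join("; ")}\nThink step by step, propose one action.`
    );
    // Evaluate: keep the action only if it advances the long-term goal.
    const score = parseFloat(await callLLM(`Score 0-1: does "${action}" advance "${goal}"?`));
    if (score < 0.5) continue; // discard and re-plan
    history.push(action);      // accept: execute and remember the outcome
    console.log(`step ${i}: ${action} (score ${score})`);
  }
}

pursueGoal("win the tournament in an on-chain game");
```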
The founder's background in building gaming projects informs this approach. Games, on-chain ones in particular, are ideal breeding grounds to train agents and test their capabilities. Unsurprisingly, some of the first use cases for Daydreams agents have been in games like Pistols, Istarai and PonziLand.
The framework also has strong implementations for multi-agent collaboration and orchestration workflows.
Like Daydreams, Pippin was a late entrant to the framework game. Decentralised covered its launch in some detail in this post. The crux of Yohei’s vision is for an agent to be a “digital being” who, with access to the right tools, can operate intelligently and autonomously. This vision is reflected in Pippin’s simple and elegant core. In just a few lines of code, one can create a sophisticated agent that can operate autonomously and even write code for itself.
The downside to the framework is that it lacks even basic features like support for vector embeddings and RAG workflows. It also encourages developers to use the third-party library Composio for most integrations. It simply isn't as mature as the other frameworks discussed so far.
Some agents built using Pippin include Ditto and Telemafia.
ZerePy has a relatively simple core implementation. It effectively picks a single task from a set of configured tasks and executes it when needed. However, it lacks sophisticated reasoning patterns like goal-driven planning or chain-of-thought reasoning.
While it supports making inference calls to multiple LLMs, it has no embedding or RAG implementations, nor any primitives for memory or multi-agent coordination.
This lack of core features and integrations is reflected in ZerePy's adoption: we have yet to see any functional agents built with the framework running in the wild.
If all of that sounds very technical and theoretical, we don't blame you. A simpler question to ask would be: “What kind of agents can I build using these frameworks without having to write a bunch of code myself?”
To evaluate these frameworks in practice, we identified five common agent types developers frequently want to build. These represent different levels of complexity and test various aspects of each framework's capabilities.
Document Chat Agent: Tests core RAG capabilities, including document handling, context maintenance, citation accuracy and memory management. This test reveals how well frameworks navigate between genuine document understanding and simple pattern matching.
Chatbot: Evaluates memory systems and behavior consistency. The framework must maintain coherent personality traits, remember key information across sessions, and allow personality configuration, essentially transforming stateless chatbots into persistent digital entities.
On-chain Trading Bots: Stress-tests external integrations by processing real-time market data, executing cross-chain transactions, analyzing social sentiment and implementing trading strategies. This reveals how frameworks handle complex blockchain infrastructure and API connections.
Gaming NPCs: While the world has woken up to agents over the past year, they've played a crucial role as non-player characters (NPCs) in games for decades. Gaming agents are transitioning from rule-based to intelligent, LLM-powered ones, and remain a top use case for frameworks. Here, we test the ability of agents to understand their environment, reason through scenarios autonomously, and achieve their long-term goals.
Voice Assistant: Assesses real-time processing and user experience through speech handling, quick response times and messaging platform integration. This tests whether frameworks can power truly interactive applications beyond simple request-response patterns.
We gave each framework a score out of five for each of these agent types. Here is how they fared:
Most analyses weigh GitHub metrics like stars and forks heavily when evaluating these frameworks. Here, we’ll quickly touch upon what these metrics are and how indicative they are of the quality of a framework.
Stars act as the most visible popularity signal. They're essentially bookmarks that developers give to projects they find interesting or want to track. While a high star count suggests broad awareness and interest, it can be misleading. Projects sometimes accumulate stars through marketing rather than technical merit. Think of stars as social proof rather than a measure of quality.
Fork counts tell you how many developers have created their own copy of the codebase to build upon. More forks typically indicate that developers are actively using and extending the project. That said, many forks end up abandoned, so raw fork numbers need context.
The contributor count reveals how many different developers have actually committed code to the project. This is often more meaningful than stars or forks. A healthy number of regular contributors suggests the project has an active community maintaining and improving it.
We went one step further and designed our own metric: the contributor score. We evaluate the public history of each developer, including their past contributions to other projects, activity frequency, and the popularity of their account, to assign each contributor a score. We then average these scores across all contributors to a project, weighted by the number of contributions each makes.
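A worked sketch of that weighted average, with illustrative numbers:

```typescript
// Contributor score for a project: per-developer scores averaged,
// weighted by each developer's share of commits. Inputs are made up.
interface Contributor { score: number; commits: number }

function projectContributorScore(contributors: Contributor[]): number {
  const totalCommits = contributors.reduce((s, c) => s + c.commits, 0);
  return contributors.reduce((s, c) => s + c.score * (c.commits / totalCommits), 0);
}

// One prolific, highly rated developer and two occasional ones:
console.log(
  projectContributorScore([
    { score: 0.9, commits: 120 },
    { score: 0.4, commits: 10 },
    { score: 0.5, commits: 20 },
  ]).toFixed(2) // ≈ 0.81: the weighting favors whoever actually ships code
);
```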
What do these numbers mean for our frameworks?
In most cases, the number of stars can be discounted. They are not a meaningful indicator of adoption. The exception here is Eliza, which became the #1 trending repository across all of GitHub for a while, in line with it becoming the Schelling point for all of crypto AI. Additionally, notable developers like 0xCygaar contributed to the project. This is also reflected in the number of contributors Eliza attracted: roughly 10x more than other projects.
Apart from this, Daydreams stands out to us simply because of the high quality of developers it is attracting. As a late entrant that launched post peak-hype, it doesn’t benefit from the network effects of Eliza, though.
If you’re a developer, we hope that we’ve provided you with at least a starting point in choosing which framework to build with (if you need one at all). Beyond this, you still have to put in the hard work of testing out the suitability of each framework’s core reasoning and integrations against your use case. There is no escaping it.
From an observer’s point of view, it is important to remember that all of these AI agent frameworks are less than three months old. (Yes, it feels longer.) Within that period, they’ve gone from being extremely hyped to being called “vaporware.” Such is the nature of technology. Despite this volatility, we believe that this sector is a novel experiment in crypto that is interesting and here to stay.
What matters next is how these frameworks mature, both from a technological and monetization point of view.
As for technology, the biggest advantage a framework can create for itself is making it seamless for agents to interact on-chain. This is the number one reason a developer would choose a crypto-native framework over a general one. Additionally, agents and agent-building techniques are frontier technology problems globally, with new developments arriving daily. Frameworks have to constantly evolve and adapt to these advances as they happen.
How frameworks monetize themselves is more interesting. In these initial days, creating a Virtuals-inspired launchpad is the low-hanging fruit for projects. But we think there is a lot of room for experimentation here. We’re heading towards a future with millions of agents that specialize in every niche imaginable. Tools that help coordinate between them effectively can capture tremendous value from transaction fees. As the gateway to builders, frameworks, of course, are best positioned to capture this.
In parallel, the monetization of frameworks doubles as the older problem of monetizing open-source projects and rewarding contributors for what has historically been free, thankless work. If a team can crack the code on how to create a sustainable open-source economy while maintaining its fundamental ethos, the repercussions will echo far beyond agent frameworks.