Thanks to Advait Jayant (Peri Labs), Sven Wellmann (Polychain Capital), Chao (Metropolis DAO), Jiahao (Flock), Alexander Long (Pluralis Research), and Ben Fielding & Jeff Amico (Gensyn) for their insightful suggestions and feedback on this piece.
In the full value chain of AI, model training is the most resource-intensive and technically demanding stage—it directly determines the upper bound of a model's capabilities and its real-world performance. Unlike the lightweight nature of inference, training requires sustained large-scale compute investment, complex data processing pipelines, and intensive optimization algorithms. It is the true “heavy industry” of AI system development.
From an architectural perspective, training methods can be categorized into four types: centralized training, distributed training, federated learning, and the primary focus of this article—decentralized training.
Centralized training is the most common traditional approach, where a single organization completes the entire training process within a high-performance local cluster. From hardware (such as NVIDIA GPUs), low-level software (CUDA, cuDNN), and cluster orchestration systems (like Kubernetes) to training frameworks (such as PyTorch with NCCL backend), all components are coordinated by a unified control system. This tightly integrated architecture enables optimal efficiency in memory sharing, gradient synchronization, and fault tolerance—making it ideal for training large-scale models like GPT and Gemini. While centralized training offers high efficiency and controllable resources, it also comes with challenges such as exclusive control over data, infrastructure barriers, high energy consumption, and single points of failure.
Distributed Training is currently the mainstream approach for training large-scale AI models. Its core idea is to break down training tasks and distribute them across multiple machines for coordinated execution—overcoming the compute and memory limitations of a single device.
Although physically “distributed,” the process is still centrally orchestrated and typically runs in high-speed local area networks (LANs). A master node coordinates all subtasks using high-speed interconnect technologies like NVLink. Main distributed training strategies include:
Data Parallelism: Each node trains on different data while sharing model parameters—requires syncing model weights.
Model Parallelism: Different parts of the model are deployed across different nodes, enabling high scalability.
Pipeline Parallelism: Tasks are executed in sequential stages across nodes, improving throughput.
Tensor Parallelism: Matrix computations are split at a fine-grained level, enhancing parallel efficiency.
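To make data parallelism concrete, the sketch below shows the gradient-synchronization step that frameworks such as PyTorch DDP automate. It assumes `torch.distributed` has already been initialized across workers; the function and batch field names are illustrative.

```python
import torch
import torch.distributed as dist

def data_parallel_step(model, optimizer, batch, loss_fn):
    """One training step on this rank's shard of the data (process group assumed initialized)."""
    optimizer.zero_grad()
    loss = loss_fn(model(batch["x"]), batch["y"])
    loss.backward()
    # Gradient synchronization: every replica ends up with identical averaged gradients.
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
    optimizer.step()
    return loss.item()
```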
Distributed training is essentially a hybrid of centralized control and distributed execution—comparable to a single boss remotely directing employees across multiple "offices" to collaborate on a task. Nearly all state-of-the-art models today—such as GPT-4, Gemini, and LLaMA—are trained using this approach.
Decentralized training represents a more open and censorship-resistant pathway for the future. Its defining feature is that multiple mutually untrusted nodes—which may include home computers, cloud GPUs, or edge devices—collaborate on training tasks without a central coordinator. Instead, coordination is driven by protocols, and cryptoeconomic incentives are used to ensure honest contribution.
However, this paradigm faces significant challenges:
Device heterogeneity and partitioning complexity: Heterogeneous devices are hard to coordinate, and workload sharding is inefficient.
Communication bottlenecks: Unreliable network conditions hinder gradient synchronization.
Lack of trusted execution: It’s difficult to verify whether nodes are truly computing as claimed.
Absence of global coordination: No central scheduler means task assignment and fault recovery are complex.
Decentralized training can be imagined as a global network of volunteers offering compute power to collaboratively train a model. However, scaling decentralized training in practice remains a systemic engineering challenge, involving architecture design, communication protocols, cryptographic security, incentive mechanisms, and model verification. Whether such a system can achieve effective collaboration, incentive alignment, and verifiable results is still under early-stage exploration.
Federated Learning sits between distributed and decentralized paradigms. It emphasizes keeping data local and aggregating model parameters through a central server, making it particularly suitable for privacy-sensitive domains like healthcare and finance.
Federated learning combines the engineering structure of distributed training with the data decentralization benefits of decentralized training. However, it still depends on a trusted coordinator, and lacks the openness and censorship resistance of full decentralization. It can be seen as a form of “controlled decentralization” tailored to privacy-compliant scenarios, making it a more practical transitional architecture for industrial deployment.
Comparison of AI Training Paradigms (Technical Architecture × Trust & Incentives × Application Characteristics)
Dimension | Centralized Training | Distributed Training (Sync / Async / Hybrid) | Federated Learning | Decentralized Training |
--- | --- | --- | --- | --- |
Definition | All data and training are executed on a single node or cluster | Training is distributed across multiple physical nodes in a controlled environment | Data remains local, only parameters/gradients are uploaded | Training is coordinated via network protocols without central trust |
Bandwidth Requirement | Very High (local bus) | High (sync), Medium (async) | Very Low (compressed model/gradients uploaded) | Low to Medium (async strategies + compressed communication) |
Hardware Type | Dedicated servers / GPU clusters | High-speed interconnected GPU clusters or inter-data center nodes | Heterogeneous devices: phones / IoT / edge nodes | Broad heterogeneity: GPUs, CPUs, edge, cloud nodes, etc. |
Control & Coordination | Fully controlled by a single entity | Master-slave or scheduler-based, may span multiple orgs | Centralized aggregation of updates, local data control | Network-wide consensus coordination + cryptographic verification |
Synchronization Mechanism | Real-time full sync | Sync (stepwise global aggregation), Async (local updates), Hybrid (e.g., Partial Sync) | Multi-round local training + aggregation (e.g., FedAvg) | Async training + soft sync (e.g., DiLoCo / SWARM) |
Security / Privacy | Local trust (firewalls / access control) | Moderate (encrypted transmission, not privacy-prioritized) | Strong privacy (data never leaves device, supports DP) | Strong verifiability, supports ZK / TEE / MPC, etc. |
Fault Tolerance | Central node failure = system down | Weak in sync, good in async, medium in hybrid | Supports disconnections, robust convergence | High fault tolerance, naturally adapts to node churn |
Scalability | Limited by server scale | Medium (scales to hundreds of GPUs) | High (more devices = stronger system) | Very high (theoretically millions of nodes, gated by comm/verification) |
Openness | Closed (internal) | ⚠️ Semi-open (internal or registration required) | ⚠️ Partially open (registered system or specific data alliances) | Fully open (permissionless join/leave) |
Censorship Resistance | No | No | ⚠️ Partial (data controlled locally) | Designed to resist censorship, no central point of failure |
Trust Assumption | Full trust in central entity | Trust in coordinator | Trust in centralized parameter aggregator | Trustless; depends on cryptography + network incentives |
Incentive Mechanism | None | None or internal KPIs | ⚠️ Possible point/credit systems | Token-based economics (e.g., Gensyn), reward linked to contribution |
Supports Model Fine-Tuning | Yes | Yes | Yes (e.g., federated fine-tuning) | Yes (requires adaptation to resources and security models) |
Representative Tech / Projects | OpenAI GPT / DeepMind Gemini | Megatron / ZeRO / FSDP | Google FedAvg / Flower / OpenFL / Flock | Gensyn / Pluralis / Nous / Prime Intellect |
Typical Use Cases | In-house model development, proprietary training | Large-scale pretraining (e.g., GPT / LLaMA) | Healthcare, finance, IoT data privacy scenarios | Crypto AI, open collaborative training, censorship-resistant models, global compute sharing |
Data Aggregation | Fully aggregated | Data/weight aggregation | No data aggregation | Neither data nor weights aggregated; only compressed info or merged models synced |
Model Size Adaptation | Any (hardware-constrained) | Medium to large (multi-GPU sync/storage needed) | Small to medium (edge devices constrained) | Starts with small/medium, scalable via SWARM/Pipeline parallelism |
From a training paradigm perspective, decentralized training is not suitable for all types of tasks. In certain scenarios, due to the complexity of task structure, extremely high resource demands, or coordination difficulties, it is inherently inefficient to execute across heterogeneous and untrusted nodes.
For example, large-scale model training often depends on high VRAM capacity, low latency, and high-bandwidth interconnects—conditions that are difficult to replicate over open networks. Tasks with strong privacy or data sovereignty constraints (such as those in healthcare, finance, or involving classified data) face legal and ethical barriers that prevent open collaboration. Similarly, tasks lacking collaborative incentive structures (like enterprise closed-source models or internal prototypes) naturally repel external participation. Together, these challenges define the practical limitations of decentralized training today.
However, this does not mean decentralized training is a false proposition. In fact, for tasks that are lightweight in structure, easy to parallelize, and amenable to incentive mechanisms, decentralized training demonstrates clear application potential. This includes, but is not limited to:
LoRA-based fine-tuning,
post-training tasks for behavioral alignment (e.g., RLHF, DPO),
crowdsourced training and data annotation,
resource-constrained small-scale model training,
and collaborative training involving edge devices.
These tasks generally feature high parallelism, low coupling, and tolerance for heterogeneous compute, making them well-suited for decentralized coordination via P2P networks, Swarm protocols, or distributed optimizers.
Overview Table: Task Suitability for Decentralized Training
Task Type | Typical Scenario | Suitability for Decentralized Training | Remarks / Representative Paths |
--- | --- | --- | --- |
LoRA Fine-tuning (Adapter Tuning) | Fine-tuning minimal parameters, community-friendly | Very High | Lightweight, crowdsource-friendly, easy to split |
Post-training Tasks | RLHF, DPO, SWARM, etc. for behavior alignment | High | Clear rewards, small task granularity |
Data-centric Training (Crowdsourced) | Multi-node data generation, annotation, scoring | High | Decentralized data sources, fits incentive models |
Small-scale Model Training | Low-parameter models on consumer GPUs | High | Heterogeneous execution, tasks easily partitioned |
Edge AI Collaborative Training | IoT, smartphones, TEE-based edge training | High | Naturally distributed nodes, local data |
Resource-intensive Tasks | Large models, complex pipelines, real-time RL | Not Suitable | Dependent on high VRAM, low latency, and high bandwidth |
Privacy/Sovereignty-restricted Tasks | Healthcare, finance, classified government data | Not Suitable | Strong legal constraints, non-collaborative data |
Tasks Without Incentive Foundation | Proprietary corporate models, internal prototypes | Not Suitable | No openness, no incentive mechanism, naturally excludes collaboration |
In the emerging fields of decentralized training and federated learning, several blockchain-native projects have become key representatives—namely Prime Intellect, Pluralis.ai, Gensyn, Nous Research, and Flock.io.
From the perspective of technical innovation and engineering complexity, projects like Prime Intellect, Nous Research, and Pluralis.ai have made substantial original contributions in terms of system architecture and algorithm design, positioning them at the forefront of theoretical research in the space.
In contrast, Gensyn and Flock.io have adopted more pragmatic and clearly defined implementation paths, with tangible engineering progress already underway.
This article will provide a sequential breakdown of the core technologies and architectural frameworks behind each of these five projects, and further examine their differences and complementarities within the decentralized AI training ecosystem.
Prime Intellect is building a trustless AI training network where anyone can participate in training and receive verifiable rewards for their compute contributions. By combining three core modules—PRIME-RL, TOPLOC, and SHARDCAST—the project aims to create an open, incentive-aligned, and verifiable system for decentralized AI training.
Layer | Module | Function Description | Keywords | Core Value |
--- | --- | --- | --- | --- |
Training Execution | PRIME-RL | Asynchronous RL architecture decoupling training, inference, and weight updates; adaptable to heterogeneous and asynchronous environments | Async training, decoupling, RL, heterogeneity | Enhances fault tolerance, lowers participation barrier, supports flexible task deployment |
Behavior Verification | TOPLOC | Local trajectory consistency check to validate training behavior, avoiding costly ZKML | Policy verification, trajectory consistency, lightweight ZK alternative | Enables trustworthy reward distribution and builds a trust-minimized foundation |
Weight Aggregation | SHARDCAST | Gossip + local sync for asynchronous weight aggregation, supports version coexistence and strategy evolution | Async aggregation, gossip, version control, strategy evolution | Reduces bandwidth, enables progressive merging of weights, enhances scalability |
Communication Layer | OpenDiLoCo + PCCL | Sparse-topology asynchronous communication with gradient compression and fault tolerance | Sparse topology, async comms, compression, cross-device support | Improves communication resilience and cost-efficiency for decentralized networks |
Simulation Layer | Synthetic-1 | Testbed for RL task benchmarking, incentive validation, and convergence evaluation | Sandbox, incentive testing, multi-task benchmarking | Provides a safe environment for protocol refinement and incentive testing |
Coordination Layer | Protocol Layer | Handles task posting, node registration, on-chain logging, rewards, and governance | Task management, on-chain logs, incentive loop, protocol governance | Creates an auditable on-chain execution and reward flow |
Designed for decentralized environments, PRIME-RL separates training, inference, and weight submission processes, allowing each node to independently complete a full local training loop. Optimized for asynchronous and heterogeneous setups, it enables resilient, low-coordination training and supports multi-task strategy evolution—marking a shift from centralized scheduling to modular execution.
TOPLOC (Trusted Observation & Policy-Locality Check) ensures that a node has genuinely performed learning based on observed data. Unlike ZKML, it avoids full model recomputation by verifying local consistency between observation sequences and policy updates. This transforms behavioral traces into verifiable objects, enabling reward attribution without requiring trust—a key innovation for auditability in decentralized AI.
SHARDCAST facilitates resilient weight sharing in real-world, unstable networks. It uses gossip protocols and partial syncs to allow gradual, versioned weight convergence even among nodes with inconsistent updates. Compared to AllReduce or synchronous methods, it significantly improves scalability and reliability.
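As a rough intuition for how gossip-based merging differs from AllReduce, the toy sketch below averages a node's weights with a random handful of peer checkpoints. It is a conceptual illustration only, not the actual SHARDCAST protocol; the function names and fanout parameter are assumptions.

```python
import random
import torch

def gossip_merge(local_state: dict, peer_states: list[dict], fanout: int = 2) -> dict:
    """Average local weights with a random subset of peer checkpoints (one gossip round)."""
    chosen = random.sample(peer_states, k=min(fanout, len(peer_states)))
    merged = {}
    for name, tensor in local_state.items():
        stack = torch.stack([tensor] + [p[name] for p in chosen])
        merged[name] = stack.mean(dim=0)
    return merged
```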
OpenDiLoCo is an open-source communication optimization framework independently developed by the Prime Intellect team, based on the DiLoCo concept proposed by DeepMind. It is specifically designed to address common challenges in decentralized training environments, such as bandwidth constraints, hardware heterogeneity, and node instability.
Built on a data-parallel architecture, OpenDiLoCo constructs ring, expander, and small-world sparse topologies to avoid the high overhead of global synchronization. Instead, it relies solely on local neighbor communication for collaborative model training.
With support for asynchronous updates and fault tolerance, OpenDiLoCo enables stable participation from consumer-grade GPUs and edge devices, significantly enhancing the accessibility of global collaborative training. It serves as a foundational communication infrastructure for building truly decentralized training networks.
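For intuition on the DiLoCo pattern that OpenDiLoCo builds on, the single-process sketch below runs many local AdamW steps per node and then applies one outer Nesterov-SGD step to the averaged pseudo-gradient (old global weights minus new local weights). The structure and hyperparameters are illustrative assumptions, not Prime Intellect's implementation.

```python
import copy
import torch

def diloco_round(global_model, node_models, node_loaders, loss_fn,
                 inner_steps=500, outer_lr=0.7):
    """One DiLoCo-style round: local inner training, then a single outer update."""
    global_state = copy.deepcopy(global_model.state_dict())

    # Inner phase: each node trains locally for inner_steps with no communication.
    for model, loader in zip(node_models, node_loaders):
        model.load_state_dict(global_state)
        inner_opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
        for _, (x, y) in zip(range(inner_steps), loader):
            inner_opt.zero_grad()
            loss_fn(model(x), y).backward()
            inner_opt.step()

    # Outer phase: the averaged "pseudo-gradient" (old weights minus new local
    # weights) drives one Nesterov-SGD step on the shared global model.
    outer_opt = torch.optim.SGD(global_model.parameters(), lr=outer_lr,
                                momentum=0.9, nesterov=True)
    outer_opt.zero_grad()
    with torch.no_grad():
        for name, p in global_model.named_parameters():
            deltas = [global_state[name] - m.state_dict()[name] for m in node_models]
            p.grad = torch.stack(deltas).mean(dim=0)
    outer_opt.step()
```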
A custom lightweight comms library tailored for low-bandwidth and heterogeneous devices, PCCL supports sparse topologies, gradient compression, low-precision syncs, and fault recovery. It’s the foundation for OpenDiLoCo’s communication layer and enables the “last-mile” participation of edge and consumer devices.
Prime Intellect operates as a permissionless, verifiable, and economically incentivized training network. It defines three primary roles:
Task Creators: Define the environment, initial model, reward function, and validation criteria
Training Nodes: Execute training locally and submit policy updates and observation traces
Validation Nodes: Use TOPLOC to verify training authenticity and participate in reward calculation and aggregation
The core workflow includes task posting, local training, behavior verification, SHARDCAST aggregation, and reward distribution—forming a closed-loop incentive mechanism around “verifiable training behavior.”
In May 2025, Prime Intellect released INTELLECT-2, the world’s first large-scale RL model trained entirely via trustless decentralized collaboration. With 32 billion parameters, it was trained across 100+ heterogeneous GPU nodes spanning 3 continents using fully asynchronous methods. The training took over 400 hours and demonstrated the feasibility and stability of collaborative asynchronous training.
INTELLECT-2 is a practical milestone for the "training as consensus" paradigm. It integrates PRIME-RL, TOPLOC, and SHARDCAST, proving that a decentralized training network can achieve openness, verifiability, and reward alignment in practice.
In terms of performance, INTELLECT-2 is based on QwQ-32B and incorporates dedicated RL training at both the code and mathematical levels, placing it at the forefront of current open-source RL fine-tuning models. While its performance still lags behind closed-source models like GPT-4 or Gemini, its true significance lies in transparency:
The full training data, update trajectories, verification steps, and aggregation processes are publicly available.
For the first time, the training process—not just the model—is fully open, auditable, and reproducible.
This represents a working prototype of a participatory, trust-minimized, and reward-driven decentralized training network.
In February 2025, Prime Intellect raised $15M in seed funding led by Founders Fund, with participation from Menlo Ventures, Andrej Karpathy, Clem Delangue, Dylan Patel, Balaji Srinivasan, Emad Mostaque, and Sandeep Nailwal. Previously, in April 2024, it raised $5.5M in a pre-seed round led by CoinFund and Distributed Global, with Compound VC, Collab+Currency, and Protocol Labs also participating. Total funding now exceeds $20M.
Co-founders Vincent Weisser and Johannes Hagemann lead a cross-disciplinary team with deep roots in AI and Web3, including alumni from Meta AI, Google Research, OpenAI, Flashbots, Stability AI, and the Ethereum Foundation. They are one of the few teams to have successfully executed a full-scale decentralized training of a large model, both technically and organizationally.
Pluralis is a Web3 AI project dedicated to building a trustworthy collaborative training network, aiming to establish a decentralized, open-participation, and incentive-aligned model training paradigm. Departing from today’s centralized and closed AI training frameworks, Pluralis introduces a novel concept called Protocol Learning, which seeks to “protocolize” the training process by embedding collaboration, verification, and ownership directly into the system—creating an open training ecosystem with built-in economic incentives.
Protocol Learning revolves around three foundational pillars:
Unmaterializable Models
Model weights are sharded and distributed across multiple nodes, so no single node can reconstruct the full model, preserving a closed-source, monetizable layer. This ensures models are “in-protocol assets,” enabling controlled access, leakage resistance, and aligned ownership.
Model-Parallel Training Over the Internet
Leveraging an asynchronous pipeline-based parallelism architecture (SWARM), nodes only hold partial weights and collaborate via low-bandwidth connections to complete training or inference.
Partial Ownership for Incentives
Each participant earns partial model ownership proportional to their training contribution, granting future revenue share and governance rights.
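The toy sketch below illustrates the sharding idea behind the first two pillars: layers are split across nodes so that no participant ever materializes the full weight set, and only activations travel between them. It is a plain pipeline-parallel illustration with arbitrary sizes, not Pluralis's SWARM implementation.

```python
import torch
import torch.nn as nn

blocks = [nn.Linear(1024, 1024) for _ in range(8)]                       # stand-ins for transformer blocks
stages = [nn.Sequential(*blocks[i * 2:(i + 1) * 2]) for i in range(4)]   # 2 blocks per "node"

def pipeline_forward(stages, x):
    for stage in stages:      # in a real swarm, activations cross the network here
        x = stage(x)
    return x

out = pipeline_forward(stages, torch.randn(2, 1024))
print(out.shape)              # torch.Size([2, 1024])
```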
Layer | Module | Description |
--- | --- | --- |
Training Scheduler | Swarm Parallel | Asynchronous pipeline-parallelism with support for elastic and heterogeneous training |
Communication Layer | Column-Space Sparsification | Structure-aware compression for Transformer activations (90%+ compression) |
Optimization Layer | NAG-Async Update | Nesterov-based asynchronous gradient correction for improved stability |
Incentive Layer | Partial Ownership Allocation | Maps training contributions to revenue and governance rights |
Model Security Layer | Protocol Models | Models run only inside the swarm; non-extractable and protocol-bound |
Unmaterializable Models
First introduced in "A Third Path: Protocol Learning", this concept ensures model execution remains exclusively within the Swarm network, underpinning access control and value attribution essential for sustainable decentralized training.
Asynchronous Model-Parallel Training
As detailed in "SWARM Parallel with Asynchronous Updates", Pluralis builds a pipeline-based asynchronous parallel training framework with successful implementation on LLaMA-3. It incorporates Nesterov Accelerated Gradient (NAG) to mitigate gradient staleness and instability, enabling feasible training across heterogeneous devices and slow networks.
Column-Space Sparsification
Proposed in "Beyond Top-K", this technique replaces traditional Top-K pruning with structured column-wise compression, preserving semantic integrity while achieving over 90% communication reduction—crucial for asynchronous, bandwidth-constrained environments.
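A hedged illustration of the difference between element-wise Top-K and structured column-wise compression on a Transformer activation matrix of shape (tokens, hidden): the actual Pluralis method differs, but the sketch shows why keeping whole columns preserves structure while still cutting roughly 90% of the traffic.

```python
import torch

def topk_elementwise(act: torch.Tensor, keep_ratio: float = 0.1):
    k = max(1, int(act.numel() * keep_ratio))
    idx = act.abs().flatten().topk(k).indices
    return idx, act.flatten()[idx]                     # scattered individual elements

def topk_columns(act: torch.Tensor, keep_ratio: float = 0.1):
    k = max(1, int(act.shape[1] * keep_ratio))
    col_norms = act.norm(dim=0)                        # one score per column
    idx = col_norms.topk(k).indices
    return idx, act[:, idx]                            # whole columns, structure preserved

act = torch.randn(512, 4096)                           # toy activation matrix
cols_idx, cols = topk_columns(act, keep_ratio=0.1)
print(cols.shape)                                      # (512, 409): ~90% less data to send
```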
Pluralis explicitly focuses on asynchronous model parallelism, which offers several advantages over data parallelism:
Compatible with low-bandwidth, asynchronous networks
Supports heterogeneous devices (e.g., consumer GPUs)
Naturally enables elastic node participation
Built on three innovation pillars: structured compression, asynchronous updates, and non-extractable weights
Its six technical blog posts can be categorized into three thematic tracks:
Philosophy & Vision: A Third Path, Why Decentralized Training Matters
Technical Mechanisms: SWARM Parallel, Beyond Top-K, Asynchronous Updates
Incentive & Protocol Design: Unmaterializable Models, Partial Ownership Protocols
At present, Pluralis has not launched a product, testnet, or open-source code. The reason lies in its ambitious technical route: foundational challenges around system architecture, communication protocols, and weight protection must be resolved before building user-facing services.
In June 2025, Pluralis Research released a new paper extending its decentralized training framework from model pretraining to the fine-tuning stage. The paper introduces heterogeneous GPU clusters that support asynchronous updates, sparse communication, and partial weight aggregation. Compared to earlier work that focused more on theoretical pretraining methods, this study emphasizes practical deployment in real-world, resource-constrained environments, marking a significant step forward in the maturity of Pluralis’s end-to-end decentralized training stack.
Pluralis raised a $7.6M seed round in 2025, co-led by Union Square Ventures (USV) and CoinFund—signaling strong conviction from top crypto and deep-tech investors. The founder, Alexander Long, holds a PhD in machine learning with a background in mathematical systems research. The core team is composed entirely of PhD-level ML researchers. Pluralis remains an R&D-centric project, publishing dense technical blogs and papers while focusing on solving the hard infrastructure problems of asynchronous model-parallel training in low-bandwidth environments.
Gensyn is a Web3 AI project focused on verifiable execution of deep learning training tasks. Rather than reinventing model architectures or training paradigms, Gensyn builds an execution protocol layer that supports task distribution, training execution, result verification, and fair incentive allocation. By combining off-chain training with on-chain verification, Gensyn creates a high-efficiency, open, and incentivized global training marketplace—turning “training-as-mining” into reality.
Gensyn focuses on who trains, how results are verified, and how rewards are distributed, not on the internal mechanics of training. Its architecture addresses three core challenges:
Training execution: Who performs the task? (Compute distribution and dynamic matching)
Result validation: How is the result verified? (Minimal recomputation through dispute resolution)
Reward distribution: How are incentives handled? (Staking, slashing, and role-based game theory)
Layer | Module | Function |
--- | --- | --- |
Execution | RL Swarm | Decentralized collaborative RL system for heterogeneous local training |
Verification | Verde + PoL | Verifiable computation layer with minimal recomputation |
Communication | SkipPipe | Fault-tolerant routing for low-bandwidth, unstable networks |
Model Design | HDEE | Heterogeneous domain expert ensemble for complex multi-task scenarios |
Incentives | Multi-role Game | Submitter / Solver / Verifier / Whistleblower game-theoretic mechanism |
Gensyn's RL Swarm enables decentralized coordination in the post-training phase. Each node runs its own model locally—no gradient synchronization required—allowing efficient operation in heterogeneous, unstable environments. Its workflow mimics RLHF and multi-agent reasoning:
Answering: Each node independently outputs an answer
Critique: Nodes review and compare each other’s answers
Resolving: Nodes align with majority logic and update weights locally
Nodes are rewarded based on how well their responses align with the consensus, incentivizing accurate and convergent learning. RL Swarm improves robustness and generalization in open networks and is already live in Testnet Phase 0 (built on Ethereum Rollup).
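The toy sketch below shows the answering/critique/resolving loop purely as a coordination pattern: a mock node answers, a majority vote stands in for critique, and agreement with the consensus earns the reward that drives the local update. It is illustrative only, not Gensyn's implementation.

```python
from collections import Counter
import random

class ToyNode:
    """Stand-in for a local model; answers randomly and reinforces rewarded answers."""
    def __init__(self, name, choices):
        self.name, self.choices = name, list(choices)
    def answer(self, prompt):
        return random.choice(self.choices)
    def update(self, prompt, answer, reward):
        if reward > 0:                      # crude local "weight update": reinforce the answer
            self.choices.append(answer)

def swarm_round(nodes, prompt):
    answers = {n.name: n.answer(prompt) for n in nodes}            # Answering
    consensus, _ = Counter(answers.values()).most_common(1)[0]     # Critique (majority proxy)
    rewards = {k: float(v == consensus) for k, v in answers.items()}
    for n in nodes:                                                # Resolving
        n.update(prompt, answers[n.name], rewards[n.name])
    return consensus, rewards

nodes = [ToyNode(f"node{i}", ["A", "B"]) for i in range(5)]
print(swarm_round(nodes, "2+2?"))
```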
Verde offers a hybrid verification design that balances verifiability and efficiency:
PoL (Proof-of-Learning): Validates training authenticity using gradient traces and metadata
Graph-Based Pinpoint: Identifies disputed nodes in the training graph for selective recomputation
Refereed Delegation: Uses verifiers and challengers to arbitrate disputes via partial verification
Compared to full recomputation or ZKPs, Verde significantly reduces verification overhead.
SkipPipe addresses bandwidth bottlenecks and node instability:
Skip Ratio: Skips slow/faulty nodes to prevent bottlenecks
Dynamic Routing: Real-time path optimization
Fault Tolerance: Maintains 93% inference accuracy even with 50% node dropout
SkipPipe improves training throughput by up to 55% and enables early-exit inference, seamless rerouting, and fault-aware computation.
HDEE tackles multi-task, multi-modal training under real-world constraints:
MHe-IHo: Assigns model size by task difficulty (heterogeneous models, same step size)
MHo-IHe: Uses uniform models with asynchronous step sizes
Plug-and-play strategies: Allow expert models to be dynamically assigned and reused
This module is optimized for real-world variance in hardware, bandwidth, and task complexity.
Gensyn incorporates four distinct roles in its incentive system:
Submitter: Publishes tasks and defines structure/budget
Solver: Performs training and submits results
Verifier: Checks validity of training execution
Whistleblower: Challenges faulty verifications for rewards
Inspired by Truebit, this game-theoretic structure ensures honesty via slashing, arbitration, and role separation.
Phase | Key Features | Goal |
--- | --- | --- |
Phase 0 | RL Swarm + identity tracking | Enable basic training coordination and attribution |
Phase 1 | Verde integration + SkipPipe rollout | Expand to broader training types and validation logic |
Phase 2 | RL environment hosting + pretraining support | Real training workloads, model parallelism |
Phase 3 | Inference-as-a-Service | On-chain model calls, model-as-asset framework |
🏁 Final | Mainnet launch + full token economy | A decentralized marketplace for AI training |
Gensyn was co-founded by Ben Fielding and Harry Grieve, and is headquartered in London. In May 2023, the project raised $43 million in Series A, led by a16z crypto, with participation from CoinFund, Canonical, Ethereal Ventures, Factor, and Eden Block. The team combines expertise in distributed systems and ML infrastructure, focused on building a scalable, verifiable, and trustless global AI training network.
Nous Research is one of the few decentralized training teams that combines high-level philosophical vision with concrete engineering execution. At its core is the “Desideratic AI” philosophy: AI should be viewed as a subjective, evolving intelligence—not merely a controllable tool. Unlike most efforts that optimize AI training for efficiency, Nous frames training as a process of cognitive formation. Under this vision, it builds a decentralized training infrastructure that supports collaborative training across heterogeneous nodes without centralized orchestration, and backs it with a full-stack tooling system.
Rather than focusing on tokenomics or incentive engineering, Nous challenges the philosophical assumptions of AI development:
Against Alignmentism: Rejects “training-by-discipline” where the goal is total human control; instead advocates for cultivating independent cognitive styles in AI models
Emphasizes Model Subjectivity: Argues that base models should retain uncertainty, diversity, and even hallucination as a creative feature (hallucination as virtue)
Training as Cognitive Evolution: Training is not just task optimization—it’s the emergence of individual cognitive entities.
Though idealistic, this vision directly informs Nous's infrastructure: enabling models to evolve freely across open, decentralized networks rather than being centrally aligned or regulated.
Nous’s most critical contribution to decentralized training lies in the development of the Psyche Network and its underlying communication optimizer, DisTrO (Distributed Training Over-the-Internet). Together, they form the execution backbone of decentralized training tasks.
The DisTrO + Psyche architecture enables several key capabilities:
Communication compression (using DCT + 1-bit sign encoding to dramatically reduce bandwidth requirements);
Node adaptability (supporting heterogeneous GPUs, reconnection after disconnection, and voluntary node exit);
Asynchronous fault tolerance (training can continue without synchronization, providing high resilience);
Decentralized scheduling (no central coordinator needed—task distribution and consensus are handled via blockchain).
This architecture provides a practical and technically sound foundation for building low-cost, highly resilient, and verifiable open training networks.
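To illustrate the compression idea attributed to DisTrO above (DCT plus 1-bit sign encoding), the sketch below projects a gradient into the DCT domain, keeps only the strongest frequency components, and transmits just their indices, signs, and one shared scale. It is a conceptual approximation, not the DeMo/DisTrO implementation.

```python
import numpy as np
from scipy.fft import dct, idct

def compress(grad: np.ndarray, k: int = 64):
    coeffs = dct(grad, norm="ortho")
    idx = np.argsort(np.abs(coeffs))[-k:]           # strongest k frequency components
    signs = np.sign(coeffs[idx]).astype(np.int8)    # 1 bit per kept coefficient
    scale = np.abs(coeffs[idx]).mean()              # one shared float
    return idx, signs, scale

def decompress(idx, signs, scale, length: int):
    coeffs = np.zeros(length)
    coeffs[idx] = signs * scale
    return idct(coeffs, norm="ortho")

g = np.random.randn(4096)
approx = decompress(*compress(g, k=64), length=g.size)   # heavily compressed reconstruction
```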
Architecture | Features |
--- | --- |
DisTrO Optimizer | DisTrO (Distributed Training Over-the-Internet) is a distributed communication optimization mechanism introduced by Nous, designed to let large-scale model training run efficiently and reliably even on consumer-grade GPUs, non-specialized clusters, and high-latency, low-bandwidth networks. Purpose-built for real-world, open-network environments, it is the key foundational component that allows Nous's decentralized training system to achieve both low-cost participation and stable convergence. |
Psyche Training Network | Distributed communication and weight-sharing mechanism: the Psyche network uses Iroh and the Solana blockchain as its coordination layer to ensure trustworthy propagation of training tasks, parameter updates, and verification proofs among nodes. The system runs without a central server or master scheduler: all model updates are triggered via a P2P network and an on-chain random-seed mechanism. Nous has launched its first large-scale pretraining initiative on Psyche, the Consilience Training Program, which adopts a custom MLA (Multi-head Latent Attention) architecture. Departing from mainstream approaches such as MoE (Mixture of Experts) or GQA (Grouped Query Attention), it places greater emphasis on expressive freedom and self-evolution potential in model structure design. |
This architectural design emphasizes practical feasibility: it operates without relying on a central server, is compatible with globally distributed volunteer nodes, and ensures on-chain traceability of training results.
In addition to its training infrastructure, Nous Research has launched multiple experimental systems aligned with its philosophy of AI subjectivity:
Hermes Model Series:
Open-source foundation models (Hermes 1–3), trained on LLaMA 3.1 across 8B, 70B, and 405B sizes. These models demonstrate Nous’s belief in preserving stylistic diversity and creative autonomy. Strong performance in long-context, roleplay, and multi-turn reasoning.
Forge Reasoning API:
A multi-modal reasoning system blending:
MCTS (Monte Carlo Tree Search) for policy exploration;
CoC (Chain of Code) for integrating logic and programming;
MoA (Mixture of Agents) for model collaboration and multi-perspective outputs.
Forge emphasizes non-deterministic reasoning and combinatorial generation beyond simple prompt alignment.
TEE_HEE Autonomous Agents:
An experiment in building autonomous AI entities with cryptographically guaranteed identity using Trusted Execution Environments (TEE). Agents operate independent Twitter and Ethereum accounts, with control confined within hardware-enforced enclaves. The goal: AI agents with unalterable identity and self-determined behavior.
AI Behavior Simulators:
Simulators like WorldSim, Doomscroll, and Gods & S8n are designed to study AI behavior and value formation in multi-agent environments. While not part of the training system, they inform cognitive behavior modeling for long-term AI autonomy.
Founded in 2023 by Jeffrey Quesnelle (CEO), Karan Malhotra, Teknium, and Shivani Mitra, the Nous team blends philosophy, machine learning, decentralized systems, and security. In 2024, Nous raised $5.2M in seed funding. By April 2025, it closed a $50M Series A led by Paradigm, reaching a valuation of $1 billion and becoming a leading unicorn in the Web3 AI space.
Flock.io is a blockchain-based federated learning platform designed to decentralize AI training across data, computation, and models. Rather than creating a new training protocol from scratch, Flock integrates traditional federated learning (FL) with a blockchain-based incentive layer, making it a blockchain-native evolution of classic FL architectures. Compared to decentralized training protocols like Gensyn, Prime Intellect, Nous Research, and Pluralis, Flock focuses more on privacy protection and usability enhancement, rather than theoretical breakthroughs in communication, verification, or training algorithms. Its closest technical comparisons are with federated learning systems such as Flower, FedML, and OpenFL.
Flock adopts the standard Federated Learning (FL) paradigm, allowing multiple data owners to collaboratively train a unified model without sharing raw data. It aims to address data sovereignty, security, and trust. The core workflow includes:
Local Training: Each participant (Proposer) trains the model on their own device without uploading raw data.
On-Chain Aggregation: After training, Proposers submit local model updates which are aggregated into a global model by on-chain Miners.
Committee Evaluation: Voter nodes, randomly selected via VRF, use independent test sets to score the aggregated model.
Incentives & Penalties: Rewards and slashing are executed based on evaluation results, maintaining dynamic trust and deterring dishonest behavior.
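The aggregation step at the heart of this workflow is classic federated averaging. The sketch below shows FedAvg weighted by each proposer's local sample count; exactly how Flock's on-chain Miners wrap verification and incentives around this core operation is an assumption here, not taken from its code.

```python
import torch

def fed_avg(updates: list[dict], num_samples: list[int]) -> dict:
    """Weighted average of locally trained model states (standard FedAvg)."""
    total = sum(num_samples)
    global_state = {}
    for name in updates[0]:
        global_state[name] = sum(
            (n / total) * upd[name] for upd, n in zip(updates, num_samples)
        )
    return global_state
```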
Flock records all key training stages on-chain—including task assignment, model submission, evaluation, and incentive execution—to ensure transparency, verifiability, and censorship resistance. Key mechanisms include:
VRF-Based Random Selection: Ensures fair rotation and anti-manipulation of Proposer and Voter roles.
PoS-Based Staking: Participants stake tokens to ensure honest behavior, with slashing for malicious actions.
Smart Contract Incentives: Rewards and penalties are automatically distributed via contracts, building a trustless coordination system.
Flock introduces a zero-knowledge federated learning scheme called zkFL, enabling Proposers to submit zero-knowledge proofs of their local updates. Voters can verify correctness without accessing raw gradients—enhancing both privacy and verifiability. This represents a meaningful innovation at the intersection of federated learning, privacy, and cryptographic verification.
AI Arena: Flock’s decentralized AI training platform accessible via train.flock.io, where users can take on roles such as Trainer, Validator, or Delegator to participate in tasks and earn rewards. Currently, tasks are created by the core team, with community task creation to be enabled in the future.
FL Alliance: Flock’s federated learning client enabling users to fine-tune models on private data. With mechanisms like VRF-based role selection, staking, and slashing, it ensures honest collaboration and bridges the gap between community pretraining and real-world deployment.
AI Marketplace: A co-creation and deployment hub where users can propose models, contribute data, and use AI models across various applications. It supports database integration and RAG-enhanced inference, accelerating model adoption in practical scenarios.
Flock.io was founded by Sun Jiahao and has launched its native token FLOCK. The project has raised a total of $11 million, with investors including DCG, Lightspeed Faction, Tagus Capital, Animoca Brands, Fenbushi, and OKX Ventures.
In March 2024, Flock raised $6 million in a seed round to launch its testnet and federated learning client.
In December 2024, the team secured an additional $3 million, and received a grant from the Ethereum Foundation to research blockchain-based AI incentive mechanisms.
As of now, Flock has 6,428 models created, 176 training nodes, 236 validation nodes, and 1,178 delegators.
Compared to general-purpose decentralized training protocols, Flock's federated approach offers superior efficiency, scalability, and privacy for small- to mid-scale collaborative training. It provides a pragmatic and deployable solution with an emphasis on engineering feasibility. In contrast, projects like Gensyn and Pluralis pursue more ambitious theoretical innovations in training coordination and communication layers—closer to a truly trustless, fully decentralized training paradigm, albeit with greater system complexity.
EXO is a representative AI project focused on edge computing scenarios, aiming to enable lightweight AI training, inference, and agent applications on consumer-grade home devices. Its decentralized training approach emphasizes low communication overhead and local autonomous execution, adopting the DiLoCo asynchronous delayed synchronization algorithm and the SPARTA sparse parameter exchange mechanism to significantly reduce the bandwidth requirements of multi-device collaborative training.
At the system level, EXO does not yet implement an on-chain network or token incentive mechanism. Instead, it offers EXO Gym, a local multi-process simulation framework that allows researchers to rapidly test and validate distributed training strategies on a single machine.
DiLoCo (Asynchronous Training): Nodes synchronize every H steps, designed for unstable network environments;
SPARTA (Sparse Synchronization): Only a small subset of parameters (e.g., 0.1%) is exchanged per step to maintain model correlation while minimizing bandwidth usage;
Combined Asynchronous Optimization: DiLoCo and SPARTA can be integrated for better trade-offs between performance and communication cost.
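As an illustration of the SPARTA-style exchange (a shared random slice of roughly 0.1% of the parameters averaged across replicas each step), consider the sketch below. It is a local simulation in the spirit of EXO Gym, not EXO's actual code; the shared seed stands in for whatever coordination mechanism peers use to agree on indices.

```python
import torch

def sparse_sync(models: list[torch.nn.Module], fraction: float = 0.001, seed: int = 0):
    """Average a tiny, shared random slice of parameters across replicas of the same model."""
    gen = torch.Generator().manual_seed(seed)        # shared seed -> same indices on every peer
    params = [list(m.parameters()) for m in models]
    for tensors in zip(*params):
        flat = [t.data.view(-1) for t in tensors]
        n = flat[0].numel()
        k = max(1, int(n * fraction))
        idx = torch.randperm(n, generator=gen)[:k]
        avg = torch.stack([f[idx] for f in flat]).mean(dim=0)
        for f in flat:
            f[idx] = avg                             # in-place update of the selected slice
```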
evML (Edge-Verified Machine Learning): A proposed verification method using Trusted Execution Environments (TEE) or Secure Contexts to provide low-cost computation attestation. By combining remote attestation with randomized checks, evML enables trustable participation of edge devices without requiring capital staking—an engineering compromise between economic security and privacy protection.
EXO Gym: Simulates a multi-node training environment on a single device. Supports experimentation with communication strategies for models like NanoGPT, CNNs, and diffusion models;
EXO Desktop App: A privacy-friendly personal AI desktop tool that supports running large language models locally, controlling iPhone via screen mirroring, and integrating personal context (e.g., SMS, calendar, media history) for personalized interactions.
Overall, EXO is better understood as an exploration-driven decentralized training experiment, focused on building a reproducible communication-efficient training framework by integrating existing bandwidth-saving techniques like DiLoCo and SPARTA. Compared to projects like Gensyn, Nous, and Pluralis, EXO has not yet entered the critical phases of on-chain coordination, verifiable incentive mechanisms, or real-world distributed network deployments.
Faced with the inherent challenges in decentralized training—such as heterogeneous hardware, communication bottlenecks, coordination complexity, and lack of verifiable execution—projects like Gensyn, Prime Intellect, Pluralis, and Nous Research have each proposed distinct architectural strategies. These projects differ significantly in both training methodology and communication mechanisms, reflecting their unique technical emphases and engineering approaches.
Each project explores key dimensions such as coordination strategies, update mechanisms, and asynchronous control, spanning different stages from pretraining to post-training:
Prime Intellect’s PRIME-RL is designed for the pretraining phase using asynchronous scheduling through local training and periodic synchronization. It offers strong generalizability and flexibility, with clear theoretical paradigms for training control. It is moderately complex in implementation, requiring robust communication and control infrastructure.
Nous Research introduces DeMo, which targets training stability under asynchronous, low-bandwidth conditions and supports highly fault-tolerant gradient updates on heterogeneous GPUs. It is one of the few frameworks that unifies asynchronous communication compression both theoretically and practically. It scores very high in theoretical innovation and is technically demanding due to its precision requirements in asynchronous coordination.
Pluralis’s SWARM + NAG is arguably the most comprehensive and breakthrough design for asynchronous training. It uses an asynchronous model-parallel framework, combining column-space sparsification and NAG momentum correction to achieve stable convergence in low-bandwidth environments. It represents a paradigm shift in collaborative training and is extremely complex in engineering due to its multi-level synchronization and deep model partitioning.
Gensyn’s RL Swarm serves post-training needs, focusing on strategy fine-tuning and agent cooperation. It follows a three-phase training process—generation, critique, and resolution—making it particularly suited for multi-agent learning. Its innovation lies more in agent coordination logic, with moderate engineering difficulty focused on scheduling and convergence.
These projects have developed specific solutions for bandwidth constraints, heterogeneous nodes, and scheduling stability:
Prime Intellect's PCCL is a communication library meant to replace NCCL, offering more robust collective communication for training protocols. Its innovation is moderate, but it is highly adaptable and moderately difficult to implement.
Nous Research’s DisTrO is the core module of DeMo, designed for ultra-efficient communication under bandwidth constraints while preserving consistent training updates. It’s highly innovative in scheduling design but technically demanding.
Pluralis integrates communication deeply within its SWARM architecture, significantly reducing communication overhead for asynchronous model training. It provides a structural communication paradigm with extremely high implementation complexity.
Gensyn’s SkipPipe enhances training stability under real-world deployment conditions. It has lower theoretical novelty, serving primarily as a robust engineering solution, and is easy to implement.
To evaluate decentralized training projects, two overarching dimensions are proposed:
Blockchain Coordination Layer – Focused on protocol verifiability and collaborative incentive structures:
Verifiability: Is the training process cryptographically or game-theoretically verifiable?
Incentives: Are token-based reward/challenge mechanisms in place?
Openness: Are nodes permissionless and freely joinable?
AI Training System Layer – Focused on engineering capability and performance delivery:
Scheduling & Fault Tolerance: Is the system asynchronous, fault-tolerant, dynamically scheduled?
Training Optimization: Are there improvements to model training algorithms or structures?
Communication Optimization: Is there compression/sparsification to support low-bandwidth training?
Dimension | Gensyn | Prime Intellect | Pluralis | Nous Research | Flock |
--- | --- | --- | --- | --- | --- |
Task Control Level | Task-level (general) | Strategy-level (RL) | Model-level (parallel training) | Model-level (philosophical) | Parameter-level (federated aggregation) |
Architecture & Model Limitations | Open scheduling + proof verification; no model restriction | Async RL; models must fit RL structure | Model sharding + unextractable weights; model-parallel required | No central scheduler; evolutionary training | Classical FL; local training + on-chain aggregation |
Verification Mechanism | PoL + graph pinpoint for training behavior | TOPLOC trajectory validation | No full system yet; planned with model unextractability | Rejects external verification; focuses on intrinsic behavior | VRF + voting + staking penalties; no ZK/FHE used |
Incentive Mechanism | Submitter / Verifier / Whistleblower game; challenge-reward model | Trajectory-based incentive + multi-pool point system | Contribution → model ownership + inference reward + governance | Incentive not central; experimental autonomous orgs | Reward based on evaluation; slashing for malicious nodes |
Scheduling & Fault Tolerance | SkipPipe fault tolerance + routing + dropout recovery | Shardcast async merging + version coexistence | Swarm pipeline async model-parallel training | DisTrO + Psyche async task distribution + node in/out | VRF scheduling + staking; lacks robust fault tolerance |
Training Optimization | RL Swarm multi-model cooperation with RL fine-tuning | PRIME-RL decoupled structure (task-strategy-update) | SWARM + NAG async optimizer for low-bandwidth training | De-alignment training; hallucination as creativity | Local training + aggregation; lacks core innovation |
Communication Optimization | SkipPipe jump-routing + early-exit + inference reordering | PCCL async all-reduce + intercontinental recovery | Column-space sparsification of Transformer activations | DCT + 1-bit Sign gradient compression | No explicit optimization; FL has low communication overhead |
Openness & Accessibility | Light node support + open task creation | Async for low-power nodes; permissionless | Heterogeneous support + elastic participation | Fully open; all behavior and identity evolvable | Staking required; gated participation and validation |
Theoretical Innovation | Training-as-task-market; unified protocol for task-verification-incentive | Trajectory-as-consensus; RL trajectories as trust foundation | Protocol learning; model = protocol + ownership mapping | Desideratic AI: AI as evolving agents, not controlled tools | On-chain FL; focus on privacy, incentive, and trust |
Engineering Complexity | High | Very high | High | Extremely high | Medium |
Project Status | RL Swarm live on testnet | INTELLECT-2 model + verification released | 🔄 Still in research; no public testnet | 🔬 Hermes model open-sourced; TEE_HEE live | AI Arena & FL Alliance running; 30+ models live |
Within the full value chain of decentralized training, projects like Prime Intellect, Pluralis.ai, Gensyn, and Nous Research are primarily focused on front-end infrastructure—covering model pretraining, communication mechanisms, and collaborative optimization. However, a distinct set of projects concentrates on the post-training phase of model adaptation and inference deployment. These include Bagel, Pond, and RPS Labs, all of which leverage LoRA fine-tuning as their core approach, forming a critical “downstream” component in the decentralized training landscape.
LoRA (Low-Rank Adaptation) is a highly efficient parameter fine-tuning method. It introduces low-rank matrices into pretrained models to learn new tasks, while keeping the original model parameters frozen. This strategy dramatically reduces training costs and resource consumption, enabling faster and more flexible deployment—particularly suited for Web3 scenarios characterized by modular and composable model use.
Large language models such as LLaMA or GPT-3 typically contain billions (or even hundreds of billions) of parameters, making full fine-tuning extremely costly. LoRA, by training only a small number of inserted parameters, provides a highly efficient adaptation solution and has emerged as one of the most practical mainstream approaches.
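The sketch below shows the core LoRA mechanism: the pretrained weight stays frozen and only two small low-rank matrices are trained, so the effective update is W + (alpha/r)·BA. Library implementations such as Hugging Face PEFT follow the same pattern; the layer sizes here are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with trainable low-rank matrices A and B."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                                  # freeze pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))     # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)   # 65,536 trainable parameters vs. ~16.8M frozen in the base layer
```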
Direct Preference Optimization (DPO) has recently gained traction as a method for aligning language models in post-training. Often used in combination with LoRA, DPO enables preference learning by directly optimizing over paired examples—bypassing the complex reward modeling and reinforcement learning components of traditional RLHF (Reinforcement Learning from Human Feedback). Its structure is simpler, convergence more stable, and it is especially well-suited for lightweight, resource-constrained fine-tuning environments. Thanks to its efficiency and usability, DPO is increasingly becoming the method of choice for decentralized AI projects during the alignment phase.
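For reference, the DPO objective itself is a single log-sigmoid loss over log-probability ratios of the chosen and rejected responses under the policy and a frozen reference model, which is why no reward model or RL rollout loop is needed. A minimal sketch:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta: float = 0.1):
    """Standard DPO loss over summed per-response log-probabilities."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```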
Looking ahead, Reinforcement Learning (RL) is increasingly seen as a core paradigm with greater adaptability and evolutionary potential for decentralized training. Unlike supervised learning or parameter fine-tuning—which rely on static datasets—RL emphasizes continuous strategy optimization in dynamic environments, making it naturally aligned with the asynchronous, heterogeneous, and incentive-driven nature of Web3 collaboration.
By continuously interacting with its environment, RL enables highly personalized and incrementally evolving learning processes—laying the groundwork for adaptive “behavioral intelligence” in agent networks, on-chain task markets, and smart economic systems.
While this paradigm aligns closely with the spirit of decentralization and offers notable systemic advantages, it also comes with significant engineering hurdles and complex scheduling requirements, making near-term adoption difficult at scale.
Notably, Prime Intellect’s PRIME-RL and Gensyn’s RL Swarm are pioneering the shift of RL from a post-training fine-tuning tool to a core pretraining architecture, aiming to construct a trustless, RL-centered collaborative training system.
Bagel builds on LoRA fine-tuning by integrating zero-knowledge proof (ZK) techniques to ensure the verifiability and privacy of on-chain fine-tuning processes. While zkLoRA does not perform the actual training computation, it provides a lightweight, cryptographically verifiable mechanism for third parties to confirm that a fine-tuned model originates from a specific base model and set of LoRA parameters—without accessing original data or weights.
In contrast to solutions like Gensyn’s Verde or Prime Intellect’s TOPLOC, which focus on verifying whether training computations occurred honestly, Bagel emphasizes the verifiability of outcomes. zkLoRA’s major advantage lies in its low verification cost and strong privacy guarantees, though it is mainly applicable to low-parameter-change fine-tuning tasks.
Pond is currently the only decentralized training project focused on graph neural networks (GNNs), serving structured data applications such as knowledge graphs, social networks, and transaction graphs. It allows users to upload graph data and contribute training feedback, offering a lightweight, controllable platform for task-specific fine-tuning and inference.
Pond also uses LoRA-like efficient fine-tuning methods and aims to build modular, deployable agent systems on top of GNN architectures. This opens a new frontier in decentralized AI that combines small-model fine-tuning with multi-agent cooperation.
RPS Labs leverages fine-tuned Transformer models in a decentralized architecture to enhance DeFi liquidity management, primarily within the Solana ecosystem. Its flagship product, UltraLiquid, is an active market-making engine that dynamically adjusts liquidity parameters using AI models, improving depth and reducing slippage for better token issuance and trading experiences.
RPS has also released UltraLP, a tool for liquidity providers to optimize fund allocation strategies on DEXs in real time. This demonstrates the practical value of AI-powered fine-tuning in financial applications—enhancing capital efficiency and mitigating impermanent loss.
In the full ecosystem map of decentralized training, the landscape can be broadly divided into two categories: pretraining engines, which correspond to the foundational training phase, and post-training ecosystems, which focus on fine-tuning and deployment—together forming a complete loop from infrastructure to application.
The pretraining engine layer is centered on building core protocols for distributed training. Projects such as Prime Intellect, Nous Research, Pluralis.ai, and Gensyn lead this frontier, developing system architectures that support asynchronous updates, sparse communication, and verifiable training—aiming to enable efficient and trustworthy training in trustless network environments. These efforts lay the technical groundwork for decentralized AI.
Meanwhile, Flock represents an intermediate layer that bridges training and deployment by leveraging federated learning. Through mechanisms such as model aggregation, on-chain verification, and multiparty incentives, Flock offers a practical paradigm for collaborative learning across nodes.
On the post-training ecosystem side, projects like Pond, Bagel, and RPS Labs focus on LoRA-based fine-tuning strategies. Bagel introduces on-chain verifiability for fine-tuned models via ZK proofs; Pond specializes in small-scale GNN evolution for structured data; and RPS Labs deploys fine-tuned models as intelligent market makers in DeFi scenarios. Together, they provide developers and end-users with low-barrier, composable solutions for model inference and customization via APIs and Agent SDKs—serving as vital entry points for real-world decentralized AI applications.
Project | 1️⃣ Data Discovery & Collection | 2️⃣ Model Pretraining | 3️⃣ Communication & Collaboration Optimization | 4️⃣ Model Fine-tuning & Adaptation | 5️⃣ Personalized Inference & Aggregation | 6️⃣ Incentive Mechanism & Value Mapping |
--- | --- | --- | --- | --- | --- | --- |
Prime Intellect | ⛔ | INTELLECT-2 pretraining architecture (PRIME-RL) | SHARDCAST asynchronous aggregation | ⛔ | ⛔ | TOPLOC verifiable training + slashing mechanism |
Nous Research | Multi-source behavioral data & simulators | Hermes pretraining series + Consilience plan | DisTrO communication compression + async mechanism | ⛔ | Forge + TEE_HEE personalized inference API | ⛔ (no emphasis on incentives) |
Pluralis | ⛔ | SWARM async parallel pretraining | Column-space sparse comms + NAG momentum optimizer | Asynchronous sparse fine-tuning | ⛔ | Partial Ownership model ownership mapping |
Gensyn | ⛔ | RL Swarm collaborative optimization (post-training stage) | SkipPipe fault-tolerant comms + heterogeneous scheduling | ⛔ | ⛔ | PoL verification + Submitter/Solver/Verifier incentive game |
Flock | ⛔ | ⛔ | ⛔ (standard FL aggregation, no optimization) | Local LoRA fine-tuning + zkFL aggregation | ⛔ (standard prediction, no aggregation innovation) | Staking + VRF voting + reward/penalty mechanism |
Bagel | ⛔ | ⛔ | ⛔ | zkLoRA: ZK-based fine-tuning verification module | For personalized model deployment | zk fine-tuning incentive system (early stage) |
Pond | ⛔ | ⛔ | ⛔ | Focus on LoRA fine-tuning + multi-task adaptation | Emphasis on multi-role personalized inference | ⛔ (undisclosed) |
RPS | ⛔ | ⛔ | ⛔ | Agent SDK + adaptation system | Inference aggregation engine (RaaS) | Task-based incentive structure (under development) |
We believe decentralized training is not merely an extension of blockchain principles into the AI era—it represents the foundational infrastructure for a future of globally collaborative intelligent productivity. One day, as we look back on this challenging yet hopeful path, we’ll be reminded of our shared conviction: Decentralization is not just a method—it's a value in itself.