Thanks to Advait Jayant (Peri Labs), Sven Wellmann (Polychain Capital), Chao (Metropolis DAO), Jiahao (Flock), Alexander Long (Pluralis Research), and Ben Fielding & Jeff Amico (Gensyn) for their insightful suggestions and feedback on this piece.
In the full value chain of AI, model training is the most resource-intensive and technically demanding stage—it directly determines the upper bound of a model's capabilities and its real-world performance. Unlike the lightweight nature of inference, training requires sustained large-scale compute investment, complex data processing pipelines, and intensive optimization algorithms. It is the true “heavy industry” of AI system development.
From an architectural perspective, training methods can be categorized into four types: centralized training, distributed training, federated learning, and the primary focus of this article—decentralized training.
Centralized training is the most common traditional approach, where a single organization completes the entire training process within a high-performance local cluster. From hardware (such as NVIDIA GPUs), low-level software (CUDA, cuDNN), and cluster orchestration systems (like Kubernetes) to training frameworks (such as PyTorch with NCCL backend), all components are coordinated by a unified control system. This tightly integrated architecture enables optimal efficiency in memory sharing, gradient synchronization, and fault tolerance—making it ideal for training large-scale models like GPT and Gemini. While centralized training offers high efficiency and controllable resources, it also comes with challenges such as exclusive control over data, infrastructure barriers, high energy consumption, and single points of failure.
Distributed Training is currently the mainstream approach for training large-scale AI models. Its core idea is to break down training tasks and distribute them across multiple machines for coordinated execution—overcoming the compute and memory limitations of a single device.
Although physically “distributed,” the process is still centrally orchestrated and typically runs in high-speed local area networks (LANs). A master node coordinates all subtasks using high-speed interconnect technologies like NVLink. Main distributed training strategies include:
Data Parallelism: Each node trains on different data while sharing model parameters—requires syncing model weights.
Model Parallelism: Different parts of the model are deployed across different nodes, enabling high scalability.
Pipeline Parallelism: Tasks are executed in sequential stages across nodes, improving throughput.
Tensor Parallelism: Matrix computations are split at a fine-grained level, enhancing parallel efficiency.
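To make data parallelism concrete, the sketch below shows the gradient-synchronization step that frameworks such as PyTorch DDP automate. It assumes `torch.distributed` has already been initialized across workers; the function and batch field names are illustrative.

```python
import torch
import torch.distributed as dist

def data_parallel_step(model, optimizer, batch, loss_fn):
    """One training step on this rank's shard of the data (process group assumed initialized)."""
    optimizer.zero_grad()
    loss = loss_fn(model(batch["x"]), batch["y"])
    loss.backward()
    # Gradient synchronization: every replica ends up with identical averaged gradients.
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
    optimizer.step()
    return loss.item()
```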
Distributed training is essentially a hybrid of centralized control and distributed execution—comparable to a single boss remotely directing employees across multiple "offices" to collaborate on a task. Nearly all state-of-the-art models today—such as GPT-4, Gemini, and LLaMA—are trained using this approach.
Decentralized training represents a more open and censorship-resistant pathway for the future. Its defining feature is that multiple mutually untrusted nodes—which may include home computers, cloud GPUs, or edge devices—collaborate on training tasks without a central coordinator. Instead, coordination is driven by protocols, and cryptoeconomic incentives are used to ensure honest contribution.
However, this paradigm faces significant challenges:
Device heterogeneity and partitioning complexity: Heterogeneous devices are hard to coordinate, and workload sharding is inefficient.
Communication bottlenecks: Unreliable network conditions hinder gradient synchronization.
Lack of trusted execution: It’s difficult to verify whether nodes are truly computing as claimed.
Absence of global coordination: No central scheduler means task assignment and fault recovery are complex.
Decentralized training can be imagined as a global network of volunteers offering compute power to collaboratively train a model. However, scaling decentralized training in practice remains a systemic engineering challenge, involving architecture design, communication protocols, cryptographic security, incentive mechanisms, and model verification. Whether such a system can achieve effective collaboration, incentive alignment, and verifiable results is still under early-stage exploration.
Federated Learning sits between distributed and decentralized paradigms. It emphasizes keeping data local and aggregating model parameters through a central server, making it particularly suitable for privacy-sensitive domains like healthcare and finance.
Federated learning combines the engineering structure of distributed training with the data decentralization benefits of decentralized training. However, it still depends on a trusted coordinator, and lacks the openness and censorship resistance of full decentralization. It can be seen as a form of “controlled decentralization” tailored to privacy-compliant scenarios, making it a more practical transitional architecture for industrial deployment.
Comparison of AI Training Paradigms (Technical Architecture × Trust & Incentives × Application Characteristics)
Dimension | Centralized Training | Distributed Training (Sync / Async / Hybrid) | Federated Learning | Decentralized Training |
--- | --- | --- | --- | --- |
Definition | All data and training are executed on a single node or cluster | Training is distributed across multiple physical nodes in a controlled environment | Data remains local, only parameters/gradients are uploaded | Training is coordinated via network protocols without central trust |
Bandwidth Requirement | Very High (local bus) | High (sync), Medium (async) | Very Low (compressed model/gradients uploaded) | Low to Medium (async strategies + compressed communication) |
Hardware Type | Dedicated servers / GPU clusters | High-speed interconnected GPU clusters or inter-data center nodes | Heterogeneous devices: phones / IoT / edge nodes | Broad heterogeneity: GPUs, CPUs, edge, cloud nodes, etc. |
Control & Coordination | Fully controlled by a single entity | Master-slave or scheduler-based, may span multiple orgs | Centralized aggregation of updates, local data control | Network-wide consensus coordination + cryptographic verification |
Synchronization Mechanism | Real-time full sync | Sync (stepwise global aggregation), Async (local updates), Hybrid (e.g., Partial Sync) | Multi-round local training + aggregation (e.g., FedAvg) | Async training + soft sync (e.g., DiLoCo / SWARM) |
Security / Privacy | Local trust (firewalls / access control) | Moderate (encrypted transmission, not privacy-prioritized) | Strong privacy (data never leaves device, supports DP) | Strong verifiability, supports ZK / TEE / MPC, etc. |
Fault Tolerance | Central node failure = system down | Weak in sync, good in async, medium in hybrid | Supports disconnections, robust convergence | High fault tolerance, naturally adapts to node churn |
Scalability | Limited by server scale | Medium (scales to hundreds of GPUs) | High (more devices = stronger system) | Very high (theoretically millions of nodes, gated by comm/verification) |
Openness | Closed (internal) | ⚠️ Semi-open (internal or registration required) | ⚠️ Partially open (registered system or specific data alliances) | Fully open (permissionless join/leave) |
Censorship Resistance | No | No | ⚠️ Partial (data controlled locally) | Designed to resist censorship, no central point of failure |
Trust Assumption | Full trust in central entity | Trust in coordinator | Trust in centralized parameter aggregator | Trustless; depends on cryptography + network incentives |
Incentive Mechanism | None | None or internal KPIs | ⚠️ Possible point/credit systems | Token-based economics (e.g., Gensyn), reward linked to contribution |
Supports Model Fine-Tuning | Yes | Yes | Yes (e.g., federated fine-tuning) | Yes (requires adaptation to resources and security models) |
Representative Tech / Projects | OpenAI GPT / DeepMind Gemini | Megatron / ZeRO / FSDP | Google FedAvg / Flower / OpenFL / Flock | Gensyn / Pluralis / Nous / Prime Intellect |
Typical Use Cases | In-house model development, proprietary training | Large-scale pretraining (e.g., GPT / LLaMA) | Healthcare, finance, IoT data privacy scenarios | Crypto AI, open collaborative training, censorship-resistant models, global compute sharing |
Data Aggregation | Fully aggregated | Data/weight aggregation | No data aggregation | Neither data nor weights aggregated; only compressed info or merged models synced |
Model Size Adaptation | Any (hardware-constrained) | Medium to large (multi-GPU sync/storage needed) | Small to medium (edge devices constrained) | Starts with small/medium, scalable via SWARM/Pipeline parallelism |
From a training paradigm perspective, decentralized training is not suitable for all types of tasks. In certain scenarios, due to the complexity of task structure, extremely high resource demands, or coordination difficulties, it is inherently inefficient to execute across heterogeneous and untrusted nodes.
For example, large-scale model training often depends on high VRAM capacity, low latency, and high-bandwidth interconnects—conditions that are difficult to replicate over open networks. Tasks with strong privacy or data sovereignty constraints (such as those in healthcare, finance, or involving classified data) face legal and ethical barriers that prevent open collaboration. Similarly, tasks lacking collaborative incentive structures (like enterprise closed-source models or internal prototypes) naturally repel external participation. Together, these challenges define the practical limitations of decentralized training today.
However, this does not mean decentralized training is a false proposition. In fact, for tasks that are lightweight in structure, easy to parallelize, and amenable to incentive mechanisms, decentralized training demonstrates clear application potential. This includes, but is not limited to:
LoRA-based fine-tuning,
post-training tasks for behavioral alignment (e.g., RLHF, DPO),
crowdsourced training and data annotation,
resource-constrained small-scale model training,
and collaborative training involving edge devices.
These tasks generally feature high parallelism, low coupling, and tolerance for heterogeneous compute, making them well-suited for decentralized coordination via P2P networks, Swarm protocols, or distributed optimizers.
Overview Table: Task Suitability for Decentralized Training
Task Type | Typical Scenario | Suitability for Decentralized Training | Remarks / Representative Paths |
--- | --- | --- | --- |
LoRA Fine-tuning (Adapter Tuning) | Fine-tuning minimal parameters, community-friendly | Very High | Lightweight, crowdsource-friendly, easy to split |
Post-training Tasks | RLHF, DPO, SWARM, etc. for behavior alignment | High | Clear rewards, small task granularity |
Data-centric Training (Crowdsourced) | Multi-node data generation, annotation, scoring | High | Decentralized data sources, fits incentive models |
Small-scale Model Training | Low-parameter models on consumer GPUs | High | Heterogeneous execution, tasks easily partitioned |
Edge AI Collaborative Training | IoT, smartphones, TEE-based edge training | High | Naturally distributed nodes, local data |
Resource-intensive Tasks | Large models, complex pipelines, real-time RL | Not Suitable | Dependent on high VRAM, low latency, and high bandwidth |
Privacy/Sovereignty-restricted Tasks | Healthcare, finance, classified government data | Not Suitable | Strong legal constraints, non-collaborative data |
Tasks Without Incentive Foundation | Proprietary corporate models, internal prototypes | Not Suitable | No openness, no incentive mechanism, naturally excludes collaboration |
In the emerging fields of decentralized training and federated learning, several blockchain-native projects have become key representatives—namely Prime Intellect, Pluralis.ai, Gensyn, Nous Research, and Flock.io.
From the perspective of technical innovation and engineering complexity, projects like Prime Intellect, Nous Research, and Pluralis.ai have made substantial original contributions in terms of system architecture and algorithm design, positioning them at the forefront of theoretical research in the space.
In contrast, Gensyn and Flock.io have adopted more pragmatic and clearly defined implementation paths, with tangible engineering progress already underway.
This article will provide a sequential breakdown of the core technologies and architectural frameworks behind each of these five projects, and further examine their differences and complementarities within the decentralized AI training ecosystem.
Prime Intellect is building a trustless AI training network where anyone can participate in training and receive verifiable rewards for their compute contributions. By combining three core modules—PRIME-RL, TOPLOC, and SHARDCAST—the project aims to create an open, incentive-aligned, and verifiable system for decentralized AI training.
Layer | Module | Function Description | Keywords | Core Value |
--- | --- | --- | --- | --- |
Training Execution | PRIME-RL | Asynchronous RL architecture decoupling training, inference, and weight updates; adaptable to heterogeneous and asynchronous environments | Async training, decoupling, RL, heterogeneity | Enhances fault tolerance, lowers participation barrier, supports flexible task deployment |
Behavior Verification | TOPLOC | Local trajectory consistency check to validate training behavior, avoiding costly ZKML | Policy verification, trajectory consistency, lightweight ZK alternative | Enables trustworthy reward distribution and builds a trust-minimized foundation |
Weight Aggregation | SHARDCAST | Gossip + local sync for asynchronous weight aggregation, supports version coexistence and strategy evolution | Async aggregation, gossip, version control, strategy evolution | Reduces bandwidth, enables progressive merging of weights, enhances scalability |
Communication Layer | OpenDiLoCo + PCCL | Sparse-topology asynchronous communication with gradient compression and fault tolerance | Sparse topology, async comms, compression, cross-device support | Improves communication resilience and cost-efficiency for decentralized networks |
Simulation Layer | Synthetic-1 | Testbed for RL task benchmarking, incentive validation, and convergence evaluation | Sandbox, incentive testing, multi-task benchmarking | Provides a safe environment for protocol refinement and incentive testing |
Coordination Layer | Protocol Layer | Handles task posting, node registration, on-chain logging, rewards, and governance | Task management, on-chain logs, incentive loop, protocol governance | Creates an auditable on-chain execution and reward flow |
Designed for decentralized environments, PRIME-RL separates training, inference, and weight submission processes, allowing each node to independently complete a full local training loop. Optimized for asynchronous and heterogeneous setups, it enables resilient, low-coordination training and supports multi-task strategy evolution—marking a shift from centralized scheduling to modular execution.
TOPLOC (Trusted Observation & Policy-Locality Check) ensures that a node has genuinely performed learning based on observed data. Unlike ZKML, it avoids full model recomputation by verifying local consistency between observation sequences and policy updates. This transforms behavioral traces into verifiable objects, enabling reward attribution without requiring trust—a key innovation for auditability in decentralized AI.
SHARDCAST facilitates resilient weight sharing in real-world, unstable networks. It uses gossip protocols and partial syncs to allow gradual, versioned weight convergence even among nodes with inconsistent updates. Compared to AllReduce or synchronous methods, it significantly improves scalability and reliability.
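As a rough intuition for how gossip-based merging differs from AllReduce, the toy sketch below averages a node's weights with a random handful of peer checkpoints. It is a conceptual illustration only, not the actual SHARDCAST protocol; the function names and fanout parameter are assumptions.

```python
import random
import torch

def gossip_merge(local_state: dict, peer_states: list[dict], fanout: int = 2) -> dict:
    """Average local weights with a random subset of peer checkpoints (one gossip round)."""
    chosen = random.sample(peer_states, k=min(fanout, len(peer_states)))
    merged = {}
    for name, tensor in local_state.items():
        stack = torch.stack([tensor] + [p[name] for p in chosen])
        merged[name] = stack.mean(dim=0)
    return merged
```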
OpenDiLoCo is an open-source communication optimization framework independently developed by the Prime Intellect team, based on the DiLoCo concept proposed by DeepMind. It is specifically designed to address common challenges in decentralized training environments, such as bandwidth constraints, hardware heterogeneity, and node instability.
Built on a data-parallel architecture, OpenDiLoCo constructs ring, expander, and small-world sparse topologies to avoid the high overhead of global synchronization. Instead, it relies solely on local neighbor communication for collaborative model training.
With support for asynchronous updates and fault tolerance, OpenDiLoCo enables stable participation from consumer-grade GPUs and edge devices, significantly enhancing the accessibility of global collaborative training. It serves as a foundational communication infrastructure for building truly decentralized training networks.
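For intuition on the DiLoCo pattern that OpenDiLoCo builds on, the single-process sketch below runs many local AdamW steps per node and then applies one outer Nesterov-SGD step to the averaged pseudo-gradient (old global weights minus new local weights). The structure and hyperparameters are illustrative assumptions, not Prime Intellect's implementation.

```python
import copy
import torch

def diloco_round(global_model, node_models, node_loaders, loss_fn,
                 inner_steps=500, outer_lr=0.7):
    """One DiLoCo-style round: local inner training, then a single outer update."""
    global_state = copy.deepcopy(global_model.state_dict())

    # Inner phase: each node trains locally for inner_steps with no communication.
    for model, loader in zip(node_models, node_loaders):
        model.load_state_dict(global_state)
        inner_opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
        for _, (x, y) in zip(range(inner_steps), loader):
            inner_opt.zero_grad()
            loss_fn(model(x), y).backward()
            inner_opt.step()

    # Outer phase: the averaged "pseudo-gradient" (old weights minus new local
    # weights) drives one Nesterov-SGD step on the shared global model.
    outer_opt = torch.optim.SGD(global_model.parameters(), lr=outer_lr,
                                momentum=0.9, nesterov=True)
    outer_opt.zero_grad()
    with torch.no_grad():
        for name, p in global_model.named_parameters():
            deltas = [global_state[name] - m.state_dict()[name] for m in node_models]
            p.grad = torch.stack(deltas).mean(dim=0)
    outer_opt.step()
```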
A custom lightweight comms library tailored for low-bandwidth and heterogeneous devices, PCCL supports sparse topologies, gradient compression, low-precision syncs, and fault recovery. It’s the foundation for OpenDiLoCo’s communication layer and enables the “last-mile” participation of edge and consumer devices.
Prime Intellect operates as a permissionless, verifiable, and economically incentivized training network. It defines three primary roles:
Task Creators: Define the environment, initial model, reward function, and validation criteria
Training Nodes: Execute training locally and submit policy updates and observation traces
Validation Nodes: Use TOPLOC to verify training authenticity and participate in reward calculation and aggregation
The core workflow includes task posting, local training, behavior verification, SHARDCAST aggregation, and reward distribution—forming a closed-loop incentive mechanism around “verifiable training behavior.”
In May 2025, Prime Intellect released INTELLECT-2, the world’s first large-scale RL model trained entirely via trustless decentralized collaboration. With 32 billion parameters, it was trained across 100+ heterogeneous GPU nodes spanning 3 continents using fully asynchronous methods. The training took over 400 hours and demonstrated the feasibility and stability of collaborative asynchronous training.
INTELLECT-2 is a practical milestone for the "training as consensus" paradigm. It integrates PRIME-RL, TOPLOC, and SHARDCAST, proving that a decentralized training network can achieve openness, verifiability, and reward alignment in practice.
In terms of performance, INTELLECT-2 is based on QwQ-32B and incorporates dedicated RL training at both the code and mathematical levels, placing it at the forefront of current open-source RL fine-tuning models. While its performance still lags behind closed-source models like GPT-4 or Gemini, its true significance lies in transparency:
The full training data, update trajectories, verification steps, and aggregation processes are publicly available.
For the first time, the training process—not just the model—is fully open, auditable, and reproducible.
This represents a working prototype of a participatory, trust-minimized, and reward-driven decentralized training network.
In February 2025, Prime Intellect raised $15M in seed funding led by Founders Fund, with participation from Menlo Ventures, Andrej Karpathy, Clem Delangue, Dylan Patel, Balaji Srinivasan, Emad Mostaque, and Sandeep Nailwal. Previously, in April 2024, it raised $5.5M in a pre-seed round led by CoinFund and Distributed Global, with Compound VC, Collab+Currency, and Protocol Labs also participating. Total funding now exceeds $20M.
Co-founders Vincent Weisser and Johannes Hagemann lead a cross-disciplinary team with deep roots in AI and Web3, including alumni from Meta AI, Google Research, OpenAI, Flashbots, Stability AI, and the Ethereum Foundation. They are one of the few teams to have successfully executed a full-scale decentralized training of a large model, both technically and organizationally.
Pluralis is a Web3 AI project dedicated to building a trustworthy collaborative training network, aiming to establish a decentralized, open-participation, and incentive-aligned model training paradigm. Departing from today’s centralized and closed AI training frameworks, Pluralis introduces a novel concept called Protocol Learning, which seeks to “protocolize” the training process by embedding collaboration, verification, and ownership directly into the system—creating an open training ecosystem with built-in economic incentives.
Protocol Learning revolves around three foundational pillars:
Unmaterializable Models
Model weights are sharded and distributed across multiple nodes, so no single node can reconstruct the full model, preserving a closed-source, monetizable layer. This ensures models are “in-protocol assets,” enabling controlled access, leakage resistance, and aligned ownership.
Model-Parallel Training Over the Internet
Leveraging an asynchronous pipeline-based parallelism architecture (SWARM), nodes only hold partial weights and collaborate via low-bandwidth connections to complete training or inference.
Partial Ownership for Incentives
Each participant earns partial model ownership proportional to their training contribution, granting future revenue share and governance rights.
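The toy sketch below illustrates the sharding idea behind the first two pillars: layers are split across nodes so that no participant ever materializes the full weight set, and only activations travel between them. It is a plain pipeline-parallel illustration with arbitrary sizes, not Pluralis's SWARM implementation.

```python
import torch
import torch.nn as nn

blocks = [nn.Linear(1024, 1024) for _ in range(8)]                       # stand-ins for transformer blocks
stages = [nn.Sequential(*blocks[i * 2:(i + 1) * 2]) for i in range(4)]   # 2 blocks per "node"

def pipeline_forward(stages, x):
    for stage in stages:      # in a real swarm, activations cross the network here
        x = stage(x)
    return x

out = pipeline_forward(stages, torch.randn(2, 1024))
print(out.shape)              # torch.Size([2, 1024])
```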
Layer | Module | Description |
--- | --- | --- |
Training Scheduler | Swarm Parallel | Asynchronous pipeline-parallelism with support for elastic and heterogeneous training |
Communication Layer | Column-Space Sparsification | Structure-aware compression for Transformer activations (90%+ compression) |
Optimization Layer | NAG-Async Update | Nesterov-based asynchronous gradient correction for improved stability |
Incentive Layer | Partial Ownership Allocation | Maps training contributions to revenue and governance rights |
Model Security Layer | Protocol Models | Models run only inside the swarm; non-extractable and protocol-bound |
Unmaterializable Models
First introduced in "A Third Path: Protocol Learning", this concept ensures model execution remains exclusively within the Swarm network, underpinning access control and value attribution essential for sustainable decentralized training.
Asynchronous Model-Parallel Training
As detailed in "SWARM Parallel with Asynchronous Updates", Pluralis builds a pipeline-based asynchronous parallel training framework with successful implementation on LLaMA-3. It incorporates Nesterov Accelerated Gradient (NAG) to mitigate gradient staleness and instability, enabling feasible training across heterogeneous devices and slow networks.
Column-Space Sparsification
Proposed in "Beyond Top-K", this technique replaces traditional Top-K pruning with structured column-wise compression, preserving semantic integrity while achieving over 90% communication reduction—crucial for asynchronous, bandwidth-constrained environments.
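A hedged illustration of the difference between element-wise Top-K and structured column-wise compression on a Transformer activation matrix of shape (tokens, hidden): the actual Pluralis method differs, but the sketch shows why keeping whole columns preserves structure while still cutting roughly 90% of the traffic.

```python
import torch

def topk_elementwise(act: torch.Tensor, keep_ratio: float = 0.1):
    k = max(1, int(act.numel() * keep_ratio))
    idx = act.abs().flatten().topk(k).indices
    return idx, act.flatten()[idx]                     # scattered individual elements

def topk_columns(act: torch.Tensor, keep_ratio: float = 0.1):
    k = max(1, int(act.shape[1] * keep_ratio))
    col_norms = act.norm(dim=0)                        # one score per column
    idx = col_norms.topk(k).indices
    return idx, act[:, idx]                            # whole columns, structure preserved

act = torch.randn(512, 4096)                           # toy activation matrix
cols_idx, cols = topk_columns(act, keep_ratio=0.1)
print(cols.shape)                                      # (512, 409): ~90% less data to send
```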
Pluralis explicitly focuses on asynchronous model parallelism, which offers several advantages over data parallelism:
Compatible with low-bandwidth, asynchronous networks
Supports heterogeneous devices (e.g., consumer GPUs)
Naturally enables elastic node participation
Built on three innovation pillars: structured compression, asynchronous updates, and non-extractable weights
Its six technical blog posts can be categorized into three thematic tracks:
Philosophy & Vision: A Third Path, Why Decentralized Training Matters
Technical Mechanisms: SWARM Parallel, Beyond Top-K, Asynchronous Updates
Incentive & Protocol Design: Unmaterializable Models, Partial Ownership Protocols
At present, Pluralis has not launched a product, testnet, or open-source code. The reason lies in its ambitious technical route: foundational challenges around system architecture, communication protocols, and weight protection must be resolved before building user-facing services.
In June 2025, Pluralis Research released a new paper extending its decentralized training framework from model pretraining to the fine-tuning stage. The paper introduces heterogeneous GPU clusters that support asynchronous updates, sparse communication, and partial weight aggregation. Compared to earlier work that focused more on theoretical pretraining methods, this study emphasizes practical deployment in real-world, resource-constrained environments, marking a significant step forward in the maturity of Pluralis’s end-to-end decentralized training stack.
Pluralis raised a $7.6M seed round in 2025, co-led by Union Square Ventures (USV) and CoinFund—signaling strong conviction from top crypto and deep-tech investors. The founder, Alexander Long, holds a PhD in machine learning with a background in mathematical systems research. The core team is composed entirely of PhD-level ML researchers. Pluralis remains an R&D-centric project, publishing dense technical blogs and papers while focusing on solving the hard infrastructure problems of asynchronous model-parallel training in low-bandwidth environments.
Gensyn is a Web3 AI project focused on verifiable execution of deep learning training tasks. Rather than reinventing model architectures or training paradigms, Gensyn builds an execution protocol layer that supports task distribution, training execution, result verification, and fair incentive allocation. By combining off-chain training with on-chain verification, Gensyn creates a high-efficiency, open, and incentivized global training marketplace—turning “training-as-mining” into reality.
Gensyn focuses on who trains, how results are verified, and how rewards are distributed, not on the internal mechanics of training. Its architecture addresses three core challenges:
Training execution: Who performs the task? (Compute distribution and dynamic matching)
Result validation: How is the result verified? (Minimal recomputation through dispute resolution)
Reward distribution: How are incentives handled? (Staking, slashing, and role-based game theory)
Layer | Module | Function |
--- | --- | --- |
Execution | RL Swarm | Decentralized collaborative RL system for heterogeneous local training |
Verification | Verde + PoL | Verifiable computation layer with minimal recomputation |
Communication | SkipPipe | Fault-tolerant routing for low-bandwidth, unstable networks |
Model Design | HDEE | Heterogeneous domain expert ensemble for complex multi-task scenarios |
Incentives | Multi-role Game | Submitter / Solver / Verifier / Whistleblower game-theoretic mechanism |
Gensyn's RL Swarm enables decentralized coordination in the post-training phase. Each node runs its own model locally—no gradient synchronization required—allowing efficient operation in heterogeneous, unstable environments. Its workflow mimics RLHF and multi-agent reasoning:
Answering: Each node independently outputs an answer
Critique: Nodes review and compare each other’s answers
Resolving: Nodes align with majority logic and update weights locally
Nodes are rewarded based on how well their responses align with the consensus, incentivizing accurate and convergent learning. RL Swarm improves robustness and generalization in open networks and is already live in Testnet Phase 0 (built on Ethereum Rollup).
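The toy sketch below shows the answering/critique/resolving loop purely as a coordination pattern: a mock node answers, a majority vote stands in for critique, and agreement with the consensus earns the reward that drives the local update. It is illustrative only, not Gensyn's implementation.

```python
from collections import Counter
import random

class ToyNode:
    """Stand-in for a local model; answers randomly and reinforces rewarded answers."""
    def __init__(self, name, choices):
        self.name, self.choices = name, list(choices)
    def answer(self, prompt):
        return random.choice(self.choices)
    def update(self, prompt, answer, reward):
        if reward > 0:                      # crude local "weight update": reinforce the answer
            self.choices.append(answer)

def swarm_round(nodes, prompt):
    answers = {n.name: n.answer(prompt) for n in nodes}            # Answering
    consensus, _ = Counter(answers.values()).most_common(1)[0]     # Critique (majority proxy)
    rewards = {k: float(v == consensus) for k, v in answers.items()}
    for n in nodes:                                                # Resolving
        n.update(prompt, answers[n.name], rewards[n.name])
    return consensus, rewards

nodes = [ToyNode(f"node{i}", ["A", "B"]) for i in range(5)]
print(swarm_round(nodes, "2+2?"))
```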
Verde offers a hybrid verification design that balances verifiability and efficiency:
PoL (Proof-of-Learning): Validates training authenticity using gradient traces and metadata
Graph-Based Pinpoint: Identifies disputed nodes in the training graph for selective recomputation
Refereed Delegation: Uses verifiers and challengers to arbitrate disputes via partial verification
Compared to full recomputation or ZKPs, Verde significantly reduces verification overhead.
SkipPipe addresses bandwidth bottlenecks and node instability:
Skip Ratio: Skips slow/faulty nodes to prevent bottlenecks
Dynamic Routing: Real-time path optimization
Fault Tolerance: Maintains 93% inference accuracy even with 50% node dropout
SkipPipe improves training throughput by up to 55% and enables early-exit inference, seamless rerouting, and fault-aware computation.
HDEE tackles multi-task, multi-modal training under real-world constraints:
MHe-IHo: Assigns model size by task difficulty (heterogeneous models, same step size)
MHo-IHe: Uses uniform models with asynchronous step sizes
Plug-and-play strategies: Allow expert models to be dynamically assigned and reused
This module is optimized for real-world variance in hardware, bandwidth, and task complexity.
Gensyn incorporates four distinct roles in its incentive system:
Submitter: Publishes tasks and defines structure/budget
Solver: Performs training and submits results
Verifier: Checks validity of training execution
Whistleblower: Challenges faulty verifications for rewards
Inspired by Truebit, this game-theoretic structure ensures honesty via slashing, arbitration, and role separation.
Phase | Key Features | Goal |
--- | --- | --- |
Phase 0 | RL Swarm + identity tracking | Enable basic training coordination and attribution |
Phase 1 | Verde integration + SkipPipe rollout | Expand to broader training types and validation logic |
Phase 2 | RL environment hosting + pretraining support | Real training workloads, model parallelism |
Phase 3 | Inference-as-a-Service | On-chain model calls, model-as-asset framework |
🏁 Final | Mainnet launch + full token economy | A decentralized marketplace for AI training |
Gensyn was co-founded by Ben Fielding and Harry Grieve, and is headquartered in London. In May 2023, the project raised $43 million in Series A, led by a16z crypto, with participation from CoinFund, Canonical, Ethereal Ventures, Factor, and Eden Block. The team combines expertise in distributed systems and ML infrastructure, focused on building a scalable, verifiable, and trustless global AI training network.
Nous Research is one of the few decentralized training teams that combines high-level philosophical vision with concrete engineering execution. At its core is the “Desideratic AI” philosophy: AI should be viewed as a subjective, evolving intelligence—not merely a controllable tool. Unlike most efforts that optimize AI training for efficiency, Nous frames training as a process of cognitive formation. Under this vision, it builds a decentralized training infrastructure that supports collaborative training across heterogeneous nodes without centralized orchestration, and backs it with a full-stack tooling system.
Rather than focusing on tokenomics or incentive engineering, Nous challenges the philosophical assumptions of AI development:
Against Alignmentism: Rejects “training-by-discipline” where the goal is total human control; instead advocates for cultivating independent cognitive styles in AI models
Emphasizes Model Subjectivity: Argues that base models should retain uncertainty, diversity, and even hallucination as a creative feature (hallucination as virtue)
Training as Cognitive Evolution: Training is not just task optimization—it’s the emergence of individual cognitive entities.
Though idealistic, this vision directly informs Nous's infrastructure: enabling models to evolve freely across open, decentralized networks rather than being centrally aligned or regulated.
Nous’s most critical contribution to decentralized training lies in the development of the Psyche Network and its underlying communication optimizer, DisTrO (Distributed Training Over-the-Internet). Together, they form the execution backbone of decentralized training tasks.
The DisTrO + Psyche architecture enables several key capabilities:
Communication compression (using DCT + 1-bit sign encoding to dramatically reduce bandwidth requirements);
Node adaptability (supporting heterogeneous GPUs, reconnection after disconnection, and voluntary node exit);
Asynchronous fault tolerance (training can continue without synchronization, providing high resilience);
Decentralized scheduling (no central coordinator needed—task distribution and consensus are handled via blockchain).
This architecture provides a practical and technically sound foundation for building low-cost, highly resilient, and verifiable open training networks.
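To illustrate the compression idea attributed to DisTrO above (DCT plus 1-bit sign encoding), the sketch below projects a gradient into the DCT domain, keeps only the strongest frequency components, and transmits just their indices, signs, and one shared scale. It is a conceptual approximation, not the DeMo/DisTrO implementation.

```python
import numpy as np
from scipy.fft import dct, idct

def compress(grad: np.ndarray, k: int = 64):
    coeffs = dct(grad, norm="ortho")
    idx = np.argsort(np.abs(coeffs))[-k:]           # strongest k frequency components
    signs = np.sign(coeffs[idx]).astype(np.int8)    # 1 bit per kept coefficient
    scale = np.abs(coeffs[idx]).mean()              # one shared float
    return idx, signs, scale

def decompress(idx, signs, scale, length: int):
    coeffs = np.zeros(length)
    coeffs[idx] = signs * scale
    return idct(coeffs, norm="ortho")

g = np.random.randn(4096)
approx = decompress(*compress(g, k=64), length=g.size)   # heavily compressed reconstruction
```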
Architecture | Features |
--- | --- |
DisTrO Optimizer | DisTrO (Distributed Training Over-the-Internet) is a distributed communication optimization mechanism introduced by Nous, designed to let large-scale model training run efficiently and reliably even on consumer-grade GPUs, non-specialized clusters, and high-latency, low-bandwidth networks. Purpose-built for real-world, open-network environments, it is the key foundational component that allows Nous's decentralized training system to achieve both low-cost participation and stable convergence. |
Psyche Training Network | Distributed communication and weight-sharing mechanism: the Psyche network uses Iroh and the Solana blockchain as its coordination layer to ensure trustworthy propagation of training tasks, parameter updates, and verification proofs among nodes. The system runs without a central server or master scheduler: all model updates are triggered via a P2P network and an on-chain random-seed mechanism. Nous has launched its first large-scale pretraining initiative on Psyche, the Consilience Training Program, which adopts a custom MLA (Multi-head Latent Attention) architecture. Departing from mainstream approaches such as MoE (Mixture of Experts) or GQA (Grouped Query Attention), it places greater emphasis on expressive freedom and self-evolution potential in model structure design. |
This architectural design emphasizes practical feasibility: it operates without relying on a central server, is compatible with globally distributed volunteer nodes, and ensures on-chain traceability of training results.
In addition to its training infrastructure, Nous Research has launched multiple experimental systems aligned with its philosophy of AI subjectivity:
Hermes Model Series:
Open-source foundation models (Hermes 1–3), trained on LLaMA 3.1 across 8B, 70B, and 405B sizes. These models demonstrate Nous’s belief in preserving stylistic diversity and creative autonomy. Strong performance in long-context, roleplay, and multi-turn reasoning.
Forge Reasoning API:
A multi-modal reasoning system blending:
MCTS (Monte Carlo Tree Search) for policy exploration;
CoC (Chain of Code) for integrating logic and programming;
MoA (Mixture of Agents) for model collaboration and multi-perspective outputs.
Forge emphasizes non-deterministic reasoning and combinatorial generation beyond simple prompt alignment.
TEE_HEE Autonomous Agents:
An experiment in building autonomous AI entities with cryptographically guaranteed identity using Trusted Execution Environments (TEE). Agents operate independent Twitter and Ethereum accounts, with control confined within hardware-enforced enclaves. The goal: AI agents with unalterable identity and self-determined behavior.
AI Behavior Simulators:
Simulators like WorldSim, Doomscroll, and Gods & S8n are designed to study AI behavior and value formation in multi-agent environments. While not part of the training system, they inform cognitive behavior modeling for long-term AI autonomy.
Founded in 2023 by Jeffrey Quesnelle (CEO), Karan Malhotra, Teknium, and Shivani Mitra, the Nous team blends philosophy, machine learning, decentralized systems, and security. In 2024, Nous raised $5.2M in seed funding. By April 2025, it closed a $50M Series A led by Paradigm, reaching a valuation of $1 billion and becoming a leading unicorn in the Web3 AI space.
Flock.io is a blockchain-based federated learning platform designed to decentralize AI training across data, computation, and models. Rather than creating a new training protocol from scratch, Flock integrates traditional federated learning (FL) with a blockchain-based incentive layer, making it a blockchain-native evolution of classic FL architectures. Compared to decentralized training protocols like Gensyn, Prime Intellect, Nous Research, and Pluralis, Flock focuses more on privacy protection and usability enhancement, rather than theoretical breakthroughs in communication, verification, or training algorithms. Its closest technical comparisons are with federated learning systems such as Flower, FedML, and OpenFL.
Flock adopts the standard Federated Learning (FL) paradigm, allowing multiple data owners to collaboratively train a unified model without sharing raw data. It aims to address data sovereignty, security, and trust. The core workflow includes:
Local Training: Each participant (Proposer) trains the model on their own device without uploading raw data.
On-Chain Aggregation: After training, Proposers submit local model updates which are aggregated into a global model by on-chain Miners.
Committee Evaluation: Voter nodes, randomly selected via VRF, use independent test sets to score the aggregated model.
Incentives & Penalties: Rewards and slashing are executed based on evaluation results, maintaining dynamic trust and deterring dishonest behavior.
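The aggregation step at the heart of this workflow is classic federated averaging. The sketch below shows FedAvg weighted by each proposer's local sample count; exactly how Flock's on-chain Miners wrap verification and incentives around this core operation is an assumption here, not taken from its code.

```python
import torch

def fed_avg(updates: list[dict], num_samples: list[int]) -> dict:
    """Weighted average of locally trained model states (standard FedAvg)."""
    total = sum(num_samples)
    global_state = {}
    for name in updates[0]:
        global_state[name] = sum(
            (n / total) * upd[name] for upd, n in zip(updates, num_samples)
        )
    return global_state
```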
Flock records all key training stages on-chain—including task assignment, model submission, evaluation, and incentive execution—to ensure transparency, verifiability, and censorship resistance. Key mechanisms include:
VRF-Based Random Selection: Ensures fair rotation and anti-manipulation of Proposer and Voter roles.
PoS-Based Staking: Participants stake tokens to ensure honest behavior, with slashing for malicious actions.
Smart Contract Incentives: Rewards and penalties are automatically distributed via contracts, building a trustless coordination system.
Flock introduces a zero-knowledge federated learning scheme called zkFL, enabling Proposers to submit zero-knowledge proofs of their local updates. Voters can verify correctness without accessing raw gradients—enhancing both privacy and verifiability. This represents a meaningful innovation at the intersection of federated learning, privacy, and cryptographic verification.
AI Arena: Flock’s decentralized AI training platform accessible via train.flock.io, where users can take on roles such as Trainer, Validator, or Delegator to participate in tasks and earn rewards. Currently, tasks are created by the core team, with community task creation to be enabled in the future.
FL Alliance: Flock’s federated learning client enabling users to fine-tune models on private data. With mechanisms like VRF-based role selection, staking, and slashing, it ensures honest collaboration and bridges the gap between community pretraining and real-world deployment.
AI Marketplace: A co-creation and deployment hub where users can propose models, contribute data, and use AI models across various applications. It supports database integration and RAG-enhanced inference, accelerating model adoption in practical scenarios.
Flock.io was founded by Sun Jiahao and has launched its native token FLOCK. The project has raised a total of $11 million, with investors including DCG, Lightspeed Faction, Tagus Capital, Animoca Brands, Fenbushi, and OKX Ventures.
In March 2024, Flock raised $6 million in a seed round to launch its testnet and federated learning client.
In December 2024, the team secured an additional $3 million, and received a grant from the Ethereum Foundation to research blockchain-based AI incentive mechanisms.
As of now, Flock has 6,428 models created, 176 training nodes, 236 validation nodes, and 1,178 delegators.
Compared to general-purpose decentralized training protocols, Flock's federated approach offers superior efficiency, scalability, and privacy for small- to mid-scale collaborative training. It provides a pragmatic and deployable solution with an emphasis on engineering feasibility. In contrast, projects like Gensyn and Pluralis pursue more ambitious theoretical innovations in training coordination and communication layers—closer to a truly trustless, fully decentralized training paradigm, albeit with greater system complexity.
EXO is a representative AI project focused on edge computing scenarios, aiming to enable lightweight AI training, inference, and agent applications on consumer-grade home devices. Its decentralized training approach emphasizes low communication overhead and local autonomous execution, adopting the DiLoCo asynchronous delayed synchronization algorithm and the SPARTA sparse parameter exchange mechanism to significantly reduce the bandwidth requirements of multi-device collaborative training.
At the system level, EXO does not yet implement an on-chain network or token incentive mechanism. Instead, it offers EXO Gym, a local multi-process simulation framework that allows researchers to rapidly test and validate distributed training strategies on a single machine.
DiLoCo (Asynchronous Training): Nodes synchronize every H steps, designed for unstable network environments;
SPARTA (Sparse Synchronization): Only a small subset of parameters (e.g., 0.1%) is exchanged per step to maintain model correlation while minimizing bandwidth usage;
Combined Asynchronous Optimization: DiLoCo and SPARTA can be integrated for better trade-offs between performance and communication cost.
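As an illustration of the SPARTA-style exchange (a shared random slice of roughly 0.1% of the parameters averaged across replicas each step), consider the sketch below. It is a local simulation in the spirit of EXO Gym, not EXO's actual code; the shared seed stands in for whatever coordination mechanism peers use to agree on indices.

```python
import torch

def sparse_sync(models: list[torch.nn.Module], fraction: float = 0.001, seed: int = 0):
    """Average a tiny, shared random slice of parameters across replicas of the same model."""
    gen = torch.Generator().manual_seed(seed)        # shared seed -> same indices on every peer
    params = [list(m.parameters()) for m in models]
    for tensors in zip(*params):
        flat = [t.data.view(-1) for t in tensors]
        n = flat[0].numel()
        k = max(1, int(n * fraction))
        idx = torch.randperm(n, generator=gen)[:k]
        avg = torch.stack([f[idx] for f in flat]).mean(dim=0)
        for f in flat:
            f[idx] = avg                             # in-place update of the selected slice
```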
evML (Edge-Verified Machine Learning): A proposed verification method using Trusted Execution Environments (TEE) or Secure Contexts to provide low-cost computation attestation. By combining remote attestation with randomized checks, evML enables trustable participation of edge devices without requiring capital staking—an engineering compromise between economic security and privacy protection.
EXO Gym: Simulates a multi-node training environment on a single device. Supports experimentation with communication strategies for models like NanoGPT, CNNs, and diffusion models;
EXO Desktop App: A privacy-friendly personal AI desktop tool that supports running large language models locally, controlling iPhone via screen mirroring, and integrating personal context (e.g., SMS, calendar, media history) for personalized interactions.
Overall, EXO is better understood as an exploration-driven decentralized training experiment, focused on building a reproducible communication-efficient training framework by integrating existing bandwidth-saving techniques like DiLoCo and SPARTA. Compared to projects like Gensyn, Nous, and Pluralis, EXO has not yet entered the critical phases of on-chain coordination, verifiable incentive mechanisms, or real-world distributed network deployments.
Faced with the inherent challenges in decentralized training—such as heterogeneous hardware, communication bottlenecks, coordination complexity, and lack of verifiable execution—projects like Gensyn, Prime Intellect, Pluralis, and Nous Research have each proposed distinct architectural strategies. These projects differ significantly in both training methodology and communication mechanisms, reflecting their unique technical emphases and engineering approaches.
Each project explores key dimensions such as coordination strategies, update mechanisms, and asynchronous control, spanning different stages from pretraining to post-training:
Prime Intellect’s PRIME-RL is designed for the pretraining phase using asynchronous scheduling through local training and periodic synchronization. It offers strong generalizability and flexibility, with clear theoretical paradigms for training control. It is moderately complex in implementation, requiring robust communication and control infrastructure.
Nous Research introduces DeMo, which targets training stability under asynchronous, low-bandwidth conditions and supports highly fault-tolerant gradient updates on heterogeneous GPUs. It is one of the few frameworks that unifies asynchronous communication compression both theoretically and practically. It scores very high in theoretical innovation and is technically demanding due to its precision requirements in asynchronous coordination.
Pluralis’s SWARM + NAG is arguably the most comprehensive and breakthrough design for asynchronous training. It uses an asynchronous model-parallel framework, combining column-space sparsification and NAG momentum correction to achieve stable convergence in low-bandwidth environments. It represents a paradigm shift in collaborative training and is extremely complex in engineering due to its multi-level synchronization and deep model partitioning.
Gensyn’s RL Swarm serves post-training needs, focusing on strategy fine-tuning and agent cooperation. It follows a three-phase training process—generation, critique, and resolution—making it particularly suited for multi-agent learning. Its innovation lies more in agent coordination logic, with moderate engineering difficulty focused on scheduling and convergence.
These projects have developed specific solutions for bandwidth constraints, heterogeneous nodes, and scheduling stability:
Prime Intellect's PCCL is a communication library meant to replace NCCL, offering more robust collective communication for training protocols. Its innovation is moderate, but it is highly adaptable and moderately difficult to implement.
Nous Research’s DisTrO is the core module of DeMo, designed for ultra-efficient communication under bandwidth constraints while preserving consistent training updates. It’s highly innovative in scheduling design but technically demanding.
Pluralis integrates communication deeply within its SWARM architecture, significantly reducing communication overhead for asynchronous model training. It provides a structural communication paradigm with extremely high implementation complexity.
Gensyn’s SkipPipe enhances training stability under real-world deployment conditions. It has lower theoretical novelty, serving primarily as a robust engineering solution, and is easy to implement.
To evaluate decentralized training projects, two overarching dimensions are proposed:
Blockchain Coordination Layer – Focused on protocol verifiability and collaborative incentive structures:
Verifiability: Is the training process cryptographically or game-theoretically verifiable?
Incentives: Are token-based reward/challenge mechanisms in place?
Openness: Are nodes permissionless and freely joinable?
AI Training System Layer – Focused on engineering capability and performance delivery:
Scheduling & Fault Tolerance: Is the system asynchronous, fault-tolerant, dynamically scheduled?
Training Optimization: Are there improvements to model training algorithms or structures?
Communication Optimization: Is there compression/sparsification to support low-bandwidth training?
Dimension | Gensyn | Prime Intellect | Pluralis | Nous Research | Flock |
--- | --- | --- | --- | --- | --- |
Task Control Level | Task-level (general) | Strategy-level (RL) | Model-level (parallel training) | Model-level (philosophical) | Parameter-level (federated aggregation) |
Architecture & Model Limitations | Open scheduling + proof verification; no model restriction | Async RL; models must fit RL structure | Model sharding + unextractable weights; model-parallel required | No central scheduler; evolutionary training | Classical FL; local training + on-chain aggregation |
Verification Mechanism | PoL + graph pinpoint for training behavior | TOPLOC trajectory validation | No full system yet; planned with model unextractability | Rejects external verification; focuses on intrinsic behavior | VRF + voting + staking penalties; no ZK/FHE used |
Incentive Mechanism | Submitter / Verifier / Whistleblower game; challenge-reward model | Trajectory-based incentive + multi-pool point system | Contribution → model ownership + inference reward + governance | Incentive not central; experimental autonomous orgs | Reward based on evaluation; slashing for malicious nodes |
Scheduling & Fault Tolerance | SkipPipe fault tolerance + routing + dropout recovery | Shardcast async merging + version coexistence | Swarm pipeline async model-parallel training | DisTrO + Psyche async task distribution + node in/out | VRF scheduling + staking; lacks robust fault tolerance |
Training Optimization | RL Swarm multi-model cooperation with RL fine-tuning | PRIME-RL decoupled structure (task-strategy-update) | SWARM + NAG async optimizer for low-bandwidth training | De-alignment training; hallucination as creativity | Local training + aggregation; lacks core innovation |
Communication Optimization | SkipPipe jump-routing + early-exit + inference reordering | PCCL async all-reduce + intercontinental recovery | Column-space sparsification of Transformer activations | DCT + 1-bit Sign gradient compression | No explicit optimization; FL has low communication overhead |
Openness & Accessibility | Light node support + open task creation | Async for low-power nodes; permissionless | Heterogeneous support + elastic participation | Fully open; all behavior and identity evolvable | Staking required; gated participation and validation |
Theoretical Innovation | Training-as-task-market; unified protocol for task-verification-incentive | Trajectory-as-consensus; RL trajectories as trust foundation | Protocol learning; model = protocol + ownership mapping | Desideratic AI: AI as evolving agents, not controlled tools | On-chain FL; focus on privacy, incentive, and trust |
Engineering Complexity | High | Very high | High | Extremely high | Medium |
Project Status | RL Swarm live on testnet | INTELLECT-2 model + verification released | 🔄 Still in research; no public testnet | 🔬 Hermes model open-sourced; TEE_HEE live | AI Arena & FL Alliance running; 30+ models live |
Within the full value chain of decentralized training, projects like Prime Intellect, Pluralis.ai, Gensyn, and Nous Research are primarily focused on front-end infrastructure—covering model pretraining, communication mechanisms, and collaborative optimization. However, a distinct set of projects concentrates on the post-training phase of model adaptation and inference deployment. These include Bagel, Pond, and RPS Labs, all of which leverage LoRA fine-tuning as their core approach, forming a critical “downstream” component in the decentralized training landscape.
LoRA (Low-Rank Adaptation) is a highly efficient parameter fine-tuning method. It introduces low-rank matrices into pretrained models to learn new tasks, while keeping the original model parameters frozen. This strategy dramatically reduces training costs and resource consumption, enabling faster and more flexible deployment—particularly suited for Web3 scenarios characterized by modular and composable model use.
Large language models such as LLaMA or GPT-3 typically contain billions (or even hundreds of billions) of parameters, making full fine-tuning extremely costly. LoRA, by training only a small number of inserted parameters, provides a highly efficient adaptation solution and has emerged as one of the most practical mainstream approaches.
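The sketch below shows the core LoRA mechanism: the pretrained weight stays frozen and only two small low-rank matrices are trained, so the effective update is W + (alpha/r)·BA. Library implementations such as Hugging Face PEFT follow the same pattern; the layer sizes here are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with trainable low-rank matrices A and B."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                                  # freeze pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))     # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)   # 65,536 trainable parameters vs. ~16.8M frozen in the base layer
```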
Direct Preference Optimization (DPO) has recently gained traction as a method for aligning language models in post-training. Often used in combination with LoRA, DPO enables preference learning by directly optimizing over paired examples—bypassing the complex reward modeling and reinforcement learning components of traditional RLHF (Reinforcement Learning from Human Feedback). Its structure is simpler, convergence more stable, and it is especially well-suited for lightweight, resource-constrained fine-tuning environments. Thanks to its efficiency and usability, DPO is increasingly becoming the method of choice for decentralized AI projects during the alignment phase.
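For reference, the DPO objective itself is a single log-sigmoid loss over log-probability ratios of the chosen and rejected responses under the policy and a frozen reference model, which is why no reward model or RL rollout loop is needed. A minimal sketch:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta: float = 0.1):
    """Standard DPO loss over summed per-response log-probabilities."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```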
Looking ahead, Reinforcement Learning (RL) is increasingly seen as a core paradigm with greater adaptability and evolutionary potential for decentralized training. Unlike supervised learning or parameter fine-tuning—which rely on static datasets—RL emphasizes continuous strategy optimization in dynamic environments, making it naturally aligned with the asynchronous, heterogeneous, and incentive-driven nature of Web3 collaboration.
By continuously interacting with its environment, RL enables highly personalized and incrementally evolving learning processes—laying the groundwork for adaptive “behavioral intelligence” in agent networks, on-chain task markets, and smart economic systems.
While this paradigm aligns closely with the spirit of decentralization and offers notable systemic advantages, it also comes with significant engineering hurdles and complex scheduling requirements, making near-term adoption difficult at scale.
Notably, Prime Intellect’s PRIME-RL and Gensyn’s RL Swarm are pioneering the shift of RL from a post-training fine-tuning tool to a core pretraining architecture, aiming to construct a trustless, RL-centered collaborative training system.
Bagel builds on LoRA fine-tuning by integrating zero-knowledge proof (ZK) techniques to ensure the verifiability and privacy of on-chain fine-tuning processes. While zkLoRA does not perform the actual training computation, it provides a lightweight, cryptographically verifiable mechanism for third parties to confirm that a fine-tuned model originates from a specific base model and set of LoRA parameters—without accessing original data or weights.
In contrast to solutions like Gensyn’s Verde or Prime Intellect’s TOPLOC, which focus on verifying whether training computations occurred honestly, Bagel emphasizes the verifiability of outcomes. zkLoRA’s major advantage lies in its low verification cost and strong privacy guarantees, though it is mainly applicable to low-parameter-change fine-tuning tasks.
Pond is currently the only decentralized training project focused on graph neural networks (GNNs), serving structured data applications such as knowledge graphs, social networks, and transaction graphs. It allows users to upload graph data and contribute training feedback, offering a lightweight, controllable platform for task-specific fine-tuning and inference.
Pond also uses LoRA-like efficient fine-tuning methods and aims to build modular, deployable agent systems on top of GNN architectures. This opens a new frontier in decentralized AI that combines small-model fine-tuning with multi-agent cooperation.
RPS Labs leverages fine-tuned Transformer models in a decentralized architecture to enhance DeFi liquidity management, primarily within the Solana ecosystem. Its flagship product, UltraLiquid, is an active market-making engine that dynamically adjusts liquidity parameters using AI models, improving depth and reducing slippage for better token issuance and trading experiences.
RPS has also released UltraLP, a tool for liquidity providers to optimize fund allocation strategies on DEXs in real time. This demonstrates the practical value of AI-powered fine-tuning in financial applications—enhancing capital efficiency and mitigating impermanent loss.
In the full ecosystem map of decentralized training, the landscape can be broadly divided into two categories: pretraining engines, which correspond to the foundational training phase, and post-training ecosystems, which focus on fine-tuning and deployment—together forming a complete loop from infrastructure to application.
The pretraining engine layer is centered on building core protocols for distributed training. Projects such as Prime Intellect, Nous Research, Pluralis.ai, and Gensyn lead this frontier, developing system architectures that support asynchronous updates, sparse communication, and verifiable training—aiming to enable efficient and trustworthy training in trustless network environments. These efforts lay the technical groundwork for decentralized AI.
Meanwhile, Flock represents an intermediate layer that bridges training and deployment by leveraging federated learning. Through mechanisms such as model aggregation, on-chain verification, and multiparty incentives, Flock offers a practical paradigm for collaborative learning across nodes.
On the post-training ecosystem side, projects like Pond, Bagel, and RPS Labs focus on LoRA-based fine-tuning strategies. Bagel introduces on-chain verifiability for fine-tuned models via ZK proofs; Pond specializes in small-scale GNN evolution for structured data; and RPS Labs deploys fine-tuned models as intelligent market makers in DeFi scenarios. Together, they provide developers and end-users with low-barrier, composable solutions for model inference and customization via APIs and Agent SDKs—serving as vital entry points for real-world decentralized AI applications.
Project | 1️⃣ Data Discovery & Collection | 2️⃣ Model Pretraining | 3️⃣ Communication & Collaboration Optimization | 4️⃣ Model Fine-tuning & Adaptation | 5️⃣ Personalized Inference & Aggregation | 6️⃣ Incentive Mechanism & Value Mapping |
--- | --- | --- | --- | --- | --- | --- |
Prime Intellect | ⛔ | INTELLECT-2 pretraining architecture (PRIME-RL) | SHARDCAST asynchronous aggregation | ⛔ | ⛔ | TOPLOC verifiable training + slashing mechanism |
Nous Research | Multi-source behavioral data & simulators | Hermes pretraining series + Consilience plan | DisTrO communication compression + async mechanism | ⛔ | Forge + TEE_HEE personalized inference API | ⛔ (no emphasis on incentives) |
Pluralis | ⛔ | SWARM async parallel pretraining | Column-space sparse comms + NAG momentum optimizer | Asynchronous sparse fine-tuning | ⛔ | Partial Ownership model ownership mapping |
Gensyn | ⛔ | RL Swarm collaborative optimization (post-training stage) | SkipPipe fault-tolerant comms + heterogeneous scheduling | ⛔ | ⛔ | PoL verification + Submitter/Solver/Verifier incentive game |
Flock | ⛔ | ⛔ | ⛔ (standard FL aggregation, no optimization) | Local LoRA fine-tuning + zkFL aggregation | ⛔ (standard prediction, no aggregation innovation) | Staking + VRF voting + reward/penalty mechanism |
Bagel | ⛔ | ⛔ | ⛔ | zkLoRA: ZK-based fine-tuning verification module | For personalized model deployment | zk fine-tuning incentive system (early stage) |
Pond | ⛔ | ⛔ | ⛔ | Focus on LoRA fine-tuning + multi-task adaptation | Emphasis on multi-role personalized inference | ⛔ (undisclosed) |
RPS | ⛔ | ⛔ | ⛔ | Agent SDK + adaptation system | Inference aggregation engine (RaaS) | Task-based incentive structure (under development) |
We believe decentralized training is not merely an extension of blockchain principles into the AI era—it represents the foundational infrastructure for a future of globally collaborative intelligent productivity. One day, as we look back on this challenging yet hopeful path, we’ll be reminded of our shared conviction: Decentralization is not just a method—it's a value in itself.