Authors: Altan Tutar and Divesh Punjabi

Introduction

Artificial Intelligence (AI) is rapidly evolving and growing in importance. With its landmark achievement of reaching over 100 million users in record time, ChatGPT has ignited a wave of curiosity about the potential of large language models (LLMs). This rising interest has prompted the integration of LLMs into numerous products via APIs that offer access to OpenAI's newest models. For instance, Notion, a widely used note-taking application, integrated LLMs into its product to help users improve their writing and complete other tasks. As a result, we anticipate that LLMs will become a standard component of every software stack, exerting a profound influence on human behavior and thought processes.

OpenAI's adaptations of GPT aren't the sole offerings in the marketplace. Major tech companies like Google and Meta have unveiled their own foundation models, namely PaLM 2, the LLM behind Bard, and most recently Llama 2. Although these products are broadly similar, with OpenAI's latest GPT-4 generally considered more effective at reasoning, each company has taken a different approach to releasing its products and codebases.

OpenAI has chosen not to open-source its GPT-4 model. In contrast, Meta has taken a different path by open-sourcing their large language model, offering it to the broader community. Hugging Face, another significant player in the field, has embraced the philosophy of "building in public", emphasizing transparency and community engagement in their development process.

Building such an influential technology in public, albeit not exclusively, is vital because it fosters a culture of transparency and collective wisdom. It accelerates innovation and promotes rigorous scrutiny while also addressing ethical considerations. To increase the scale and pace of collectively building in public, we need to improve our coordination efforts, and blockchain presents a potential solution.

Why should we care if AI is built open-source?

Most internet users prioritize performance, affordability, and utility, often overlooking a software's open-source credentials. Given the generally limited technical expertise of the average user, the intricacies of open-source vs. closed-source development are largely irrelevant to them. Linux (open-source) and Windows (closed-source) are both widely used operating systems developed under contrasting models, with the former prioritizing customizability and the latter user-friendliness.

Despite often being invisible to end-users, the difference between open-source and closed-source software is indeed relevant. Closed-source development provides a protective umbrella for an organization's intellectual property, granting full control over data and infrastructure. This control supports quality assurance and security, and it maintains the confidentiality of proprietary technology. In contrast, open-source development promotes extensive community engagement, transparency, and scrutiny. This openness can help identify and address potential issues or bugs more effectively, and it invites enhancement suggestions from a diverse, global developer community.

These distinct features of closed and open-source software have fueled an ongoing debate within software communities concerning the ethical standards for software development, particularly for applications with societal impact. Notably, most social media algorithms have remained closed-source, with recent exceptions such as Twitter's algorithm release. When it comes to AI, the stakes of this debate rise significantly. Much like social media algorithms, AI systems subtly shape our behavior and thoughts, echoing the profound influence of many groundbreaking technologies.

Therefore, there's a compelling argument for open-source development, training, and deployment of these systems. It's crucial for stakeholders—developers, users, and policymakers alike—to ensure these systems adhere to societal norms, ethical standards, and regulatory requirements.

However, one of the biggest challenges that open-source software (OSS) developers in AI face in competing at the scale and pace of big tech is accessing resources and coordinating efforts. This is where blockchain comes in: it provides the infrastructure to coordinate resources and efforts at scale and to decentralize different components of the LLM development stack. Additionally, the recent crypto cycle left behind a now-idle mining ecosystem that uses the same kind of GPUs required for training models; with the right incentives, these could be repurposed for AI.

Accelerating our collective understanding of AI with blockchain

In recent years, we have witnessed the meteoric rise of not only AI, but also blockchain. As these technologies continue maturing, an opportunity emerges for them to converge in mutually beneficial ways. Some companies are already building at this intersection to provide a unique value proposition, and we expect to see more novel approaches as more talent explores this convergence.

The increasing interest in AI models is already creating bottlenecks in computing power and costs. For example, training OpenAI’s GPT-3 cost an estimated $12 million, and a recent leak of GPT-4 details estimates its size at roughly 10x that of GPT-3 (paywall), so we can expect its training cost to be significantly higher. Although the cost of training foundation models is manageable for well-funded, revenue-generating companies such as OpenAI, it makes it hard for newer startups to leverage this technology and innovate, especially if they want to train models for a specific niche audience, which is what new entrants tend to go after.

As models grow larger, startups have to either invest in their own hardware at the expense of scalability or pay increasing prices to cloud providers, the latter being the de facto option. However, relying solely on centralized cloud providers like AWS is becoming prohibitively expensive for new entrants and researchers, and costs show no sign of easing unless we hit a significant milestone in hardware development.

The decline in the value of mined currencies, paired with shifts in consensus algorithms from proof-of-work to proof-of-stake, such as Ethereum's, has left much of the mining ecosystem idle. Blockchain projects aggregate these resources by offering incentives for CPU and GPU contributions (i.e., computational resources) without intermediation costs. These aggregated resources are commonly known as compute networks.

Compute networks can be broken down into two types. General-purpose networks function similarly to a decentralized cloud provider and can be used for a variety of applications. Specific-purpose networks are tailored to particular use cases such as machine learning (ML), where they seek to solve the challenges of parallelization and verification.

General-purpose compute networks can be used for some steps of the ML development cycle such as data pre-processing, fine-tuning, and inference workloads. Networks offering decentralized cloud services include Akash, Cudos, and iExec. More specialized, but similar approaches include combining Filecoin for decentralized cloud storage and Bacalhau for decentralized compute.

On the other hand, specific-purpose networks, or in this case ML-specific, are required for training as they are optimized to solve for parallelization and verification. Specific-purpose networks include Gensyn and Together. Both networks are following the same approach of aggregating idle resources and selling those to companies, with their key differentiator being that they can parallelize and verify computations that run on these distributed and decentralized resources.
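To make "parallelize and verify" more concrete, here is a minimal sketch of one simple verification strategy: send each chunk of work to several untrusted workers and accept only the result that a majority agrees on. This is an illustrative assumption on our part, not the protocol used by Gensyn or Together; the names verified_map, honest, and faulty are hypothetical, and a real network would select workers at random and use far cheaper verification than full replication.

```python
import hashlib
from collections import Counter


def run_task(worker, chunk):
    """Run one chunk on one worker; return the result and a digest of it."""
    result = worker(chunk)
    digest = hashlib.sha256(repr(result).encode()).hexdigest()
    return result, digest


def verified_map(chunks, workers, replicas=3):
    """Send each chunk to `replicas` workers and keep the majority result."""
    accepted = []
    for i, chunk in enumerate(chunks):
        # Round-robin assignment, purely for illustration.
        assigned = [workers[(i + k) % len(workers)] for k in range(replicas)]
        outputs = [run_task(w, chunk) for w in assigned]
        votes = Counter(digest for _, digest in outputs)
        winner, count = votes.most_common(1)[0]
        if count <= replicas // 2:
            raise ValueError(f"no majority agreement for chunk {i}")
        # Keep the first result whose digest matches the winning digest.
        accepted.append(next(r for r, d in outputs if d == winner))
    return accepted


def honest(chunk):
    """Hypothetical worker that computes the sum of squares correctly."""
    return sum(x * x for x in chunk)


def faulty(chunk):
    """Hypothetical worker that returns a wrong answer."""
    return sum(x * x for x in chunk) + 1


chunks = [[1, 2], [3, 4], [5, 6]]
workers = [honest, honest, faulty, honest]
results = verified_map(chunks, workers)
print(results)
```

Replication trades extra compute for trust: with three replicas per chunk, a single faulty worker is outvoted, but the network does three times the work, which is why ML-specific networks look for more efficient verification schemes.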

Beyond infrastructure, cryptoeconomic incentives also facilitate AI development. Networks like Bittensor, aside from offering compute resources, incentivize researchers to build on existing models rather than training from scratch. This “composable intelligence” increases productivity and access to knowledge. It is similar to the Mixture of Experts (MoE) approach that GPT-4 reportedly uses, in which a gating model routes each query to the right combination of “expert models” to improve accuracy and reduce the latency of generating an output.
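The routing idea behind Mixture of Experts can be sketched in a few lines. The example below is a toy illustration under our own assumptions, not GPT-4's or Bittensor's implementation: the experts are hypothetical plain functions, and the gate scores are supplied by hand, whereas in a real model a learned gating network would produce them from the input.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_scores, experts, x, top_k=2):
    """Blend the outputs of the top_k highest-scoring experts."""
    # Rank expert indices by gate score, highest first, and keep top_k.
    ranked = sorted(range(len(experts)),
                    key=lambda i: gate_scores[i], reverse=True)[:top_k]
    # Renormalize the scores of the selected experts only.
    weights = softmax([gate_scores[i] for i in ranked])
    output = sum(w * experts[i](x) for w, i in zip(weights, ranked))
    return output, ranked


# Hypothetical experts: each stands in for a specialized model.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x]
gate_scores = [0.1, 2.0, 2.0]  # assumed output of a gating network

output, chosen = route(gate_scores, experts, 3)
print(output, chosen)
```

Only the selected experts run for a given query, which is where the latency savings come from: the other experts are never evaluated.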

Token-based incentives are also being used for domain-specific reinforcement learning from human feedback (RLHF). For instance, Hivemapper rewards drivers and map editors for improving real-world mapping data. Another approach introduces game elements into data labeling and RLHF to enhance data quality. For example, NEAR Tasks operates as a decentralized data labeling platform where individuals can earn tokens by completing specific, game-like tasks. Such approaches seek to improve the decentralized coordination of resources and efforts at scale.

Lastly, blockchain’s public ledger capabilities can help mitigate a potential drawback of AI by providing immutable source provenance. As deepfakes grow more advanced, the ability to cryptographically sign digital content and record it on a tamper-proof ledger will help track origins and combat disinformation.
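A minimal sketch of how such provenance tracking could work: hash the content, sign the hash, and append the record to a ledger in which every entry commits to the previous one. Everything here is an assumption for illustration. ProvenanceLedger is a hypothetical in-memory class rather than a real blockchain, and an HMAC with a shared secret stands in for the public-key signature a real system would use.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ProvenanceLedger:
    """Hypothetical append-only ledger; each entry links to the previous one."""

    def __init__(self):
        self.entries = []

    def record(self, content: bytes, creator: str, key: bytes) -> dict:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        content_hash = sha256_hex(content)
        # HMAC is a stand-in for a real digital signature (assumption).
        signature = hmac.new(key, content_hash.encode(),
                             hashlib.sha256).hexdigest()
        body = {"creator": creator, "content_hash": content_hash,
                "signature": signature, "prev_hash": prev_hash}
        entry = dict(body, entry_hash=sha256_hex(
            json.dumps(body, sort_keys=True).encode()))
        self.entries.append(entry)
        return entry

    def chain_is_intact(self) -> bool:
        """Recompute every hash; any edit to a past entry breaks the chain."""
        prev = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            recomputed = sha256_hex(json.dumps(body, sort_keys=True).encode())
            if entry["prev_hash"] != prev or recomputed != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True

    def find_origin(self, content: bytes):
        """Return the first entry recorded for this exact content, if any."""
        target = sha256_hex(content)
        return next((e for e in self.entries
                     if e["content_hash"] == target), None)


ledger = ProvenanceLedger()
ledger.record(b"original photo bytes", "alice", b"alice-secret")
ledger.record(b"press release text", "bob", b"bob-secret")

origin = ledger.find_origin(b"original photo bytes")
unknown = ledger.find_origin(b"edited photo bytes")
intact = ledger.chain_is_intact()
print(origin["creator"], unknown, intact)
```

Note that exact hashing only matches byte-identical content: a single altered pixel yields a different hash, so the ledger can confirm that a file is the registered original but cannot by itself identify what a manipulated copy was derived from.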

Conclusion

The rise of AI, especially LLMs, exposes new challenges in the realm of software development, most notably resource constraints and ethical considerations. Open-source development, which emphasizes transparency and community engagement, provides an alternative approach to defining what is ethical, yet faces significant resource and coordination hurdles.

The solution could lie at the confluence of AI and blockchain technology. Blockchain can create decentralized networks that pool computational resources, offering a cost-effective solution for training increasingly complex AI models.

Joining Steel DAO

We’re recruiting and onboarding new members to contribute across the different initiatives. Read more about our evolving mission through our Charter.

Apply to join the Research Division here.
Apply to join Links, our community-as-a-scout program here.