# What products will be built with ZKML?

By [Stateless](https://paragraph.com/@stateless) · 2024-01-23

---

After reading this article, you will know what zkml is and how it can address the problem of distrust in the post-AI world.

Imagine you run a comics e-commerce store. Someone offers you a neural model that automates your customer support better than any other solution you have seen. You liked the demo and are ready to buy a year's subscription, but you hesitate: the seller could quietly swap the model for a lighter, dumber one. That would cut his serving costs, while you would have a hard time proving that the model got worse.

Did you know that some fancy math can assure you that the results you get from a neural model are authentic? Research in this area is still immature, but it is positioned for exponential growth and will enable widely needed services. This post will help you avoid FOMO when that happens.

**Zero-knowledge proofs** can validate that a computation was done correctly without revealing crucial information - most often the user's input to the computation. Importantly, verification takes far less time than the computation itself. That is why zk proofs are handy in blockchains: they let a smart contract verify heavy computations without executing them. For example, zk cryptography can let a user prove to a medical service that the records they own contain a particular diagnosis, and thereby obtain the matching electronic prescription, without revealing any other personal medical information. Zero-knowledge cryptography is seeing excellent progress in both research and applications these days, and zk applications can seem like magic on first encounter.
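The "prove without revealing" idea can be seen in miniature in the classic Schnorr identification protocol: the prover convinces the verifier that she knows a secret exponent without ever sending it. The sketch below uses toy parameters and is interactive zero knowledge, not the SNARK machinery that production zk systems use:

```python
import secrets

# Toy subgroup: p = 23 is prime, q = 11 divides p - 1, and g = 2 generates
# the subgroup of order q. Real deployments use 256-bit-plus groups.
p, q, g = 23, 11, 2

def keygen():
    x = secrets.randbelow(q)          # prover's secret
    return x, pow(g, x, p)            # (secret, public key y = g^x mod p)

def prover_commit():
    r = secrets.randbelow(q)
    return r, pow(g, r, p)            # keep r private, send t = g^r

def prover_respond(x, r, c):
    return (r + c * x) % q            # s = r + c*x (mod q)

def verify(y, t, c, s):
    # Accept iff g^s == t * y^c, which holds exactly when s was built from x.
    return pow(g, s, p) == (t * pow(y, c, p)) % p

x, y = keygen()
r, t = prover_commit()                # prover -> verifier: t
c = secrets.randbelow(q)              # verifier -> prover: random challenge
s = prover_respond(x, r, c)           # prover -> verifier: s
assert verify(y, t, c, s)             # accepted, yet x never left the prover
```

The verifier learns nothing about `x` beyond the fact that the prover knows it; the transcript `(t, c, s)` can be simulated without the secret, which is the formal meaning of "zero knowledge".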

**Zero-knowledge machine learning** (zkml) refers to zero-knowledge tech applied to computations done with neural networks.

![Machine learning that is verifiable on decentralized ledgers](https://storage.googleapis.com/papyrus_images/850a9031364e3e715f07b9b21201cc40f113624b8c94193daed10853bda33a82.jpg)

Machine learning that is verifiable on decentralized ledgers

“I don’t trust the model’s answers”
-----------------------------------

Zkml helps verify that the model’s output was not manipulated and that answers stay consistent with what you expect from the model.

_zkml makes it possible to determine that a particular piece of content was produced by applying a specific model to a particular input_

The tricky part is proving neural network **inference:** the model’s output for some input prompt, image, biometric data, etc. State-of-the-art libraries can only prove computations of relatively simple networks - say, 20 or 30 million parameters. For reference, GPT-4 is claimed to have more than a trillion (1,000,000 million) parameters, and simpler open-source language models like LLaMA or Mixtral have around 50 billion. Creating a proof that we ran a LLaMA inference would take about a year. Yet the progress in zero-knowledge crypto creates hope that zkml will improve enough to work with practically useful neural models.

Applications that verify a model’s outputs fall into two broad categories: validation of the model's authenticity and blockchain AI applications.

**Model authenticity and integrity** means verifying that a neural model accessed through an API is indeed the intended one, preventing fraudulent substitution of models. Say Bob developed a model that is exceptionally good at solving Alice's problem and offers it to her as a service. How can Alice trust that Bob's responses really come from the network she subscribed to, the one all the positive reviews were written about? Bob wants to keep the model private, and Alice wants the service to stay the same, [which is the problem many Redditors face with GPT-4](https://www.reddit.com/r/ChatGPT/comments/16t0ac8/does_gpt4_seem_really_dumb_right_now/). If Bob sends Alice zkml proofs along with the model’s outputs, Alice can be sure that each output was indeed produced by the model she subscribed to. This technology can enable trustless chatbot marketplaces where _anyone_ can sell verifiable access to _any_ private model without publishing its weights; users will know the neural network is still the same one that earned its positive reviews.

A similar use case is a competitive market for access to open-sourced neural models. Providers with the cheapest servers will win the right to run the user's prompt, and with zkml proofs they can guarantee the correctness of their model’s inference. This would make the market for any AI task efficient and give the end user the cheapest service.
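The plumbing around such a marketplace is easy to sketch: Bob commits to his weights once by publishing a hash, and every response ships with a proof tied to that commitment. In the hypothetical sketch below, `make_proof` and `check_proof` are placeholders for the real zkml prover and verifier - the genuinely hard cryptographic parts - and the weight values are toy stand-ins:

```python
import hashlib, json

def commit(weights):
    """Published once, e.g. on-chain: binds Bob to the exact model."""
    return hashlib.sha256(json.dumps(weights, sort_keys=True).encode()).hexdigest()

# Stand-ins for the real zkml prover/verifier: in a real system, a SNARK
# over the model's computation graph would replace these two functions.
def make_proof(weights, prompt, output):
    return {"model": commit(weights), "prompt": prompt, "output": output}

def check_proof(published_commitment, prompt, output, proof):
    return proof == {"model": published_commitment, "prompt": prompt, "output": output}

weights = {"layer1": [0.12, -0.5], "layer2": [0.33]}   # toy parameters
published = commit(weights)    # Bob publishes this while the good reviews come in

output = "Here is your answer..."                      # produced by running the model
proof = make_proof(weights, "user prompt", output)
assert check_proof(published, "user prompt", output, proof)   # Alice's check
```

The commitment alone only pins down *which* model Bob promised; the zkml proof is what ties each individual output to that committed model without revealing the weights.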

How can we estimate the market for these applications? The generative AI market [is projected](https://www.bloomberg.com/company/press/generative-ai-to-become-a-1-3-trillion-market-by-2032-research-finds/) to reach $200B in 2025 and $1.3T by 2032. Suppose at least 10% of that market will be based on models that need to be both verifiable and private; such models will likely be distributed via decentralized or privately owned AI-bot stores. That gives a $130B estimate of the future market enabled by zkml.
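The back-of-the-envelope version of that estimate (the 10% share is this article's assumption, not data):

```python
market_2032_usd = 1_300_000_000_000   # Bloomberg Intelligence projection
zkml_share = 10                       # percent assumed verifiable-and-private
zkml_market = market_2032_usd * zkml_share // 100
print(f"${zkml_market // 10**9}B")    # -> $130B
```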

**Accessing neural networks from the blockchain**. Blockchains' computational and storage capacities are far too small for neural models: good models are large, and you can’t upload one to the blockchain to build decentralised AI apps or services. But smart contracts can verify zk proofs cheaply, so with zero-knowledge machine learning, application developers will be able to integrate AI features into dapps. To name a few:

*   AI-based trading strategies
    
*   decentralised biometrics verification service, for example, to prove to the smart contract that you are a unique user
    
*   games with an on-chain economy where NPCs are intelligent actors
    
*   competitions with AI where no centralised party needs to be trusted with the money - even with untrusted game developers, users can be sure that the only way to lose money is to lose the game
    
*   [autonomous worlds](https://lattice.xyz/)
    
*   and most importantly, a [singing game](https://cryptoidol.tech/) with AI judge and on-chain scores and rewards
    

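All of these rest on one asymmetry: producing a proof is expensive, checking one is cheap. A proof-of-work toy shows the shape of that asymmetry (an analogy only - zk proofs achieve it with very different mathematics, and verification stays cheap no matter how heavy the proven computation is):

```python
import hashlib

DIFFICULTY = 16                      # leading zero bits required (toy setting)
TARGET = 1 << (256 - DIFFICULTY)

def _h(data: bytes, nonce: int) -> int:
    return int.from_bytes(hashlib.sha256(data + nonce.to_bytes(8, "big")).digest(), "big")

def prove(data: bytes) -> int:
    """Expensive: ~2**DIFFICULTY hashes on average to find a valid nonce."""
    nonce = 0
    while _h(data, nonce) >= TARGET:
        nonce += 1
    return nonce

def verify(data: bytes, nonce: int) -> bool:
    """Cheap: a single hash, no matter how long the search took."""
    return _h(data, nonce) < TARGET

nonce = prove(b"model output")         # the heavy work happens off-chain
assert verify(b"model output", nonce)  # a contract-style check: one hash
```

A smart contract plays the role of `verify` here: it never re-runs the heavy work, it only checks a short certificate that the work was done.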
“I don’t trust the model’s training”
------------------------------------

So far, we have discussed proving a machine learning model’s **execution**, or inference; the second zkml application is proving the model’s training process. Proving training means creating a cryptographic guarantee that a particular model was trained on a dataset with specific properties. In plain terms, you verify that the knowledge inside the model is good or balanced without repeating the learning process.

When do people need to prove that the model was trained in a “good” way? When it’s crucial to be 100% sure that:

**The dataset used to train the model was not biased**. Imagine you are looking for a political-expert AI chatbot with balanced reasoning, but the most popular chatbot was trained on 1,000 documents in support of increasing the defense budget and only 100 in support of decreasing it. Even if the model’s weights were open-sourced, you would not notice this bias when purchasing access! But there are still honest people in the world. Along with demo access to this political expert, the developer published hashed weights to commit to the final version of the model without publishing it. He does not need to show the dataset, either: the developer added a proof of the following statement: “this model was trained on a dataset that has equal numbers of documents in favor of and against the defense budget increase, as assessed by an independent labeling model.” Everyone who wants to use the political-expert bot can now expect balanced political judgments, with no dataset or weights ever published.
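The statement in that proof can be written down as an ordinary predicate over the private dataset. In the sketch below, `label_stance` is a hypothetical stand-in for the independent labeling model, and the zkml machinery's job (omitted here) would be to prove that `is_balanced` holds for the dataset behind the published commitment without revealing the dataset itself:

```python
import hashlib

def commit(dataset):
    """Published by the developer: binds him to the exact training set."""
    h = hashlib.sha256()
    for doc in dataset:
        h.update(hashlib.sha256(doc.encode()).digest())
    return h.hexdigest()

def label_stance(doc):
    # Stand-in for the independent labeling model from the statement.
    return "pro" if "increase the budget" in doc else "anti"

def is_balanced(dataset):
    """The statement to be proven in zero knowledge over the committed data."""
    labels = [label_stance(d) for d in dataset]
    return labels.count("pro") == labels.count("anti")

dataset = [
    "we must increase the budget for defense",
    "we should cut spending instead",
]
commitment = commit(dataset)   # public; the dataset itself stays private
assert is_balanced(dataset)    # what the zk proof would certify about `commitment`
```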

The more I think about this use case, the more value I see in creating language models with balanced judgment. I don’t want my AI assistant, whom I talk to many times daily, to be a hidden propaganda machine. What you hear, you become - that applies not only to how a neural model learns but also to how our brains operate. No one is entirely immune to propaganda, and it is not one’s fault: absorbing any attended information is a feature of the human mind.

**The dataset did not contain backdoor material.** You probably want something safer than a home robot trained on [Hitman 3 gameplay](https://www.youtube.com/watch?v=NrJes9fjBE4) data. How can you be sure the firmware's training dataset did not contain instructions for killing the homeowner? Validate a zero-knowledge proof of training on a clean dataset. No magic, just sophisticated math.

**Where are we now?**
---------------------

It is currently possible to create proofs for models of around 18M parameters in about 50 seconds on a powerful cloud server. How much is 18M parameters for a neural model? It depends on the task. For image classification - telling known classes of images apart, like ‘cats’ and ‘trees’ - 18M parameters is sufficient; it is on the order of an AI that plays chess or Go. It is far less than a large language model needs: a GPT with tens of millions of parameters is [about as intelligent](https://arxiv.org/abs/2305.07759) as a typical 4-year-old, and practical language models are thousands of times larger. The team at [Modulus Labs](https://www.modulus.xyz/), who [raised](https://crypto-fundraising.info/projects/modulus-labs/) $6.3M in Nov 2023, and zkonduit with their [ezkl library](https://github.com/zkonduit/ezkl) are working to push those boundaries higher.

If you want to stay updated on zkml developments, subscribe to the Telegram channel: [t.me/zk\_ml](https://t.me/zk_ml)

Follow Stateless on Mirror for more authentic content about decentralized tech and startups.


---

*Originally published on [Stateless](https://paragraph.com/@stateless/what-products-will-be-built-with-zkml)*
