As a Ph.D. candidate in artificial intelligence, I am relatively new to blockchain technology. Recently, I noticed BitTensor ($TAO), a prominent player in the blockchain+AI field with, surprisingly, a total market value of around 10 billion dollars, far exceeding that of many AI unicorns. This raised my curiosity about BitTensor. What does it do? How does it foster AI applications on chain?
To satisfy my curiosity, I started studying this project, mostly from an AI perspective.
This article is just a simple study by the author; its correctness is not guaranteed.
It is not investment advice.
Here are some of the materials I have read, which are crucial for understanding this article and the project as a whole:
The official repository of the nodes (miners and validators)
One subnet for language tasks (implemented miner and validator)
I first read the whitepaper to get a clear idea of the project. However, the whitepaper is not well written and can hardly be called an academic paper. It contains many contradictory notations and undefined symbols, making it difficult for the reader to grasp the core idea.
After carefully reading the whitepaper along with the source code, I finally understood (perhaps only partly) the incentive mechanism.
First, we focus on a ‘subnet‘ of BitTensor. The goal of the subnet is to learn a function f, given some training data x_1, ..., x_n and labels y_1, ..., y_n. Clearly, this is the basic definition of a machine learning task.
In the subnet, the miners learn f while the validators compute P(f), the performance term. We say f_1 has better performance than f_2 if f_1 fits the data better, i.e., f_1(x_i) is closer to y_i for i = 1, ..., n. A typical example: f is a neural network (e.g., ResNet) for image classification, the x_i are the training images, the y_i are the labels, and P is the classification accuracy.
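As a toy illustration of the performance term (my own example, not BitTensor code), accuracy for a classification task can be computed as:

```python
def accuracy(f, xs, ys):
    """Performance term P: the fraction of inputs where the learned
    function f predicts the correct label."""
    correct = sum(1 for x, y in zip(xs, ys) if f(x) == y)
    return correct / len(xs)

# A trivial "model" that predicts the parity of an integer.
f = lambda x: x % 2
print(accuracy(f, [1, 2, 3, 4], [1, 0, 1, 1]))  # 0.75
```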
The incentive is rather simple:
I_i = P_i * S_i
Here, I_i is the reward of the i-th miner, P_i is its performance (e.g., classification accuracy), and S_i is its staked token amount.
Therefore, a miner can earn more rewards in the following two ways:
Achieving better performance, e.g., classifying images more accurately
Staking more tokens.
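Putting the two together, a minimal sketch of the per-miner reward under this reading of the mechanism (my own toy code, not from the codebase):

```python
def reward(performance, stake):
    """Toy version of the incentive: reward_i = performance_i * stake_i."""
    return performance * stake

# Equal performance, different stake: the bigger staker earns more.
print(reward(0.5, 100))  # 50.0
print(reward(0.5, 200))  # 100.0
# Equal stake, different performance: the better miner earns more.
print(reward(0.75, 100))  # 75.0
```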
Take the ‘prompting’ subnet for example. The prompting subnet’s purpose is to train a question-answering model.
In prompting/rewards/, there are several functions that evaluate the predicted answer against the reference answer, e.g., by comparing the cosine similarity of their embeddings.
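A minimal sketch of the cosine-similarity idea (the real reward functions operate on embeddings produced by a language model; this toy version just takes raw vectors):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors:
    dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical embeddings score 1.0; orthogonal ones score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```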
The performance score is then assigned to the miner (with some small modifications, such as a moving average) and committed to the blockchain.
See prompting/forward.py and prompting/base/validator.py.
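The moving-average modification can be sketched as an exponential moving average (my illustration; the actual update rule and smoothing factor in the repository may differ):

```python
def update_score(old_score, new_reward, alpha=0.1):
    """Exponential moving average: blend the latest reward into the
    miner's running score before it is committed on chain."""
    return (1 - alpha) * old_score + alpha * new_reward

# A miner that consistently earns reward 1.0 only approaches a score
# of 1.0 gradually, which smooths out one-off lucky results.
score = 0.0
for r in [1.0, 1.0, 1.0]:
    score = update_score(score, r)
print(round(score, 3))  # 0.271
```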
Once the blockchain receives those scores (confusingly, the variable is named ‘weight‘ instead of ‘reward’), it computes the total incentive by simply multiplying them with the staked value of the nodes. One line of code does it:
let preranks: Vec<I32F32> = matmul(&weights, &active_stake);
(The matmul is needed because there are multiple validators, each assigning its own weights.)
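In plain Python, my understanding of that one-liner is the following: each miner's prerank is the stake-weighted sum of the weights all validators assigned to it (a sketch, not the subtensor implementation):

```python
def preranks(weights, stakes):
    """weights[v][m]: score that validator v assigned to miner m.
    stakes[v]: stake of validator v.
    Returns one prerank per miner: sum_v stakes[v] * weights[v][m]."""
    n_miners = len(weights[0])
    return [sum(stakes[v] * weights[v][m] for v in range(len(weights)))
            for m in range(n_miners)]

# Two validators, three miners. The validator with more stake (100)
# has more influence on the final preranks.
w = [[0.5, 0.3, 0.2],
     [0.6, 0.2, 0.2]]
s = [100, 50]
print(preranks(w, s))  # [80.0, 40.0, 30.0]
```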
After all this, we can see that while the whitepaper is full of complicated formulas (some of which may be irrelevant), the incentive mechanism is pretty simple,
i.e., reward = performance * staked value.
Data privacy issues. The data used to train practical AI models is usually private. Large companies have no intention (and, due to privacy laws, probably no right) to share it. However, both miners and validators need data to train and validate the AI model. Of course, there are myriad open-source datasets, but those are mostly suited for research rather than practical applications that could generate economic returns.
Winner-takes-all. Nowadays, AI models are much larger and extremely hard to train, which widens the gap between AI companies. For example, OpenAI’s GPT models beat any other language model by an overwhelming margin. However, BitTensor rewards not only the top-1 miner but all other miners as well, which is not attractive to the leading players: they can simply sell their services by themselves.
Huge resource consumption. In BitTensor, multiple miners learn the same task, resulting in a tremendous amount of repeated work. Especially in machine learning tasks, it is likely that everyone ends up using the same or similar models.
