"It's almost a red ocean." That is what a startup founder told me, bluntly, when I asked him about large models.
Last November, OpenAI released ChatGPT, built on GPT-3.5, and instantly ignited the large-model craze. In little more than half a year, a "hundred-model war" broke out in China: BAT, other leading Internet companies, and AI companies have nearly all announced large models of their own.
In early May, 360 head Zhou Hongyi said publicly, "To claim you can surpass it right away, without first going through two years of imitation and copying, is just bragging." Just a month later, Zhou Hongyi said, "I once said that domestic large models were two years behind foreign ones. I retract that statement; today they are already close to the international level."
Some people marvel: if it takes only half a year to catch up with ChatGPT, large models do not seem that hard.
So what are the core barriers to building large models? Where do China's large models stand? And what risks do large models pose to human society?
To find out, we spoke with Shen Wei (a pseudonym), a professor at a well-known "985" university who has worked in machine learning research for many years, to clear away the fog surrounding large models.
The GPT path has been proven, hence the "hundred-model war"
White Horse Business Review: In the plainest language possible, what is a large model? How does it differ from earlier AI models?
Shen Wei: A "large model" is a model with a large number of parameters. The academic community has no clear definition of how many parameters count as "large," and the field is still in a stage of rapid research and development.
In fact, deep learning has gone through roughly three stages. The first stage, from 2012 to 2017, was represented by small, domain-specific models such as YOLO for object detection and ResNet for image classification, whose parameters occupy at most a few hundred MB of memory.
In 2017, the Transformer made deep learning computation parallelizable and more efficient, which made large models feasible and subsequently gave rise to large natural language models such as OpenAI's GPT and Google's BERT. This stage produced task-specific large models with more than 100 million parameters.
Around 2020, deep learning entered the stage of general-purpose models. The input is a sentence with blanks, and the model's job is to "fill in the blanks." Previously the model was adapted to the downstream application; now the downstream application adapts to the model. Models of this stage include GPT-3.5 and GPT-4 in natural language, and CLIP, DALL-E, Stable Diffusion, and Midjourney in images. Parameter counts at this stage can reach tens or hundreds of billions.
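To make the three parameter scales above more concrete, here is a rough sketch of how much memory the weights alone would occupy at each scale. The bytes-per-parameter figures (4 for fp32, 2 for fp16) and the specific parameter counts are illustrative assumptions, not figures from the interview, and the helper name `weight_memory_gb` is hypothetical.

```python
# Rough memory footprint of model weights at the parameter scales described above.
# Assumptions (not from the interview): 4 bytes per parameter in fp32,
# 2 bytes in fp16; the parameter counts are illustrative round numbers.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gb(num_params: int, dtype: str = "fp32") -> float:
    """Memory needed to store the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


EXAMPLE_SCALES = {
    "stage 1, small model (25 million)": 25_000_000,
    "stage 2, task-specific (110 million)": 110_000_000,
    "stage 3, general-purpose (10 billion)": 10_000_000_000,
    "stage 3, general-purpose (175 billion)": 175_000_000_000,
}

for label, n in EXAMPLE_SCALES.items():
    fp32 = weight_memory_gb(n, "fp32")
    fp16 = weight_memory_gb(n, "fp16")
    print(f"{label}: {fp32:.2f} GB in fp32, {fp16:.2f} GB in fp16")
```

Under these assumptions, a model with tens of millions of parameters fits in roughly a hundred MB, while one with hundreds of billions needs hundreds of GB for its weights alone, before counting anything else required for training.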
White Horse Business Review: As far as you know, which companies or organizations were the first to research large models? What results did they achieve?
Shen Wei: The earliest research was done by universities and research institutions. As far as I know, the early efforts include WuDao from the Beijing Academy of Artificial Intelligence (Zhiyuan) and "Brain Sea" from Pengcheng Laboratory; industry research is now keeping pace. The academic work produced some results, but their performance was not as striking as ChatGPT's.
White Horse Business Review: In just a few months a "hundred-model war" has broken out in China, with too many companies launching large models to count. How do you see this phenomenon?
Shen Wei: Large models are definitely a trend, and people have been studying them all along. Previously, many companies may have invested on a small scale and done only shallow research. Now that a product as good as ChatGPT has suddenly appeared, everyone sees a clear commercial direction and has started investing more.
On the other hand, many companies face competitive pressure: they may fall behind if they do not work on large models, so they feel they must launch large-model projects.
White Horse Business Review: Zhou Hongyi recently retracted his statement that "domestic large models are two years behind foreign ones" and now believes they are close to the international level. Only a few months have passed, so large models do not seem that hard. How big do you think the gap is?
Shen Wei: The gap depends on who is being benchmarked against whom. I have not tried 360's "Intelligent Brain" product, so I am not in a position to evaluate it. But I have tried some domestic generative AI products, and I feel there is still a gap with ChatGPT; domestic large models still have work to do.
With such heavy capital investment, do only the leading companies have a chance?
White Horse Business Review: What are the core barriers to developing large models?
Shen Wei: The core barriers are data, computing power, and algorithms.
On computing power, training a generative AI like ChatGPT requires at least 10,000 Nvidia A100 graphics cards. A single card currently costs 60,000 to 70,000 RMB, and a V100, with better performance, costs 80,000 RMB. That puts the investment in computing power at a minimum of 600 to 700 million RMB, which only a few leading companies and institutions can afford. For a commercial organization, spending hundreds of millions on a pile of graphics cards that may not produce results is a problem that has to be thought through.
Next come data and algorithms. Algorithms are the easier part to understand: framework development, optimization algorithms, and so on. As for data, China has no shortage of it, and even has more Internet data than the United States, but choosing which data to train on and how to process it is where the core barriers lie.
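The computing-power estimate quoted above is simple multiplication, sketched here as a sanity check. It uses only the card count and per-card price range given in the interview; treating the prices as RMB, and ignoring servers, networking, power, and staff, are simplifying assumptions, and the function name `cluster_card_cost` is hypothetical.

```python
# Back-of-the-envelope check of the GPU investment quoted in the interview.
# Assumptions: prices are in RMB; only the cards themselves are counted
# (no servers, networking, electricity, or staff).

NUM_CARDS = 10_000        # "at least 10,000 Nvidia A100 graphics cards"
PRICE_LOW_RMB = 60_000    # lower bound of the quoted per-card price
PRICE_HIGH_RMB = 70_000   # upper bound of the quoted per-card price


def cluster_card_cost(num_cards: int, price_per_card: int) -> int:
    """Total cost of the cards alone, in the same currency as the price."""
    return num_cards * price_per_card


low = cluster_card_cost(NUM_CARDS, PRICE_LOW_RMB)
high = cluster_card_cost(NUM_CARDS, PRICE_HIGH_RMB)

print(f"Estimated card cost: {low / 1e6:.0f} to {high / 1e6:.0f} million RMB")
```

The result, 600 to 700 million RMB, matches the range given in the interview, and it is a floor rather than a full budget, since everything beyond the cards is left out.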
White Horse Business Review: Do you regularly communicate with companies? How does research at a non-profit research institution differ from research at a company?
Shen Wei: We do have some exchanges with corporate research departments. Our academic research sometimes focuses more on forward-looking technology and is less demanding about practical deployment, whereas companies generally place more emphasis on deployment.
White Horse Business Review: Have you studied China's large models? Which one are you most optimistic about?
Shen Wei: The leading companies are probably the ones that can break out. First, the capital investment is heavy, and only leading companies have the resources. Second, a handful of leading companies hold more data. Third, they have already built up a period of technical accumulation in artificial intelligence.
White Horse Business Review: Which applications of large models are you most optimistic about?
Shen Wei: From a technical point of view, the first applications should be in natural language processing and images; speech recognition may come later.
We are seeing more and more copywriting done with ChatGPT and more and more content-creation applications of this kind. I think other applications, such as intelligent customer service, should also arrive fairly quickly. Today's intelligent customer service often cannot understand what users need or solve their actual problems. If users can no longer tell whether they are talking to a human or a robot, the experience will be much better. The same goes for NPCs in games: their dialogue used to be hard-coded, but now they can gradually hold real interactions, and players will have a better experience.
White Horse Business Review: You previously worked as chief analyst at a leading brokerage. From an investment point of view, what opportunities do large models offer?
Shen Wei: The logic of capital-market speculation runs from applications to algorithms and models, and then to computing power. The logic of the industry runs the other way: computing power has the clearest growth expectations, which is why Nvidia's stock has recently risen so fast and so far. Investors now understand that who will break out in large models, and whether the models can be monetized, still needs to be verified, yet most of the additional capital is going into computing power. After repeated rounds of speculation, the broad rally should be nearing its end; what comes next must be borne out by sound logic and actual earnings.
I used to cover mainly the media and Internet sector. Take the gaming sector, which was relatively strong a while ago. The capital-market logic is, first, that applying large models improves R&D efficiency and cuts costs; and second, that large models bring a better experience, with more intelligent NPCs, which ultimately raises user stickiness and ARPU. Of course, in the end this too has to be verified by earnings.
Humans cannot control AI, or even their own destiny
White Horse Business Review: We have seen people including Altman and Musk raise concerns about AI safety. All we know now is that intelligence emerges from large-model training, but the training process is a black box, which is actually quite frightening. How do you see the safety issue?
Shen Wei: On safety, I have first of all noticed several unusual signals. The first is that in March this year more than 1,000 people, including Musk and Apple co-founder Steve Wozniak, signed an open letter calling for a moratorium on training AI systems more powerful than GPT-4.
The second is the resignation of Geoffrey Hinton, the 75-year-old "Godfather of AI" and a senior scientist at Google. He left Google in May this year as a direct result of his concerns about the dangers of AI, and has even expressed regret about his life's work.
The third is the new discussion of the ethics of training large models that has emerged in academia over the past two years.
For now, I think large models are still controllable, and there is no major problem. But the technology is developing too fast: in the few months since it broke into the mainstream, GPT has gone through several iterations and keeps getting more intelligent. Will it develop autonomous consciousness, stop obeying human "commands," and spin out of control? That is what everyone is worried about.
White Horse Business Review: Do you think AI will cause mass unemployment? Faced with AI, how can ordinary people keep their jobs?
Shen Wei: At the macro level, I do not think AI will cause mass unemployment. There will always be jobs for humans; it is the content of the work that will change. At the individual level, of course, there will certainly be structural unemployment, and all we can do is keep learning.
White Horse Business Review: Many people used to say that machines have no feelings and lack imagination, so they cannot replace humans. But if AI can simulate the human brain, couldn't human passions and sexual desire also be simulated in the future? Hormones and dopamine are, after all, just a biological reward mechanism.
Shen Wei: That machines have no feelings is only the current assumption. As artificial intelligence gets closer and closer to the human way of thinking, might it not develop something like human "feelings"? It is just that they would live in a different dimension from humans, like Tu Hengyu's daughter in "Wandering Earth." In its own world, artificial intelligence may develop a reward mechanism similar to the biological one in humans.
White Horse Business Review: If everything can be calculated, planned, and arranged, isn't that a bit boring?
Shen Wei: AI's behavior is not predicted and planned by humans; it is the result of its own self-reinforcement and self-training. In "Wandering Earth," MOSS makes its decisions on its own rather than obeying instructions given by humans.
White Horse Business Review: Is it inevitable that silicon-based civilization will replace carbon-based civilization?
Shen Wei: That question goes beyond what I can answer. On the current trend, it might: in "Wandering Earth," the real master of humanity's destiny is MOSS, not humans. In reality, though, it is also possible that technology will stall at some stage and fail to get past it. After all, technological development is not linear.