idll holder, future believer of web3
China's large-model entrepreneurs have gathered at a crossroads. They range from scientists who have worked on natural language understanding for nearly 40 years, to serial entrepreneurs who have already made a name for themselves, to young people fresh out of their PhDs. They are competing at every level. The crossroads is even a physical one: the intersection outside the east gate of Tsinghua University. The companies sit close together, the nearest of them only a few floors apart.
The Sohu Network Building stands on one side of the intersection. It is probably the office building with the highest density of large-model talent in China. Wang Huiwen's Light Years Away is on the third floor. Zhipu AI, which was incubated in Tsinghua's computer science department, rents the seventh through eleventh floors; the floors from the ninth up are still empty and preserve traces of Sogou's time here, with "Sogou Milestones" posted in the hallway. Sogou founder Wang Xiaochuan held a media briefing in a second-floor conference room to announce his large-model venture and the founding of a new company, Baichuan Intelligence, but he planned to set up in a nearby office park: "I won't compete with them here." These companies put up with office rents higher than the average in Beijing's CBD just to be physically "close to the best AI talent in China".
On the other side of the road are two representative "Tsinghua-affiliated" teams, Lingxin Intelligence and Shenyan Technology. The former was founded by Huang Minlie, an associate professor of computer science at Tsinghua University, and has been developing its own "super-anthropomorphic large model" since the end of 2021. The latter's founding team comes almost entirely from Tsinghua's NLP lab, and the lab's academic leader, Professor Sun Maosong, serves as the company's chief scientist. When founder and CEO Qi Fanchao (岂凡超) wants to talk with the professor, he only has to walk a few hundred meters back to campus.
They started at different times. Zhipu AI, founded in 2019, was the earliest. In its early days it built applications on BERT, the large model Google released in 2018. Light Years Away officially launched in early April 2023. Wang Huiwen saw the large-model opportunity at the beginning of the year and decided to start a company again, a decision he says took only "a few days".
They were all blown away by the "talent" ChatGPT showed. One large-model entrepreneur asked ChatGPT to use dynamic programming to list the shortest path from Beijing to Shanghai, with the mileage of each path divisible by 3. A veteran tech investor asked ChatGPT to translate Japanese record liner notes; ChatGPT accurately translated "N-kyo" as "NHK Symphony Orchestra", a piece of jargon that only a fairly seasoned classical music enthusiast would know. A scientist at an AI startup asked ChatGPT to write stories about humans and AI and kept asking for new characters, such as a husky; the ever-growing story still held together naturally.
In mid-March, GPT-4, released only a few days earlier, had an accuracy rate of over 70%, while the domestic large models released by then averaged 20%. By May, the average accuracy of domestic large models had caught up to over 50%.
The entrepreneurs who were blown away by the large model's capabilities compared it to "the next-generation computer," "the invention of fire," and "a god created by humans," and reached for all kinds of metaphors to convey the magnitude of the change they expected: "Cambrian," "Industrial Revolution," "Renaissance," "Age of Discovery," "Apple-Microsoft moment," "Blackberry era," and so on.
The qualitative change began with GPT-3, released in 2020. OpenAI's vision was confirmed: when the data is large enough, the model can learn from the examples of various tasks contained in it, such as translation, arithmetic, and programming, and thus become more general. Zhang Jiaxing, a chair scientist in natural language research, borrowed the famous line from The Three-Body Problem, "Physics doesn't exist anymore", to lament at an event that "traditional NLP (natural language processing) technology doesn't exist anymore".
"Large models focus on data, models, and algorithms that can be implemented at scale. Traditional NLP research focuses on models that require a lot of fine-tuning, but much of that no longer holds for large models trained on big data," explains Huang Minlie, founder of Lingxin Intelligence and associate professor of computer science at Tsinghua University.
Entrepreneurs who were already in the AI field are active too. Companies such as SenseTime, Fourth Paradigm, and iFlytek have launched large models one after another. Li Zhifei, CEO of AI startup Mobvoi, lamented that "there is more supply of large models than one would expect." He initially thought the financial and technical threshold for large models was high and that at most two or three domestic companies could clear it. A month and a half later, he felt that competition in the large-model market might be fiercer than in the last AI boom.
How these entrepreneurs interpret OpenAI's success partly determines how they will approach the competition. Li Zhifei sees OpenAI's success as "switching from a research paradigm to a product-driven one". Zhou Ming, founder of Langboat Technology and former vice president of Microsoft Research Asia, believes the company has pushed data cleaning, training speed, and other aspects to the extreme and integrated every capability, including excellent algorithms, engineering, and even PR. Wang Huiwen believes OpenAI's success is "the success of the right mission, vision, values, and the right organizational approach".
The entrepreneurs' views on AGI (artificial general intelligence), the ultimate goal of the large model, vary greatly, from how they define it to how they understand it.
Wang Xiaochuan was convinced that "AGI is already here" after only a few rounds of simple chat with ChatGPT. He believes ChatGPT confirms a judgment he made six or seven years ago: once machines have mastered language, strong AI has arrived. At a small sharing session, several AI entrepreneurs described ChatGPT's progress only in terms of functionality.
"People think of this thing as small," Wang Xiaochuan said. After the meeting he received a call from one of the attendees, who asked, "Xiaochuan, are you hyping things up again?" A few days later, that person called again: "This time you're right again."
According to Wang Huiwen, "The perception of AGI can flip many times as the facts are grasped and the results unfold."
The common thread is that they are all convinced that the technological change brought by large models is bigger than any they have experienced, and that they stand at the beginning of a wave of change that could last decades.
"This wave of AI should be a big wave that lasts for decades and is made up of multiple smaller waves. It won't be completed in one wave, and different innovations will emerge in different waves," Wang Huiwen said.
He agrees with the American investor Elad Gil that in some tech waves startups capture all the value, while in others most of the value goes to established companies or is divided between startups and established companies. According to Wang Huiwen, the AGI wave belongs to the latter kind: large-model technology differs enough from past technology to make the market unpredictable, so startups have room to grow.
Until ChatGPT educated the domestic market
In October 2022, several U.S. investors mentioned to Li Zhifei that an AIGC application called Jasper was very profitable. Built on the GPT-3 model and fine-tuned for marketing scenarios, Jasper opened up the market by generating marketing copy, and its 2022 ARR (annual recurring revenue, a revenue measure for SaaS or subscription businesses) was about $80 million.
"The moment I saw it, I really felt like a fool," Li Zhifei said.
An investor at Sequoia Capital in the U.S. told Li Zhifei, "Your time has come." The investor also mentioned that Sequoia's U.S. managing partners were discussing only AIGC projects and looking at nothing else. At that time, the investment community was focused more on applications than on the underlying large models.
Jasper had solved a problem Li Zhifei began thinking about two years earlier: what scenario is GPT-3 suited for? Li Zhifei had thought of copywriting, but he got only half of the answer "right". "In the past, we did error correction, polishing, and rewriting, and did not expect to fully generate a piece of content." In 2020 he built an assisted-writing app on UCLAI, a large model developed in-house, but it was never marketed or promoted because he did not see good business prospects for it.
AI startup Fourth Paradigm made similar attempts. In 2018, Google launched the BERT large model, which significantly improved performance across the board, and Tu Weiwei and his peers thought that "that was the inflection point of NLP". He was getting more and more requests for assisted writing. Some customers said frankly that they wanted AI to help generate formulaic, "eight-legged essay" style reporting materials: "AI can play chess, but it can't write this?"
Tu Weiwei's team tried to build an assisted-writing application on BERT and GPT-series models, but it could write only two or three sentences and its accuracy was low, so they did not release it to the public.
Startups have limited computing resources and inevitably tilt toward the main business with the higher return on investment. These early experiments with large models also struggled to win outside support at the time. In June 2020, GPT-3 was launched, and Li Zhifei, formerly a scientist at Google, saw that larger models were more general. He formed a research group with engineers and read papers "like an addiction".
A few months later, at a hiking event for tech entrepreneurs, Li Zhifei spent an hour explaining to his peers what large models were. He spoke excitedly, while the others "just listened as if to a story" and kept asking, "So what? How do you commercialize it?" One entrepreneur said tactfully, "Zhifei, you're just fit to be a scientist, not to start a business." Li Zhifei realized, "There's no way anyone will invest in you to do this." The Chinese large model they developed eventually stopped at 6 billion parameters; there was not enough capital to carry it to the moment of "emergence". Today practitioners generally consider 40-50 billion parameters to be the threshold at which model capabilities "emerge".
The venture capital community had not yet recognized the business potential behind GPT-3. In 2021, Dai Yusen, managing partner of ZhenFund, came into contact with two large-model startup teams that wanted to build Grammarly-like products for AI-assisted writing or novel continuation. At the time, Dai Yusen was not optimistic and thought the application scenario was rather limited.
Enterprise customers were more pragmatic. Zhou Ming started his business at the end of 2020 and visited hundreds of customers, but the feedback he often got was, "Even if you build a large model, we can't afford to use it." Most of Zhou Ming's customers are central state-owned enterprises. To keep their data private, they have to deploy the large model on their own premises and invest at least tens of millions of yuan in training costs. Even with no training and only on-premises inference, the cost runs to one or two million yuan. Customers thought it was uneconomical.
It was not until January 2023 that ChatGPT educated the domestic market. By then, Li Zhifei had restarted his in-house work on large models more than three months earlier, and he found that people who "seemed to have nothing to do with large models" were also coming to ask how much it would cost and what kind of people to recruit. Tu Weiwei heard from customers in all walks of life, including the "agriculture, forestry, animal husbandry and fishery" industries, asking to cooperate on large models.
On February 10, a "declaration of artificial intelligence" circulated: "50 million dollars, bringing my own capital to join a team, do not care about position, salary or title, seeking teammates." Three days later, the manifesto became a more widely circulated AI "hero list". Wang Huiwen announced his ambition: to build China's OpenAI.
His entry into the field intensified this round of the AI arms race. One employee of a large-model startup said that "Wang's dedication" made him realize the sector was much hotter than he had thought. Computing resources are clearly strained, and one startup complained, "I begged everyone I could to get some machines."