In early July, Inflection AI became the most highly valued AI company after OpenAI. The previously little-known company raised $1.3 billion in a new funding round, pushing its valuation past $4 billion. Its rise punctured the prevailing narrative that, after OpenAI, large models would be a game only big companies could afford to play. The lead-investor list was equally star-studded, gathering Silicon Valley giants and luminaries: Microsoft, Bill Gates, former Google CEO Eric Schmidt, and LinkedIn co-founder Reid Hoffman, along with NVIDIA, which had only just begun investing in downstream AI enterprises.
The company has only one product, Pi, launched just two months earlier. If ChatGPT is an amplifier of human efficiency, Pi is a masseur of human emotion. Unlike ChatGPT's tool-oriented positioning, Pi's defining traits are empathy, simplicity, humor, and creativity.
Pi's popularity illuminates another path, one long overshadowed by the glare of intelligence-focused AI like ChatGPT: emotional AI. It is a track that brings understanding, attention, and care to users, and its potential market may be even larger than that of cold, efficiency-boosting intelligent AI.
Prof. Huang Minlie of Tsinghua University is a researcher who chose to pursue emotional AI in China. In his view, GPT is undoubtedly a paradigm breakthrough, but it cannot meet the needs of every domain, especially emotional needs. This line of exploration traces back to ELIZA, the psychotherapy chatbot built at MIT in 1966, a far older starting point than general-purpose task assistants like GPT.
Prof. Huang believes that beyond efficiency gains, humanity's important emotional needs remain far from being met by AI, and this is a vast need that deserves exploration. Although today's AI has only rudimentary forms of personality, with specialized training data it can already take on some of the work of a junior counselor. Prof. Huang also fully agrees with Yuval Noah Harari: once AI understands emotions, it becomes far more capable of manipulating human behavior, even emotionally abusing its conversation partners, which will multiply the problems of AI misuse. The limits and governance of AI are therefore urgent matters alongside technical exploration. But the governance path is clear: weaving a safety net would take only about two years, fast enough to outrun the predictions of AI-driven extinction. Concern is legitimate; panic is not.
The following is the full text of the interview:
Industry needs can't be solved by language models alone
Tencent Technology: What was your first encounter with ChatGPT like? Is it a paradigm breakthrough?
Huang Minlie: When ChatGPT first came out, its most striking features were its high level of intelligence and its positioning as a general-purpose task assistant. In the past, when we built task assistants for very traditional tasks, such as ordering food or booking tickets, each was its own system. ChatGPT handles all kinds of open-ended tasks in a single model, at a level of capability that overturned our previous understanding: it accomplishes many different tasks within one system, all to a high standard. This can be understood as a paradigm breakthrough. It is very different from the technology path we took before.
Tencent Technology: Many researchers, including Yann LeCun, argue that ChatGPT technically rests on the Transformer model from 2017 and therefore involves little innovation. How, then, did OpenAI make its model better than everyone else's?
Huang Minlie: ChatGPT is built on the Transformer architecture, so there really isn't much innovation in the model architecture itself (though some recent model designs do contain new ideas). Its success is actually an integrated innovation at the system level: data plus engineering plus product.
The integrated innovation spans several levels. At the data level, OpenAI has done enormous data accumulation and data engineering: high-quality manual collection, labeling, and cleaning. At the engineering level it also faced big challenges: in the past a few dozen GPUs might suffice, but at today's model and data scale you need thousands or tens of thousands of cards, which raises serious engineering challenges in parallel algorithms and scheduling. Finally, at the system level, OpenAI has iterated on GPT as a product for the past several years. By contrast, our earlier models were developed more as projects: once a model was trained well and open-sourced, there was no further mechanism for iterative updates.
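The parallel-training challenge mentioned above can be pictured with the core mechanism of data parallelism: each worker computes gradients on its own data shard, the gradients are averaged (the "all-reduce"), and every replica applies the same update. The sketch below is a toy illustration of that mechanism, not a description of OpenAI's actual stack:

```python
def data_parallel_step(params, shards, grad_fn, lr=0.1):
    """One data-parallel SGD step: each 'worker' computes a gradient
    on its own shard, the gradients are averaged (the all-reduce),
    and every replica applies the identical update."""
    grads = [grad_fn(params, shard) for shard in shards]   # per-worker
    avg = [sum(g[i] for g in grads) / len(grads)           # all-reduce
           for i in range(len(params))]
    return [p - lr * g for p, g in zip(params, avg)]

# Toy example: fit y = w*x on two shards of data.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]

def grad_fn(params, shard):
    (w,) = params
    # d/dw of the mean squared error over this shard
    return [sum(2 * (w * x - y) * x for x, y in shard) / len(shard)]

params = [0.0]
for _ in range(50):
    params = data_parallel_step(params, shards, grad_fn, lr=0.02)
assert abs(params[0] - 2.0) < 1e-3  # converges to the true slope
```

At real scale, the all-reduce line is where most of the scheduling and communication engineering lives; frameworks overlap it with computation and partition it across network topologies, which is the kind of challenge the interview alludes to.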
Tencent Technology: As you said, OpenAI as a company iterates its product continuously, while academia mostly works project by project or experiment by experiment. As someone who has worked in both academia and industry, how do you see the differences and connections between OpenAI as a company and academic research?
Huang Minlie: The difference is mainly that OpenAI, as a business, has a very strong algorithms and engineering team; that is the first point. The second is that it has enormous computing resources. In academia we are unlikely to have that much compute, and equally unlikely to have such a large engineering team. So academics now focus on fundamental problems: the hallucinations that today's large models can produce, safety, and exact computation, since models still cannot perform precise calculations reliably.
The engineers and scientists at OpenAI are actually excellent academically, and they also have first-rate algorithmic and engineering skills, which is why they can do this exceptionally well. I believe real AGI in the future must be the product of close cooperation between top academic institutions and industry.
Tencent Technology: Some time ago a Google engineer published an internal memo arguing that large language models may have no moat: neither OpenAI nor Google has one, and others will likely catch up soon. Do you agree with that claim?
Huang Minlie: I think that is a misunderstanding, to a degree. From Google's standpoint, if it really set out to do this seriously, catching up with OpenAI should not be too difficult, because it has the compute, the data team, and the talent. But when other companies say surpassing OpenAI is easy, that feels like armchair strategizing. It's like the atomic bomb: the principle looks simple, but actually building one is not.
Computing power, money, talent, and data all take time to accumulate, including for the domestic companies now claiming to build China's OpenAI. Everyone is catching up, and reaching 80 or 90 points is already impressive. Meanwhile OpenAI keeps iterating and progressing. So this is genuinely complex; it is a system-level problem. You cannot conclude there is no moat simply because there is no innovation in model structure. The moat is a matter of comprehensive strength: not just architectural or algorithmic innovation, but compute, capital, data, and the whole engineering stack together form the barrier.
Tencent Technology: Do you think companies like OpenAI or Google have already established a moat?
Huang Minlie: There is no doubt that OpenAI has its own moat, and it is not easy for others to catch up.
For example, the details of GPT-4 have not been released, yet its multimodal capabilities are clearly very strong. OpenAI is also continually using the data flywheel to keep improving. In China we have some companies in the leading tier as well, but how things develop, and who ultimately wins, depends first on overall positioning and second on sustained investment in this area: how long you can persist. That is basically the logic.
Tencent Technology: Everyone is now using essentially the same model, and recent research can reduce the overall cost of training. If Chinese companies want to break through and catch up with OpenAI, are there alternative paths to choose, rather than following exactly the same route?
Huang Minlie: That's a very good question. Everyone is crowding onto the large-language-model track right now, but I don't think the future of AGI excludes other routes. Many people also question whether a large language model like ChatGPT can actually create anything genuinely new. So new routes may well emerge in the future; people just cannot see the specific direction yet. For now we find that the large-language-model path may be closer to AGI, or at least easier to implement. To be honest, the other paths face real obstacles. Symbolism, for example, involves a great deal of symbol-based operations, and how to scale that up in engineering terms is one of the most concrete difficulties.
Today's large language models can not only be made very large and trained on vast amounts of data, but are also highly capable, and that is the glimmer of hope we see at the moment. In the future, though, I think there will surely be something more; it is quite possible that, with the large language model as a framework, other approaches, such as the symbolic school, will be fitted into it.
Tencent Technology: Lu Qi previously argued that we should not go back to knowledge graphs. Do you agree with that view?
Huang Minlie: I don't know the context of his remark. As far as I know, using a large model as a knowledge base for Q&A still falls well short of traditional methods on benchmark datasets; people have done such research. And when today's GPT tackles mathematical calculations, it often simply gets the answers wrong. Math problems are exact reasoning: you cannot say 1+1=3. A statement like 1+1=2 is either true (1) or false (0); there are only the probabilities 0 and 1, nothing in between. So symbolic reasoning remains very important in many cases.
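The point about exact reasoning can be illustrated by delegating arithmetic to a symbolic evaluator instead of asking the model to predict answer tokens. The following is a minimal sketch, not a description of any system mentioned in the interview; the `exact_eval` helper is hypothetical:

```python
from fractions import Fraction
import re

def exact_eval(expression: str) -> Fraction:
    """Evaluate an arithmetic expression exactly with rational numbers.

    Unlike a language model, which predicts answer tokens
    probabilistically, this evaluator is symbolic: the result is
    either right or wrong, never 'probably right'.
    """
    # Allow only digits, whitespace, and basic arithmetic operators.
    if not re.fullmatch(r"[\d\s+\-*/().]+", expression):
        raise ValueError(f"unsupported expression: {expression!r}")
    # Wrap every integer literal in Fraction() so division stays exact.
    wrapped = re.sub(r"\d+", lambda m: f"Fraction({m.group(0)})", expression)
    return eval(wrapped, {"Fraction": Fraction})  # input sanitized above

print(exact_eval("1 + 1"))      # 2
print(exact_eval("(1/3) * 3"))  # 1, with no floating-point drift
```

Hybrid systems follow this pattern at larger scale: the language model decides *that* a calculation is needed and hands the expression to an exact engine, rather than generating the digits itself.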
Tencent Technology: Sam Altman has also said in interviews that if the general-purpose model develops quickly enough, it will handle many tasks very well. Does it still make sense for us to develop vertical models?
Huang Minlie: It definitely makes a lot of sense. Building industry models and domain models on top of a foundation model is genuinely necessary. A general intelligence model does not actually solve the problem of final delivery. When you go into an industry or a field, you must solve its real needs and pain points, and that involves a great deal of industry knowledge and rules.
In the process of bringing the large language model down into a domain, the domain- and industry-specific training and optimization methods, including how to inject industry knowledge and rules into the model, are what allow it to genuinely produce value and play a role in real business.
For example, in healthcare there are cases where you can simply never be wrong, and that requires additional algorithms and modular processing. In psychological counseling, one scenario we face is using a large model to talk with a depressed user who is at high risk of suicide. He may break down mid-conversation and say he wants to find a rooftop to jump from. At that moment you must detect his state immediately and intervene, for instance by handing the conversation over to a human. One thing we do is build a very strong classifier for suicidal tendency: the moment such a tendency is detected, the human-machine dialogue is terminated immediately.
In addition, financial scenarios require dynamic, real-time information. We are now working with CICC and Ant Group on applying large models in financial scenarios, where the breakthrough lies in handling dynamic, real-time data.
Another requirement is that the model cannot talk nonsense, and it must stay compliant when recommending stocks or funds. Compliance of this kind cannot be achieved through a simple data-driven approach to the model alone.
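One common way to impose such compliance, sketched here as an assumption rather than a description of the actual CICC or Ant systems, is a rule layer that checks the model's draft output against hard regulatory constraints before anything reaches the user:

```python
import re

# Hypothetical compliance rules: each is a (pattern, reason) pair.
# Real rules would come from regulators and compliance officers,
# not from the model's training data.
FORBIDDEN = [
    (re.compile(r"guaranteed (return|profit)", re.I),
     "may not promise guaranteed returns"),
    (re.compile(r"risk[- ]free", re.I),
     "may not describe investments as risk-free"),
]

def compliance_check(model_output: str) -> list[str]:
    """Return the list of rule violations in a model's draft answer."""
    return [reason for pat, reason in FORBIDDEN if pat.search(model_output)]

draft = "This fund offers a guaranteed return and is basically risk-free."
violations = compliance_check(draft)
# Both rules fire, so the draft is blocked before reaching the user.
assert len(violations) == 2
```

This is the sense in which compliance cannot be purely data-driven: the rules are explicit, auditable constraints layered on top of the model, not behaviors the model is merely trained to approximate.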
Tencent Technology: Which areas do you think are most likely to be transformed first by AI?
Huang Minlie: Probably the most visible are those related to writing, such as efficiency gains in writing code, and fields like marketing and digital marketing: I can draft marketing copy, feed in a lot of material, and use AIGC to produce the output. Education is also a big scenario; for example, AI-assisted teachers can now guide children to think better and to better understand the plot and values in a story.
Games are another big scenario. Then there are fields that are relatively harder, such as healthcare and finance. Because they involve a lot of dynamic, real-time information and knowledge bases, we need better ways to handle these domain- and business-specific matters. Progress there may be a bit slower, but the potential role and value are large.
Tencent Technology: What are the most important capabilities companies need in order to develop vertical-domain models?
Huang Minlie: First, foundational capabilities, such as fine-tuning pre-trained models and reinforcement learning, are still very important.
Second, you must understand the industry and the domain. Industry knowledge means knowing when and how to embed that knowledge and those rules into the model. It is not just a matter of taking data and training on it; it has to be combined with the underlying algorithms and models.
But there are many details that are not as easy as they look. It is not as simple as taking the data, training for a bit, and getting a decent result, say 80 points, when delivering on the customer's actual demand may require 95. How do you close those 15 points? That is where industry experts need to be involved.