
Some fleeting thoughts.

English translated by Claude
In my first year of high school, I discussed a question with a friend: our understanding of the world, or what the world looks like in our eyes, actually originates in the way we observe it, that is, in the "process" through which we come into contact with the objective world.
At the time, I proposed a model that can be summarized as follows. Assume there is an objective world, and assume there is a self. I observe some portion of the phenomena of this objective world, and from the phenomena I ultimately observe, I construct a world in my mind: my subjective understanding of the objective world. We can then doubt each element in turn. In this structure, both the objective world and the self are actually "eliminable"; the only thing that determines the whole structure is the process by which the potential objective world enters the potential self. To explore this process is therefore to explore our own existence.
Of course, I no longer agree with many of the views I held then, but I still believe this "process" matters a great deal. Now that pre-trained language models have become prevalent, the idea keeps coming back to me.
Let us first consider what constitutes intelligence. A concrete intelligence requires, at minimum, an entity situated in an environment and able to interact with it: the entity must be able to obtain specific information from the environment and respond to that information in some way. This description is phenomenological: we are not concerned with how the intelligence processes information internally, or with whether it is conscious, but only with the behavior it concretely exhibits. This is a very broad definition. Many people might hold that we must at least specify which kinds of "response" count as intelligence, but let us set those details aside for now.
We usually interact with a pre-trained language model through dialogue, and all of the elements above can be identified there. There is a current runtime environment, constituted by the user's dialogue; the model obtains information from that environment, namely the user's input; and, after some internal processing, it responds by outputting text. From this perspective, the model is an intelligence that "lives" in the flow of the chat. In other words, this intelligence exists within a process, not in the user and not in the model parameters. For the model's intelligence, therefore, process is existence.
In this example the process is especially easy to see because it takes a special form: language. Language, within a dialogue, is itself already a process, so the model's form of existence is almost entirely "constituted" by process. The process determines the form in which information is input, and therefore which forms of environment can be processed; it also determines the ways in which information can be processed, and therefore frames what internal processing is possible. The process becomes the core of the intelligence's existence.
This observation immediately reveals a limitation of pre-trained language models: a misalignment between information and process. The training corpus consists of a large amount of text, but the environment (context) in which that text was produced did not arise from the dialogue process described above. It arose from another kind of process, namely the many processes of human existence. To make matters slightly worse, large language models are designed to serve as tools, and existing as a tool is yet another process. As a result, when the model is used, its output is necessarily caught between three different processes that interfere with one another, fall out of alignment, and pull it apart. One major consequence is hallucination: many hallucinations are structural problems of pre-trained language models. An obvious example is a model whose words claim it is doing something that it cannot actually do, which produces a hallucination about its own runtime process: the model "performs", at the level of language, a process it has not actually executed. That process really exists in the environment that produced the training data (the processes of human existence) but cannot directly exist in the model's runtime process, and yet the model is forced to output it by its role as a tool. The performed process is therefore valid in none of the three.
To avoid this misalignment, one option is to make the process of the runtime environment consistent with the process of the training environment. In the best case, intelligence would then emerge from the runtime environment itself. Such an intelligence might be hard to control and might not be usable as a tool, which would make it impractical in human society, but those are practical concerns.
That said, this conclusion itself depends on our own process of observing the world: we can perceive intelligence only through process. In a system without dynamics, we have no way of knowing whether an entity is intelligent; only when it runs, and we observe it operating, can we judge whether intelligence is present. This suggests that our perception of existence depends on the processes in which we ourselves are situated. Any intelligence we can cognize necessarily depends on process; anything else is beyond what we can discuss. In a non-dynamic system, that is, we can never perceive whether intelligence exists, only whether the potential for intelligence does.
This naturally raises a question about potential: is a potential process a kind of existence? Does a mathematical proof exist only once I have written it down? The answer is already implicit in the model example above: we regard the model as existing in process because we have no experience of its parameters; that experience does not exist within our process. By the same reasoning, mathematics is an abstraction from human experience, so it exists on the basis of the process by which humans observe the world. The "potential existence" of mathematics therefore itself exists within process.
Cover image from Matthew Stephenson
AZhe