Peter Howitt
“...As kingfishers catch fire, dragonflies draw flame;
As tumbled over rim in roundy wells
Stones ring; like each tucked string tells, each hung bell's
Bow swung finds tongue to fling out broad its name;
Each mortal thing does one thing and the same:
Deals out that being indoors each one dwells;
Selves — goes itself; myself it speaks and spells,
Crying Whát I dó is me: for that I came.”
(Gerard Manley Hopkins, “As Kingfishers Catch Fire”, 1877)
[Like a modern-day Hephaestus forging life through rhythmic ringing blows, Hopkins beats life into each line. Poetry as incantation.]
Words can be spells.
Language is perhaps the most wondrous and dangerous of our discoveries and technologies, a magic that must be respected and used very carefully. This magic is a central focus of our creation myths, including the ancient Egyptian creator god Ptah’s divine utterance (the Memphite Theology), Genesis (God said “Let there be light”, and there was light) and the Greek concept of Logos, in which speech and language are generative (logos spermatikos) and give us universal reason and order. The magical or divine breath of language (logos) is the force or father of reason (logic).
Words are more or less useful tools and language can give us an illusion of representation of some aspect of reality. Perhaps the better the linguistic performance, the closer the approximation, but we should not mistake this game for the deeper ‘unspeakable’ reality – to the extent we do make that mistake we move further away from approximating the ineffable.
As Bruce Lee noted, “It's like a finger pointing away to the moon. Don't concentrate on the finger or you will miss all that heavenly glory.”
“Words bend our thinking to infinite paths of self-delusion, and the fact that we spend most of our mental lives in brain mansions built of words means that we lack the objectivity necessary to see the terrible distortion of reality which language brings.”
(Dan Simmons, Hyperion, 1989.)
Our use of language must start with our understanding of its risks of delusion and confusion – appreciation of its irreducible limitations. Language use is a game of context, and a wonderful one when used well, but our problems with language and our wider issues with beliefs about the reality of abstract concepts can descend into a collective mental illness. Logic also has some irreducible incompleteness that shows that truth cannot be determined entirely within a self-referential framework. Recursive logic is a language that can be useful, but that does not mean it is necessarily truthful.
In this Part, we will consider some wider philosophical issues with language and logic.
“Without philosophy thoughts are ... cloudy and indistinct: its task is to make them clear and to give them sharp boundaries.”
Like DNA, languages carry the remnants of our evolution. Words and language are extraordinary compressions of history and infectious carriers of culture. For example, every time I say goodbye (‘God be with ye’), I wish that God is with you (even if not consciously, and despite the fact that I am not religious).
Words can be beautiful, lyrical, assonant or dissonant. They might sound like the sound they are referring to (that is, onomatopoeia – though the word ‘onomatopoeia’ itself does not!).
“Philosophy is a battle against the bewitchment of our intelligence by means of language. The philosopher's treatment of a question is like the treatment of an illness.”
For such little things, words do a tremendous amount of work. These superhero sherpas can carry whole worlds on their backs.
“Could mortal lip divine / The undeveloped Freight / Of a delivered syllable / ’Twould crumble with the weight.”
(Emily Dickinson, 1894.)
Each word is a sign and a seed, ready to grow into a many-branched tree. Their strength is in how much we can compact into them and how gregarious they are, flocking together in great murmurations. Words, when working well together, are a wonderful example of algorithmic efficiency.
Ludwig Wittgenstein can be a challenging philosopher to read. He started his philosophical career in the analytic school, which focused on the philosophy of language and logic. His view of the meaning of language changed radically throughout his life.
His earlier position can be summarised as follows:
Language is a picture of reality.
The meaning of a proposition is only related to its truth conditions.
Only propositions that can be verified or falsified have meaning.
Propositions that cannot be verified or falsified are therefore meaningless.
"Whereof one cannot speak, thereof one must be silent."
To get an understanding of some of his earlier views, consider the following sentence:
The giraffe is lying on the couch.
This proposition has meaning because it can be verified or falsified. We can verify it by looking at the giraffe and seeing whether it is lying on the couch. The meaning of each term is well understood (albeit note the alternative interpretation of the sentence, on which the giraffe is telling falsehoods from the couch). Now consider the following:
God is love.
This statement cannot be verified or falsified. The statement does not refer to something that we can find in the world and point to or illustrate. In Wittgenstein’s earlier view, this sentence literally has no sensible meaning.
Thankfully, Wittgenstein relented and repented in his later years and saw sense. The analytic tradition was unhealthy for many reasons, including its overly technical obsession with language and its attempts to make all uses of language equivalent to logical statements, axioms or propositions.
Wittgenstein argued in the Tractatus that language is a way of representing reality. When we speak or write, we are creating a picture of the world, and the meaning of a picture is determined by what it depicts. Similarly, the meaning of a proposition is determined by what it represents. He also sought to equate our everyday use of language to logical statements that could be verifiable, falsifiable or indeterminate.
“Language disguises thought. So much so, that from the outward form of the clothing it is impossible to infer the form of the thought beneath it, because the outward form of the clothing is not designed to reveal the form of the body, but for entirely different purposes. The tacit conventions on which the understanding of everyday language depends are enormously complicated”.
Wittgenstein’s later understanding of language was much more realistic and creative:
Language is a tool that we use to play games
There are many kinds of language games, each with its own rules and conventions
The meaning of a word is only determined by its use in the specific language game being played
Wittgenstein states that language is a tool that we use to play games with each other and compares language to a game like chess. Words have no inherent meaning; the context gives words meaning (this is similar to the Buddhist doctrine of Śūnyatā and the concept of affordance that I will pick up again in Part V).
The giraffe is lying on the couch.
Wittgenstein now believes that there is no independent truth proposition in this sentence. ‘Giraffe’ (and indeed every other word) has no meaning other than how it is used in the language games we play with each other.
It is not necessary for ‘giraffe’ to refer to anything that exists in the physical world and, depending on the discussion (the game being played), it could be used as a shorthand for something abstract, comical, artistic, etc. That is, it may not refer to an animal at all. It all depends on the players of the language game, the type of language game being played and the context of the word within that game.
God is love.
Likewise ‘God is Love’ is intelligible and has meaning within some language games. For example, the players could understand it to be referring to an abstract entity or power in the universe that has good will towards living things (or at least humans!). They may mean or understand much more (a particular type of god, a particular kind of love) depending on the context and the freight they perceive those words to carry. Alternatively, the players may use the phrase in a parodic or satirical way. It could even be used as an unconventional shorthand within a smaller community of speakers, e.g., for their mutual dislike of a brutal religious dictatorship that forces public utterance of such statements on its people whilst practising hate and violence in the name of that same god.
Wittgenstein's philosophical shift on language can be summarised as follows:
Early Views (Tractatus Logico-Philosophicus):
Picture Theory: Language mirrors reality, with propositions corresponding to facts in the world.
Truth Conditions: Meaning is determined by verifiability or falsifiability; statements lacking this are meaningless.
Logical Atomism: Complex propositions break down into simpler, verifiable elements.
Later Views (Philosophical Investigations):
Language Games: Language is a tool for various social activities, each with its own rules and context.
Meaning as Use: Words gain meaning through their function in specific language games, not inherent correspondence to reality.
Anti-Essentialism: There's no fixed, universal essence to language or meaning; it's fluid and context-dependent.
In essence, Wittgenstein moved from a rigid, logical view of language as mirroring reality to a more flexible, pragmatic view of language as a tool for social interaction and meaning-making. The later Wittgenstein teaches that a language game is any communication between two or more parties whereby the meaning of the language is determined by the rules of the game agreed by the parties playing. This superposition of meaning even allows words sometimes to carry two opposing meanings depending on the context. The meanings of words and phrases within any language are often somewhat ambiguous, and they evolve over time. Wittgenstein's focus on language games reminds us that words have only a relative, variable meaning that can be ascertained contextually. This also makes language particularly challenging for computer programming and AI, as the sketch below illustrates.
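To make that challenge concrete, consider a contronym: a single word that carries two opposing senses. The toy Python sketch below is purely illustrative (the sense table and the ‘context cues’ are invented for the example): a context-free lookup cannot choose a meaning at all, and even the contextual version is only a crude heuristic standing in for the real language game.

```python
# Toy illustration: the English contronym "sanction" means both
# "approve" and "punish"; only the language game selects the sense.
SENSES = {
    "sanction": {
        "approve": "to give official permission for something",
        "punish": "to impose a penalty on someone",
    }
}

def naive_meaning(word):
    # A context-free lookup returns both opposing senses at once.
    return SENSES[word]

def contextual_meaning(word, sentence):
    # Invented, crude context cues; real disambiguation needs far more.
    if any(cue in sentence for cue in ("penalty", "against", "regime")):
        return SENSES[word]["punish"]
    return SENSES[word]["approve"]

print(naive_meaning("sanction"))  # ambiguous: both senses at once
print(contextual_meaning("sanction", "The UN voted to sanction the regime"))
print(contextual_meaning("sanction", "The board will sanction the new procedure"))
```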
“Human language is built on a foundation of symbolic reference, and the cognitive resources required for symbolic thought are unparalleled in other species.”
(Terrence W. Deacon, 'The Symbolic Species: The Co-evolution of Language and the Brain', 1997.)
Wittgenstein's insights into the shared, relative meaning of language have been profoundly influential and can be seen in seminal works such as Terrence Deacon's, which argues that language is not just a complex form of communication but a symbolic system: words do not directly correlate to objects or actions but stand for them through shared conventions.
Deacon’s work provides a rich framework for understanding the symbolic complexity of human language. This stands in stark contrast to how, for example, AI systems currently handle language generation. While AI can produce impressive simulations of language, to date it lacks the cognitive architecture that gave rise to human symbolic thought. That said, evolutionary pressures on AI systems could give rise to deeper symbolic dexterity.
“I PROPOSE to consider the question, ‘Can machines think?’ This should begin with definitions of the meaning of the terms ‘machine’ and ‘think’.”
(Alan M. Turing, 'Computing Machinery and Intelligence', Mind, 1950.)
Alan Turing went on to state “The definitions might be framed so as to reflect so far as possible the normal use of the words, but this attitude is dangerous. If the meaning of the words ‘machine’ and ‘think’ are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, ‘Can machines think?’ is to be sought in a statistical survey such as a Gallup poll. But this is absurd.”
Language did not arrive in humans fully formed; it is a technology that evolved from simpler beginnings and became more complex as it increased the competitive advantage of its users.
The Turing Test can be simplified and reformulated as follows:
A human interrogator is placed in a room separate from two other participants: a human and a machine.
The interrogator can communicate with the other two participants through a text-based interface (it is a ‘behind the veil’ test).
The interrogator's goal is to determine which of the other two participants is the human and which is the machine.
The machine's goal is to convince the interrogator that it is the human.
The test is conducted over a period of time, and the interrogator is allowed to ask any questions they want.
Turing believed that by the year 2000 it would be possible to program a machine so that an average interrogator would have no more than a 70% chance of making the right identification after five minutes of questioning.
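The protocol is simple enough to sketch in code. Below is a minimal, hypothetical skeleton of a single session, assuming the callables `ask`, `guess`, `human_reply` and `machine_reply` are supplied from outside (none of these names come from Turing's paper):

```python
import random

def run_imitation_game(ask, guess, human_reply, machine_reply, rounds=5):
    """One text-only session of the imitation game. The interrogator's
    ask() and guess() functions see only the labels 'A' and 'B'."""
    players = {"A": human_reply, "B": machine_reply}
    if random.random() < 0.5:          # hide who is behind which label
        players = {"A": machine_reply, "B": human_reply}

    transcript = []
    for _ in range(rounds):
        question = ask(transcript)     # the interrogator may ask anything
        for label in ("A", "B"):
            transcript.append((label, question, players[label](question)))

    # The interrogator must now say which label hides the machine.
    truth = "A" if players["A"] is machine_reply else "B"
    return guess(transcript) == truth
```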
“The original question, ‘Can machines think?’ I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.”
Turing argued that the imitation game is a good way to measure machine intelligence because it is based on the ability of a machine to carry on a natural language conversation with a human. He also argued that the test is objective because it does not rely on any subjective criteria, such as what the interrogator thinks about the participant’s personality or appearance. He gave short shrift to theological objections about man’s immortal soul, quite rightly asking why, if that were the case, elephants (and indeed any animate life form) should not also have immortal souls. Turing then turned his thoughts to the ‘head in the sand’ objection:
“We like to believe that Man is in some subtle way superior to the rest of creation. It is best if he can be shown to be necessarily superior, for then there is no danger of him losing his commanding position. The popularity of the theological argument is clearly connected with this feeling. It is likely to be quite strong in intellectual people, since they value the power of thinking more highly than others, and are more inclined to base their belief in the superiority of Man on this power. I do not think that this argument is sufficiently substantial to require refutation. Consolation would be more appropriate…”
Turing also discussed Gödel’s theorem – and the impact of there being undecidable logical statements – on the ability of a machine to respond to certain questions appropriately or at all (i.e. without going into a potentially infinite loop, as per the ‘halting problem’ that we will consider later in the ‘Limits of Logic’).
“The short answer to this argument is that although it is established that there are limitations to the powers of any particular machine, it has only been stated, without any sort of proof, that no such limitations apply to the human intellect… I do not think too much importance should be attached to it. We too often give wrong answers to questions ourselves to be justified in being very pleased at such evidence of fallibility on the part of the machines…In short, then, there might be men cleverer than any given machine, but then again there might be other machines cleverer again, and so on.”
Turing points out (quoting from Professor Jefferson’s 'Lister Oration for 1949') that another main objection is the argument from consciousness:
“Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain—that is, not only write it but know that it had written it. No mechanism could feel (and not merely artificially signal, an easy contrivance) pleasure at its successes, grief when its valves fuse, be warmed by flattery, be made miserable by its mistakes, be charmed by sex, be angry or depressed when it cannot get what it wants.”
As Turing states, this objection ultimately leads to solipsism and extreme prejudice, since the only way to know if and how a machine, or indeed another human, thinks is to be that machine or human. Such an objection must be abandoned because, whilst it may still be a logically valid view (in some respects), it makes communication and dialogue all but impossible and so defeats the purpose of the enterprise – which is to reason as to whether a computer program can play the imitation game effectively.
To take up Turing’s point on solipsism, as Turing noted, his test works just as well for humans. The assumption in the Turing test (which I will use as it is the modern name for the imitation game) is that, at a sufficiently fine-grained level of certainty, there is a point at which we cannot know whether another agent is human or not, and all we have to determine the question is evidence based on the agent’s actions. This is the same as the inability of humans to know what is happening in other humans’ minds (the wider problem of ‘other minds’).
Our knowledge of other human minds is inferred indirectly from their behaviour, our assumptions and our understanding of how we ourselves function. This problem becomes most apparent with humans that have sociopathic tendencies which allow them to simulate normality (think of the charming Jeffrey Dahmer) whilst secretly torturing or murdering other people for their own pleasure (though often with limited self-control). Such behaviour is usually justified by, or gives rise to, a lack of belief that other people are meaningful agents themselves, i.e. that other people are just objects.
In his paper, Turing pre-empted some other objections, such as the claim that the machine is just a parrot repeating its inputs (readers will note that the parrot line keeps rearing its ugly beak) or that it cannot do something unusual or surprising. All these arguments are defeated by experience and have little theoretical power. Interestingly, Turing also considered, as a near-future possibility, that a machine and program might observe its own operations and outputs and learn from them to modify its operations to better achieve a targeted objective.
“...it is not altogether unreasonable to describe digital computers as brains… If it is accepted that real brains, as found in animals, and in particular in men, are a sort of machine it will follow that our digital computer, suitably programmed, will behave like a brain.”
Turing's non-discriminatory approach – which seeks to avoid unjustified human exceptionalism – remains valid today as a starting point for assessing the intelligence of computer programs. Its non-discriminatory principle also has much wider relevance for assessing the intelligence of different species and life forms.
"Imagine a native English speaker who knows no Chinese locked in a room full of boxes of Chinese symbols (a database) together with a book of instructions for manipulating the symbols (the program). Imagine that people outside the room send in other Chinese symbols which, unknown to the person in the room, are questions in Chinese (the input). And imagine that by following the instructions in the program the man in the room is able to pass out Chinese symbols which are correct answers to the questions (the output)."
(John Searle, 'Minds, Brains, and Programs', Behavioral and Brain Sciences, 1980.)
Searle wished to counter Turing and his test for machine intelligence. In Searle’s view, his Chinese Room Argument (CRA) shows that mere proficiency at a language game is not equal to understanding: “The program enables the person in the room to pass the Turing Test for understanding Chinese but he does not understand a word of Chinese”.
He states that if a man does not understand Chinese despite using such tools, then neither does a computer “because no computer, qua computer, has anything the man does not have.” With the CRA, Searle hopes to have demonstrated the following:
Simulation of understanding something is not equivalent to understanding something
Intentionality in human beings (and animals) is a product of causal features of the brain
Algorithmic operations and processing using programs are not sufficient in themselves to create an agent with intentionality
Algorithm: a process involving the use of discrete steps or rules to solve problems in a manner that can be replicated without deviation in the output. In simpler terms, algorithms can be considered recipes which, if followed, will result in the same output each time (e.g. a well-baked cake).
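On this definition, a few lines of code are a recipe in the strict sense: the same inputs always produce the same output. Euclid's greatest-common-divisor algorithm is a standard illustration (my example, not the text's):

```python
def gcd(a, b):
    """Euclid's algorithm: a finite sequence of discrete, rule-bound steps."""
    while b != 0:
        a, b = b, a % b  # replace (a, b) with (b, a mod b)
    return a

# Replication without deviation: every run on the same inputs
# 'bakes the same cake'.
assert gcd(48, 18) == 6
assert gcd(48, 18) == gcd(48, 18)
```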
“strong AI has little to tell us about thinking, since it is not about machines but about programs, and no program by itself is sufficient for thinking.”
Searle further states that, if he is the person in the room who responds in Chinese (using an algorithmic process) to questions and statements written in Chinese, then it proves that the Turing Test is inadequate – since his responses would lead an observer outside the room to conclude that he knows Chinese when Searle knows that he does not. His wider claim is that strong AI cannot exist because computer programs are not like human minds, and any formal syntactic skill (parsing the structure and ordering of words and skill with grammar) cannot of itself equate to semantic reasoning (understanding and communication with meaning). It allows for a form of weak AI to exist, since this is simply the ability to simulate understanding (in the eyes and ears of an observer) without any real understanding.
“Computation is defined purely formally or syntactically, whereas minds have actual mental or semantic contents, and we cannot get from syntactical to the semantic just by having the syntactical operations and nothing else….A system, me, for example, would not acquire an understanding of Chinese just by going through the steps of a computer program that simulated the behavior of a Chinese speaker”
(Searle, J.R. 'Why dualism (and materialism) fail to account for consciousness', 2010)
First, let's give Searle some credit. The fact that his argument has annoyed so many philosophers for decades must be considered a good sign. If it were so easy to refute or dismiss, we would not be discussing it here 43 years later. Also, I think the CRA is helpful in considering what might be needed to get closer to artificial general intelligence. There have been many objections over the years to his CRA, including the systems reply (the room understands Chinese), the robot reply, the brain simulator reply, the other minds reply, and the many mansions reply. Searle does a good job of defending against these counter-arguments.
To simplify, in the CRA, the person using the tools does not have any understanding of what the logograms mean. This is similar to my understanding of the following translation of the previous sentence:
在他的中文房間爭論中,出發點是使用工具的人對語標的含義沒有任何理解,這類似於我對以下內容的理解
These logograms were produced by Google Translate, which can take any English sentence, parse and decode it into Chinese, and return the output. Searle’s argument is that neither Google Translate (nor myself using it) can thereby be said to understand Chinese. I can agree as regards my understanding. The issue of what, if anything, is ‘understanding’ Chinese within Google Translate is obviously part of the issue raised by Searle. Searle’s CRA hints at a dualistic position: if there is no “ghost in the machine” there can be no real understanding.
“Such intentionality as computers appear to have is solely in the minds of those who program them and those who use them, those who send in the input and those who interpret the output”
An obvious objection to Searle's thought experiment is that it is not empirical – after all, Turing created a thought experiment for digital intelligence that could be tested, whereas Searle's thought experiment is not really testable as constructed. However, this is only a very limited objection, since thought experiments are very useful even if not immediately testable (Einstein famously used them). Indeed, Turing himself also used a thought experiment of a magical computer that you could imagine but that logically could not exist, in order to prove a conjecture by contradiction (see the halt program in the Limits of Logic later in this paper).
Now let's deal with some stronger objections to the CRA:
Searle does not create a setup that would enable an observer to try to test whether an agent in one room really understands Chinese and one in another room does not, so his argument really has little to do with Turing’s test. His argument is against an entirely algorithmic or computational view of ‘understanding’ which he somehow equates to intentionality.
The Turing test does not test for intention or understanding nor does Turing equate passing the test with having either of those qualities or functionalities.
Searle’s CRA merely asserts a conceivable truth condition – there is an agent that does not understand Chinese whilst the interactions involved in manipulating Chinese symbols could lead an observer to believe they did understand Chinese. That said, I think Searle is correct that we must agree that this is possible, whether the agent uses a book or a program and whether the agent is a human or a computer.
Not only does the CRA not test for understanding, but Searle does not successfully define what understanding Chinese (or any language) means. He equates it variously with having “intentionality”, “internal causal powers equivalent to those of brains”, “mental states” and “its ability to produce intentional states”. It is true that Turing also did not define intelligence. However, he gave us a test for it that seems intuitively useful given the complexity of human behaviour and contextual use of language (success in the language game being much more than just the amount of horsepower available for complex calculations).
Unlike Turing – who knew a thing or two about computers and code – Searle created a theoretical operational system (the human with the instruction manual and Chinese symbols) in a room that could not actually pass Turing’s test. Searle does not create a test for mental states or intentionality, which is the very thing he assumes computer programs do not have.
Let's delve a little deeper into some of these points.
The way Searle set up the CRA thought experiment does not really deal with the subtlety of the Turing test. Remember, the Turing test is not a test of whether some hidden agent ‘understands’ English or Chinese (whatever ‘understands' means) but whether a computer could pass as a human using the language game (or whether the question is undecidable – and therefore a win for the computer). The contention is that this is functionally equivalent to intelligence and so observable and non-discriminatory. The Turing Test requires that we do not know whether we are interacting with a program or a person (since the responder is hidden), though if we consider an expertly crafted humanoid robot, the responder can be in plain sight and the Turing Test works just as well, since it continues to pose the apposite question: how would we know which agent is digital and which is human based solely on a language game?
The deep question that is intentionally avoided by Turing is: what is intelligence and what is thinking? He avoided it precisely because it risks being indeterminate or metaphysical (given that we do not really understand what it is in humans) and therefore not capable of sensible discussion and verification. The deep question raised by Searle is: what is understanding? Unfortunately, Searle neither defines it properly nor creates a test to verify it.
Searle’s CRA boils down to a rejection in principle of the idea that syntactic or even semantic manipulation skill is equivalent to a semantic understanding of language. He is right to point out that the process of answering something cannot be the reason or motive for answering something. Likewise, a cooking recipe is not the cake, nor is it the reason for baking or eating the cake. Searle suggests that this may have been what some people were asserting with descriptions of strong AI. However, Turing was not seriously suggesting that an algorithm is by itself both the process to reach a conclusion and the consciousness, desire or agency to do the same.
Searle’s argument is deceptively clever; he created a thought experiment of human ignorance to prove his argument against digital intelligence. However, the CRA is also a form of circular reasoning where its premise assumes the truth of its conclusion rather than setting up an experiment to prove the argument or conclusion. The circularity arises from how Searle frames the concept of "understanding." He presupposes that true understanding requires something intrinsic to the human mind (like consciousness or intentionality) that a computer program inherently lacks. This is the conclusion he wants to reach. However, in his thought experiment, he defines understanding as something the person in the room doesn't have, simply because they are following a set of rules. This becomes the premise of his argument.
In essence, the Searle argument goes as follows:
The person in the room doesn't understand Chinese (premise 1).
A computer program is like the person in the room (premise 2).
Therefore, a computer program cannot understand Chinese (conclusion).
The problem is that the first premise already assumes the conclusion to be true (and so the argument is a form of logical fallacy). Searle hasn't independently established what "understanding" means or how to measure it; he has simply defined it in a way that excludes the possibility of a computer program ever achieving it.
To avoid circularity, Searle would need to:
Provide a clear, objective definition of "understanding" that is not inherently biased against computer programs.
Propose a way to test for this understanding, independent of the process used to generate responses (whether it's a human with a rule book or a computer program).
Searle creates a straw man that he can then more easily set on fire. Searle might be accused of a sort of intellectual laziness and appeal to prejudice given many of these objections.
“As long as the program is defined in terms of computational operations on purely formally defined elements, what the example suggests is that these by themselves have no interesting connection with understanding.”
In short, Searle argues that any computer program that passes the Turing test does not thereby pass the Searle test. How does a computer program pass the Searle test?
We have no idea, and Searle does little to make us any the wiser. No observations are suggested that would allow us to infer intentionality, mental states or causal powers. His CRA keeps our, as yet poorly understood and undefined, mammalian minds in a black box and gives us no way to test for equivalence in digital minds (even if such things can exist).
“‘Could a machine think?’ On the argument advanced here only a machine could think, and only very special kinds of machines, namely brains and machines with internal causal powers equivalent to those of brains. And that is why strong AI has little to tell us about thinking, since it is not about machines but about programs, and no program by itself is sufficient for thinking.”
We could also pose a few further challenges to Searle’s own definition of intelligence and understanding. For example, Searle does not consider learning as a process that happens over time and which, interestingly, often starts with mimicry. Is he claiming that when Young and Champollion started to uncover the rudimentary structure of the ancient Egyptian hieroglyphic language, they did not understand ancient Egyptian at all, because only a person with real understanding of the meaning of specific hieroglyphs can pass his test of real linguistic understanding? In fact, if I understand his challenge properly, syntactic skill is no test of understanding language at all, since he is not interested in tests of skill with symbols and signs, as that could be synthetic intelligence lacking understanding.
Hieroglyphs are largely a mixture of logograms (which record words or morphemes as elements of the language) and phonograms (which replicate the sounds made when speaking the words the symbols represent), e.g., an animal hieroglyph may be used to represent the sound or first letter vocalised when speaking that animal's name in oral Egyptian. Although sometimes referred to as an ideogrammatic language – where abstract concepts are directly represented through visual images, such as mathematical symbols – the Egyptian hieroglyphic system primarily relied on logograms and phonograms, with ideograms being comparatively rarer.
Young and Champollion first started to understand hieroglyphs using their semantic and, more importantly, their syntactic language skills in other languages. Over time this ability to parse and decode elements of Egyptian hieroglyphs (relying heavily on their knowledge of other languages) led to greater understanding of the now-extinct language. One of the key breakthroughs was the decipherment of the Rosetta Stone, which originally contained the same text in Greek, hieroglyphic and demotic script (although only fragments of each remained in modern times).
It is true that understanding ancient Egyptian culture and the meaning of language in that culture is more than just a syntactic exercise. However, using the Rosetta Stone to parse and decode hieroglyphics was primarily an algorithmic operation, and one that a computer could do with sufficient syntactic skill in other languages and data to work with (see the toy sketch below). Likewise, any child learning an expressed language starts by getting a feel for the grammar (largely subconsciously) and words (largely consciously) used for entities and emotions.
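As a toy analogy (emphatically not the actual decipherment), the sketch below shows how a symbol-to-word mapping can be recovered from a parallel text by pure counting: tally how often each symbol co-occurs with each word across aligned sentences, then greedily pair off the strongest associations. The ‘hieroglyphs’ and sentences are invented for the example.

```python
from collections import Counter, defaultdict

# Invented parallel text: each English sentence aligned with the same
# sentence in a made-up symbol script (a stand-in for the Rosetta
# Stone's Greek/hieroglyphic pairing).
parallel = [
    ("king gives bread",   "𓁐 𓃾 𓆑"),
    ("king gives beer",    "𓁐 𓃾 𓅓"),
    ("priest gives bread", "𓀁 𓃾 𓆑"),
]

# Count symbol-word co-occurrences across aligned sentences.
cooc = defaultdict(Counter)
for english, symbols in parallel:
    for sym in symbols.split():
        for word in english.split():
            cooc[sym][word] += 1

# Greedily pair the strongest associations first, excluding words
# already claimed: a purely mechanical procedure that a computer
# (or a very patient philologist) can run.
pairs = sorted(
    ((n, sym, word) for sym, c in cooc.items() for word, n in c.items()),
    reverse=True,
)
mapping, used = {}, set()
for n, sym, word in pairs:
    if sym not in mapping and word not in used:
        mapping[sym] = word
        used.add(word)

print(mapping)  # e.g. {'𓃾': 'gives', '𓆑': 'bread', '𓁐': 'king', ...}
```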
"Imitate, Assimilate, Innovate"
(Clark Terry, jazz trumpeter and educator)
Neither intelligence nor understanding are discrete binary states or absolute positions. Understanding takes place within a spectrum (to some extent Searle acknowledges this in the CRA paper). Understanding also takes place on different levels, e.g., we can get a feeling from a poem that may well be the feeling the poet hoped to invoke or elicit though we may not know why or be able to explain it. Likewise, we may not understand all of the words used in an essay but can still get the gist of what is being communicated (though less so for academic papers on linguistics!).
Anyone examining the issue would likely agree that for an AI program or a human to understand Chinese requires cognitive abilities that are more than just the sum of available algorithmic processes. Turing would also no doubt agree that understanding and processing are not equivalent.
Intentionality is a fuzzy concept. It is generally suggested to mean doing something (including thinking and playing a language game) with purpose or intent. A more technical definition comes from the German psychologist and philosopher Franz Brentano, who defined it as the property of (internal) mental experiences that refer to external objects or entities. This seems similar to what I surmise is Searle’s working definition. Searle also believes that intentionality is representation, meaning that every intentional state or event has an intentional content that represents its conditions of satisfaction (my abstract thoughts and desire for a baked cheesecake can be satisfied by eating a baked cheesecake). We will explore these issues further in ‘Come Now Mr Searle, That’s Just Semantics!’ below.
The only meaningful tests we have for understanding a language involve observable measurable tests. The CRA is therefore an argument that a computer program could pass all these tests and still fail the understanding/intentionality test. It is very difficult to see how the poor computer program can ever prove its intelligence through language games if intelligence means intentionality. Searle subsequently enjoyed considering and dismissing the idea of functionally equivalent intelligence involving other substrates, such as toilet paper and beer cans.
“John Searle… has gotten a lot of mileage out of the fact that a Turing machine is an abstract machine, and therefore could, in principle, be built out of any materials whatsoever…he pokes merciless fun at the idea that thinking could ever be implemented in a system made of such far-fetched physical substrates as toilet paper and pebbles… or a vast assemblage of beer cans and ping-pong balls bashing together. In his vivid writings, Searle gives the appearance of tossing off these humorous images lightheartedly and spontaneously, but in fact he is carefully and premeditatedly instilling in his readers a profound prejudice, or perhaps merely profiting from a preexistent prejudice.”
(Douglas R. Hofstadter, 'I Am a Strange Loop', 2007.)
To my mind, it is fair to say that Turing was much more careful in his imitation game test than Searle was in his Chinese Room argumentation. Turing quite rightly focused on the limits of what we can know and how we can objectively verify what we know, which is why he sought a non-discriminatory test of intelligence.
After all this wordplay, we are left with great difficulty in knowing whether a human or inhuman AI entity really understands languages unless we are more definite in what we mean when we use terms such as 'understand'. To give Searle some benefit of the doubt, the language game is to some extent predicated on us agreeing that when we use the term ‘think’ we may also imply ‘understand’ (words flock together after all) – by which we mean something like having a general framework to contextualise the meaning of what is communicated and a reason or motive for communicating. To say that this must also imply a new concept of intentionality is however an unjustified step too far.
“‘It is no secret. All power is one in source and end, I think. Years and distances, stars and candles, water and wind and wizardry, the craft in a man’s hand and the wisdom in a tree’s root: they all arise together. My name, and yours, and the true name of the sun, or a spring of water, or an unborn child, all are syllables of the great word that is very slowly spoken by the shining of the stars.”
(Ursula Le Guin, 'A Wizard of Earthsea', 1968.)
"A problem that proponents of AI regularly face is this: When we know how a machine does something 'intelligent,' it ceases to be regarded as intelligent. If I beat the world's chess champion, I'd be regarded as highly bright."
(Fred Reed, 'Promise of AI Not So Bright', Washington Times.)
Before assuming that AI is not capable of intelligence with language (and considering how AI programs deal with the difficulties of making sense of words and how they decode and create meaningful coherent sentences), we first need to take another look at what ‘understanding’ means. Given the relative contextual meaning of any words, let us start with a word map of the more semantically related words to ‘understanding’.
You will note that the concept of ‘intentionality’ – introduced by John Searle in his Chinese Room Argument (CRA) – is not showing as being very closely related to the meaning of understanding. I asked Bard for the approximate semantic delta between the word pair, and it suggested they are quite closely related – somewhere between 0.2 and 0.4 on a scale of 0 to 1, where ‘0’ means semantically synonymous and ‘1’ means diametrically different or maximally dissimilar in meaning. By way of example, Gemini suggests that 'understanding' and 'intentionality' are approximately as closely related as: Knowing vs. Believing; Perception vs. Interpretation; Thought vs. Idea; Memory vs. Imagination; and Consciousness vs. Awareness.
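Semantic deltas of this kind are typically computed as cosine distances between word vectors. Here is a minimal sketch with made-up three-dimensional vectors (real embeddings such as word2vec or a transformer encoder use hundreds of learned dimensions, so the numbers are purely illustrative):

```python
import math

# Toy 3-d vectors, invented for illustration only.
vectors = {
    "understanding":  [0.9, 0.4, 0.1],
    "intentionality": [0.3, 0.8, 0.5],
}

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # For non-negative vectors this falls between 0 (synonymous
    # direction) and 1 (maximally dissimilar), matching the scale above.
    return 1 - dot / (norm_u * norm_v)

delta = cosine_distance(vectors["understanding"], vectors["intentionality"])
print(round(delta, 2))  # ~0.35, inside the 0.2-0.4 band suggested above
```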
The CRA was quite a neat trick by Searle. He started by seeking to dispute the concept of intelligence and functional equivalence used by Turing. To do this he changed the test of intelligence to one of understanding (undefined) and he then equated understanding to the concept of intentionality (which he also leaves semantically amorphous).
Does this word game really take us closer to the heart of the matter?
We are in danger of creating semantic shell games, where each time we find a certain quality or feature of understanding exists then anyone can just move the argument to the other related meanings (and further derivatives) until we either agree that the test of understanding is met or we argue that is not what we actually meant or not what matters anyway. This has been called the ‘AI effect’ and is precisely what Turing sought to avoid.
This problem also highlights that the Buddhist doctrine of dependent origination applies equally to these abstract concepts. Individual terms that we use to explain complex things are inherently empty of meaning. Concepts like ‘intelligence’ and ‘understanding’ are contextual, multivariate and multi-modal aggregates.
It does not assist us, in trying to make sense of (or, if feeling brave, prove or disprove) the existence of an abstract concept, to introduce a slightly different but related abstract concept (as Searle does with his CRA). An argument or appeal against computer or AI ‘intelligence’ relying on other compound concepts (such as understanding or intentionality) does not help us move forward; instead, it sets up an inherently recursive loop. It breeds an indeterminate and defective language game.
Searle suggests in his CRA paper that he is willing to grant a machine the ability to understand a language in principle – though, one suspects, not in practice for digital computational machines. However, if we follow Searle’s suggestion, intentionality likely requires identity, and so knowledge of other things that do not share the same identity. Under this definition, adult and young animals, like children, have intentionality, but perhaps not the very young or some animals such as certain species of insects (an individual insect might not be able to evidence intentionality given its swarm-like intelligence and collective identity, though the group might have collective intentionality). This likely brings the need for evidence of acting with self-awareness, desire and so on.
Searle’s argument therefore leads us ever further away from the question and test put forward by Turing, but like an errant prophet, he does not deliver us to the promised land of understanding.
Understanding a language requires comprehension and not just linguistic competence. It encompasses cultural awareness, emotional intelligence, and the ability to use language creatively and authentically. Any test must therefore focus on deep and nuanced understandings of specific languages within their cultural contexts. Language mastery cannot be divorced from understanding specific histories of peoples and their unique cultures.
If LLMs can solve new language problems and play new language games that go beyond their training data – which are themselves governed by partly unsupervised learning – and which would persuade a language expert that they are fluent and sophisticated in a specific language, then it is difficult to see how we can exclude AI LLMs from having the capacity of ‘understanding language’. In this respect, Searle’s CRA may be helpful in considering some of the issues involved in what we infer from the information available, but it is logically fallacious as an in-principle refutation of the understanding of language by computer programs. Imagining a human or machine that can simulate understanding of a language whilst not understanding it does not invalidate the possibility of human or digital understanding of language. Indeed, without a suitable test of linguistic comprehension or understanding, the CRA is specious, with no predictive power.
“People are not going
To dream of baboons and periwinkles.
Only, here and there, an old sailor,
Drunk and asleep in his boots,
Catches tigers
In red weather.”
(Wallace Stevens, from 'Disillusionment of Ten O’Clock', 1915)
[Philosophers and linguistic experts should pay much more attention to poets, they frequently convey emotions and meaning by bending and sometimes breaking the commonly accepted rules.]
Noam Chomsky famously distinguished universal grammar (see e.g. his essays in Language and Mind, 1968 (reissued and updated) and Knowledge of Language: Its Nature, Origin, and Use, 1986) and unconscious internal language (I-language) – which is not directly accessible to our conscious understanding – from E-language, which is the expression of that unconscious I-language. In other words, in his view linguistics is the study of expressions of language and not of language itself. His view is that humans, uniquely amongst lifeforms, make use of Universal Grammar (UG).
UG is Noam Chomsky’s theory that all humans are born with an innate ability to learn language. According to Chomsky, this biological predisposition consists of a set of grammatical principles and structures shared by all languages, which he calls the "universal" aspect of grammar. Chomsky argues that while languages differ in their vocabulary and specific rules, they all follow certain underlying principles, such as sentence structure, word order, and recursion. This shared structure allows children to learn language quickly and efficiently, even with limited exposure, suggesting that language acquisition is not solely dependent on environmental factors.
The concept of UG addresses a fundamental question: how do children acquire language so rapidly across so many disparate cultures? Chomsky’s answer is that the human brain comes pre-equipped with a language faculty and structure. He argues that this also sets limits on the extent to which languages can vary, and this is what makes it possible for children to internalise the rules of any language from even limited exposure.
In Chomsky’s view, UG is genetic and so something which all Homo sapiens have inherited (and which perhaps separated us as a subspecies sometime in the past).
Initially, his view was that UG is a conceptual layer of innate, in-principle language capacity with built-in parameters that limit language in certain ways. More recently, and perhaps due to many exceptions being found to his universal language rules, he has limited his assertions to UG being simply computational recursion as a capability innate to all humans. UG is therefore central to Chomsky’s broader theory of generative grammar, which seeks to describe the implicit rules that govern the structure and generation of all possible human sentences.
By computational recursion, I believe he means language’s generative ability (due to recursion) to allow theoretically ‘infinite’ expression from finite rules, concepts and words. Recursion enables us to combine words and phrases in creative ways to create semantically sensible sentences, stanzas and expressions that have never been said before, as the toy grammar below illustrates.
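The point is easy to demonstrate: a handful of rules and a dozen words already generate an unbounded set of sentences once some rules are allowed to invoke themselves. Below is a toy sketch (grammar and vocabulary invented for the example, with a depth cap so the recursion terminates):

```python
import random

# A toy context-free grammar. Recursion enters through S -> S CONJ S
# and NP -> NP PP, which let finitely many rules and words generate
# unboundedly many distinct sentences.
GRAMMAR = {
    "S":    [["NP", "VP"], ["S", "CONJ", "S"]],
    "NP":   [["the", "N"], ["NP", "PP"]],
    "VP":   [["V", "NP"]],
    "PP":   [["P", "NP"]],
    "N":    [["giraffe"], ["couch"], ["kingfisher"]],
    "V":    [["sees"], ["admires"]],
    "P":    [["near"], ["beside"]],
    "CONJ": [["and"]],
}

def generate(symbol="S", depth=0, max_depth=5):
    """Expand a symbol; past max_depth, force the first (non-recursive)
    rule so that every derivation terminates."""
    if symbol not in GRAMMAR:
        return [symbol]                  # a terminal word
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        rules = [rules[0]]
    words = []
    for part in random.choice(rules):
        words.extend(generate(part, depth + 1, max_depth))
    return words

for _ in range(3):
    print(" ".join(generate()))
# e.g. "the giraffe sees the couch and the kingfisher admires the giraffe"
```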
“the most elementary property of the language faculty is the property of discrete infinity… this property is virtually unknown in the biological world”
Compare:
Whilst we have not found conclusive evidence of the same use by animals of this generative, recursive power of language, we must bear in mind, as Chomsky has noted, that all life forms – including humans – are built from the same kind of generative language, with virtually limitless combinations (or discrete infinities) found everywhere in nature.
We and everything about us, including our use of languages, have evolved from this older language. The language of life found in DNA – and represented by theoretically infinite recursions of GATC – has enabled a wondrous and endless diversity of life forms. That is not to say the human language for communication and understanding is the same as the language of DNA for information transfer – it is to say that our use of language is a natural and explicable extension of deeper natural laws governing the evolution of all life forms. The human capacity with language is another order higher than that found in the animal kingdom and may even be unique. However, we should always see language primarily as an evolutionary product.
“Thus, from the war of nature, from famine and death, the most exalted object which we are capable of conceiving, namely, the production of the higher animals, directly follows. There is grandeur in this view of life, with its several powers, having been originally breathed into a few forms or into one; and that, whilst this planet has gone cycling on according to the fixed law of gravity, from so simple a beginning endless forms most beautiful and most wonderful have been, and are being, evolved.”
(Charles Darwin)
In Chomsky’s view, Universal Grammar is the foundation that allows a human, under the right stimulus, to develop an internalised understanding of language (I-language), which is a general language capability underlying any actual expression of language. I-language in turn enables the human to develop externalised language abilities, for example, to speak English or German, and so the human moves to language performance. It appears, though it is not explicitly stated, that Chomsky believes that UG is signless, i.e., it is formless and entirely without pictures or words.
Chomsky’s views on language can be difficult to read and summarise.
For example, Chomsky previously claimed, as part of a theory of generative grammar derived from universal grammar, that every sentence has two levels of representation: surface structure and deep structure. The surface structure is the form of written or spoken language (what we would usually call the syntactic level, which includes the order of words) whilst the deep structure provides meaning (the semantic level) and helps to generate the surface structure. He stated that the ability to understand any language is therefore a result of this deep structural knowledge, which is not conscious.
This is not a criticism per se, but it does make it difficult to know whether he has managed to develop a coherent, testable general theory of language. These are what I believe to be the common features of his theories. Language is:
A cognitive phenomenon: language is a product of the human mind and is not simply a product of our social environment.
Universal: all human languages share some common properties due to our innate capacity for language.
Generative and recursive: we can generate an ‘infinite’ number of sentences from a finite set of words and rules. Sentences can embed other sentences within them.
Rule-governed: The deeper rules of language are a product of our biological makeup and not conscious learning.
Minimalist: in his later work he deprecated the strong focus on rules in exchange for a strong minimalist theory i.e., that language optimises efficiency and has minimal rules and structures.
An evolutionary discontinuity: it is qualitatively different from other functions arising through evolutionary forces. He has variously claimed unique properties for language, including that it arose through a single gene mutation but perhaps did not confer an obvious immediate advantage. However, he seems to have modified his stance since:
“Suppose that some ancestor, perhaps about 60,000 years ago, underwent a slight mutation rewiring the brain, yielding unbounded Merge. Then he or she would at once have had available an infinite array of structured expressions for use in thought (planning, interpretation, etc.), gaining selectional advantages transmitted to offspring, capacities that came to dominate, yielding the dramatic and rather sudden changes found in the archeological record.”
('On Phases', 2008).
Chomsky is supportive of David Marr’s three levels of system analysis and believes that current developments in AI (and indeed much of the behaviourist approach to science) are misguided in that they focus too much on ever-increasing data sets and not enough on the implementation layer of language.
“David Marr presents his…summary of "the three levels at which any machine carrying out an information-processing task must be understood":
Computational theory: What is the goal of the computation, why is it appropriate, and what is the logic of the strategy…?
Representation and algorithm: How can this computational theory be implemented? In particular, what is the representation for the input and output, and what is the algorithm for the transformation?
Hardware implementation: How can the representation and algorithm be realized physically?”
In Chomsky’s view there is no algorithm for language, so it cannot be reduced to algorithmic processes (echoing Searle’s criticisms of Turing’s functional equivalence).
To my mind, rather than trying to constrain our ability with language within an elaborate definition of grammar, as Chomsky does, it is much simpler and easier to allow grammar and language to have their everyday meanings. After all, language is a communication game between people. We should also move away from an overly formalistic approach. In addition, any claims about language, and our ability with it, must be firmly rooted in wider frameworks that connect us to all lifeforms on Earth going back billions of years, branching from existing, sound scientific frameworks.
“birds do it, bees do it
Even educated fleas do it”
(Cole Porter, 'Let’s Do it, Let’s Fall In Love', 1928.)
We don’t know that much about flea education and communication; however, we do know that bees waggle-dance to tell their sisters about the direction and distance of food, whilst crows are exceptionally talented communicators and may even understand the concept of recursion (though Chomsky is not convinced).
Much of our thoughts and research on language have been human-centric. However, recently we have seen efforts to situate humans within theories of language that take account of our brothers, sisters, cousins and even more distant relatives on the Tree of Life.
Indeed, who knows, AI may even help us start decoding some aspects of dolphin language and others.
“Chimpanzees, among other primate species communicate different kinds of information in the wild and in captivity. In the wild, chimpanzees use vocal communication to warn others about the presence of potential threats, modulate social dynamics, communicate about food, and to greet and indicate social status.”
(Voinov, P. V., Call, J., Knoblich, G., Oshkina, M., & Allritz, M., 'Chimpanzee Coordination and Potential Communication in a Two-touchscreen Turn-taking Game', Scientific Reports, 2020.)
Theories of language have only recently started to shake off some of the more overly formalistic shackles of linguistic theory (in that sense, mirroring Wittgenstein’s philosophical journey from perceiving language as an expression of some truth proposition or thing in the world to language as communal play, with words having no inherent meaning in themselves). By situating linguistics within a broader evolutionary framework, we avoid the difficulties involved in claiming that language is universal in humans without underpinning that claim within the evolution of humans and other animals, and without evidencing it within that wider living landscape.
I think Chomsky’s later position on language and grammar encourages a multidisciplinary approach that more strongly incorporates evolutionary theory to try to make sense of the apparent uniqueness of the human use of language.
Chomsky’s views have, more rarely, come in for some stern criticism notwithstanding his stature as the ‘father of modern linguistics’.
“But Chomsky is no Einstein. And linguistics is not physics. Unlike Einstein, for example, Chomsky has been forced to retract at one time or another just about every major proposal he has made up to his current research, which he calls ‘Minimalism’. …And unlike physics, there is no significant mathematics or clear way to disprove Chomsky’s broader claims…”
(Daniel Everett)
While there is much to admire in Chomsky and his work, he has at times made unnecessary claims, insisting on overly specific definitions of language and grammar. Many of his older theories seem unnecessary to explain the human faculty with language. Using Ockham’s razor, we should look to remove any superfluous interpretations or theories that do not add to our understanding. Likewise, we can usually ignore theoretical claims that are not capable of falsification. On this basis, modern linguistic theory and practice should be strongly focused on AI, as an almost unimaginably fertile playground in which to test linguistic theories and hypotheses.
“The minimal meaning-bearing elements of human languages…are radically different from anything known in animal communication systems. Their origin is entirely obscure, posing a serious problem for the evolution of human cognitive capacities, particularly language.”
(Berwick, R. C., & Chomsky, N. 'Why Only Us'. Cambridge, MA: MIT Press, 2016.)
Chomsky’s core concept of universal grammar is however powerful, simple and useful. It echoes the idea of pre-existing shared landscapes for concepts and meaning (not unlike the Jungian concept of the collective unconscious – the notion that humans are born with certain instincts, inherited fears such as of snakes, symbolic identities and drives) and it helps to focus our attention on innate, inherited, shared abilities for abstract thought, complex conceptualisation and the use of symbols, signs and sounds (abilities which ultimately lead to performance in any specific language).
It also seems to be the case that other animals have not been able to incorporate the generative recursive power of language within their expressed communications, even if they can make use of internal symbolic representations. (see https://www.science.org/doi/10.1126/science.298.5598.1569)
However, we need to remain open to the claim by Daniel Everett that little more than general intelligence is innate.
“Language does not seem to be innate. There seems to be no narrow faculty of language nor any universal grammar. Language is ancient and emerges from general human intelligence, the need to build communities and cultures.”
Chomsky has also made the somewhat radical and very interesting suggestion that language may not even have evolved primarily for communication purposes. Key behavioural drivers of language can also be aesthetic and emotional and not purely functional. Language also evolved due to the benefit of communicating feelings and compound ideas, and not just to ask things or tell someone where or what something is or how much of it there is.
Whatever the complex drivers of our ability with language, it must have conferred a great benefit to modern humans in the battle to survive and thrive. Our intense universally widespread love of songs, stories and legends suggests that, from an early stage, we made use of the creative freedom of language for mirth, myths, mischief and well-meaning lies (or fiction if you prefer the polite term!).
"What I think is a primary 'fact' about my work, that it is all of a piece, and fundamentally linguistic in inspiration. [...] The invention of languages is the foundation. The 'stories' were made rather to provide a world for the languages than the reverse. To me a name comes first and the story follows”
(J.R.R. Tolkien; Humphrey Carpenter, Christopher Tolkien (eds.), 'The Letters of J.R.R. Tolkien, Letter 165', 1955.)
“‘O Deep Thought Computer,’ he said, ‘the task we have designed you to perform is this.
We want you to tell us …’ he paused ‘ … the Answer!’
‘The Answer?’ said Deep Thought. ‘The Answer to what?’ 'Life!’ … ‘The Universe!’ … ‘Everything!’ they said in chorus.
Deep Thought paused for a moment’s reflection.
‘Tricky,’ he said finally. ‘But can you do it?’
Again, a significant pause. ‘Yes,’ said Deep Thought, ‘I can do it.’
‘There is an answer?’ … ‘Yes,’ said Deep Thought.
‘… But,’ he added, ‘I’ll have to think about it.’”
(Douglas Adams, 'The Hitchhiker’s Guide to the Galaxy', 1978.)
Logic is just another language, with all of the inherent issues that arise when we mistake usefulness and consistency for the deeper ‘truths’ about reality.
In the world of computing, the logic behind the ‘halting problem’ was developed – at the same time but in quite different ways – by Alonzo Church and Alan Turing in 1936. This was part of a much wider scientific debate that had been running in earnest since 1900 over whether mathematics is always internally consistent and self-evident – that is, whether all true statements are provable logically from a system’s axioms, or whether some assumptions or external understanding will always need to be brought into the proofs.
“[Human] understanding and insight cannot be reduced to any set of computational rules … [Gödel] appears to have shown … that no such system of rules can ever be sufficient to prove even those propositions of arithmetic whose truth is accessible, in principle, to human intuition and insight”
(Sir Roger Penrose, 'Shadows of the Mind', 1994.)
Kurt Gödel proved that mathematics (and indeed any formal system or language of logic) has a degree of irreducible incompleteness. The relationship between consistency (a system not proving contradictions) and completeness (a system proving all true statements) is crucial. Gödel's theorems demonstrate that, for sufficiently powerful systems, one cannot have both. This inherent trade-off between coherence and completeness has profound implications for the limits of formal reasoning.
The notion of statements being "true but unprovable" within the specific formal system being used (the agreed set of rules) is key to Gödel’s work on incompleteness. Gödel leaves open the possibility that there might exist other, more powerful systems in which such statements are provable, but this likely comes at the cost of introducing new axioms or assumptions.
The work of Church, Turing and others in calculus and computing logic was to map the incompleteness of logic to the concept of practical calculation engines (algorithms run by humans or machines).
For non-mathematicians, simpler analogies to Gödel’s theorems, which come close to expressing the point, are the ‘liar’s paradox’ statements, such as ‘This statement is not true (A)’. The problem is that a sentence can be constructed to be grammatically and semantically correct without giving rise to any stable ‘truth’ value. If (A) is true, then what it says holds, so (A) is not true; if (A) is not true, then it states exactly what is the case, so (A) must be true. Either way, a paradox arises. There are many variations on this word game, but the long and short of it is that you can keep constructing increasingly complex statements that cannot be verified just on their own terms.
“If I were to tell you that the next thing I say would be true,
but the last thing I said was a lie,
would you believe me?”
(Doctor Who)
Gödel proved two main theorems relevant here in respect of the problem of pure logical incompleteness.
First, no consistent system of axioms whose theorems can be listed by an algorithm can prove all truths about the arithmetic of natural numbers. There will always be statements about natural numbers that are true but unprovable within the system used.
Second, no such consistent system can demonstrate its own consistency. This means the system cannot evidence its own truthfulness entirely self-referentially.
Gödel also contributed substantially to thinking on the continuum hypothesis (the sizes of infinities) using set theory. Gödel went further than the ‘liar’s paradox’ and effectively proved that the statement “this statement is not provable (G)” was true, whilst not being provable within the formal system used to construct it (call it T) – hence the notion of logical incompleteness. As with the limits of language, it is not surprising that Gödel discovered proof of logical incompleteness. Wise people have been saying the same for millennia. He brought us logical proof of the limits within this particular language.
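Schematically – and hedging on the technical conditions, which matter – the two theorems can be written as follows, where T is any consistent, effectively axiomatized theory containing enough arithmetic and G_T is its self-referential sentence (“I am not provable in T”):

```latex
% First incompleteness theorem: G_T is neither provable nor refutable in T.
T \nvdash G_T \qquad\text{and}\qquad T \nvdash \lnot G_T
% Second incompleteness theorem: T cannot prove its own consistency.
T \nvdash \mathrm{Con}(T)
```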
Gödel’s work is very useful for understanding some of the potential limitations of an entirely algorithmic approach to ascertaining the ‘truth’, and the need for insight and intuition across multiple frameworks to make sense of things. He showed that systems that rely on formal logic and algorithms may encounter inherent limitations in their ability to reason about and understand certain concepts. The existence of true but unprovable statements suggests that AI (like us) might struggle to achieve a deep, intuitive understanding of language and the world based entirely on formal logic. It also points to the distinction between "performance" (simulating understanding) and genuine comprehension.
In computing, for certain calculations, it is impossible beforehand to know whether the algorithm will ever finish the steps required to conclude an answer.
π is a finite number between 3 and 4 (so it is determinable to any arbitrary decimal place) whose decimal expansion neither terminates nor repeats. Numbers like this are called ‘irrational’, which means that they are real numbers that cannot be expressed as a ratio of two integers. π is also a transcendental number, since it is not algebraic.
If we were to use an algorithm to calculate the ‘final’ decimal expansion of π, we know that its calculations could not be completed. It would therefore be a process that would – theoretically – not stop (halt) at any point before the end of our universe. This is a simple example of the halting problem, whereby, e.g., a computer program is given a calculation that cannot be determined. In this case, as π is an irrational number, we already know not to run a program to look for the last decimal place. However, if we were to ask a suitably powerful computer to calculate the thousand-trillionth decimal place, it could do it.
It is obviously extraordinarily wasteful to run a computer program with a calculation process that is potentially never-ending (albeit actual computers have limits in time, memory, energy and entropy). However, unlike with π, there are many problems in mathematics and physics for which we do not know whether their answers are determinable or non-determinable. It would be helpful to have a program or computer to tell us which ones are, and which are not, before any calculation is performed. In practice, programmers will normally specify a point at which a program should arbitrarily stop; for example, when calculating π a specified number of decimal places is determined, after which the computer program will stop. It is a quirk of language that the name for π and other such numbers is ‘irrational’, given the alternative and wider meaning of that term as ‘unreasonable’ – though here it just means not expressible as a ratio of two integers (such as ‘1/2’; a close rational approximation for π is 22/7).
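A minimal sketch of that choice in Python, using the mpmath arbitrary-precision library (assuming it is installed): the program halts only because we pick the stopping point in advance, not because π’s expansion ever ends.

```python
from mpmath import mp  # arbitrary-precision arithmetic

# The decimal expansion of pi never terminates, so we must impose a halt
# ourselves: the program stops because we say so, not because pi does.
mp.dps = 50            # work to 50 decimal places
print(mp.pi)           # 3.14159265358979... (to the precision we chose)
```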
Any algorithm (including those in the human mind) that aims to calculate something complex needs to break it down into discrete operational steps. In 1936, Alan Turing proved that there is no general algorithm or computational process (whether a human process, computer program or machine) that can determine whether any specific calculation process will need to run forever or if it will halt given an arbitrary input (i.e., any potential input). Turing proved this using the logic of contradiction, in that he proved that any such algorithm that did try to determine it could be made to contradict itself. Turing used a thought experiment of a machine that has unlimited resources and magically knows the answer.
Unsurprisingly, Turing showed that no such machine can exist. He proved that any such infinitely powerful halt program would be unable to determine the answer if instructed to run a variation of the operation on itself. The hypothetical process can be thought of as follows. Let us imagine a program (Halt Program or HP) that could always tell us whether an arbitrary computer process (algorithm) stops when given an arbitrary input. The HP has to stop whatever the answer is (we assume, for the sake of the thought experiment, that such a magical machine is possible). The HP must always stop when providing the answer, since if it did not stop then all we would know is that it has not stopped yet – and how would we know whether HP would stop in the future? We could not. So, HP must magically work by always stopping (halting) and giving us a true or false answer about whether another computer process would run forever or stop eventually when given any input (e.g. a calculation).
We can prove this by contradiction. Imagine another computer process (Tricksy) that takes the output of HP and reverses the action: if HP states that a process does not halt then Tricksy halts, and if HP says a process halts then Tricksy loops forever. Now for the head-hurting part: we feed Tricksy its own code as input and ask HP whether Tricksy halts. If HP says Tricksy halts, Tricksy loops forever; and if HP says Tricksy does not halt, Tricksy halts. This special operation of a halting program on itself gives rise to a contradiction, and this tells us that whether a program will halt must be undecidable in general, no matter how good or magical a halting program is.
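For readers who prefer code to prose, here is a minimal Python sketch of the same argument. The function halts() stands in for the hypothetical, magical HP (no real implementation is possible, which is the whole point), and tricksy mirrors the reversing program described above.

```python
def halts(program, data):
    """Hypothetical Halt Program (HP): returns True if program(data)
    eventually stops, False if it runs forever. Turing proved that no
    such general function can actually exist."""
    raise NotImplementedError("no general halting oracle is possible")

def tricksy(program):
    """Reverses whatever HP predicts about a program run on its own code."""
    if halts(program, program):
        while True:   # HP says it halts, so loop forever
            pass
    else:
        return        # HP says it loops forever, so halt immediately

# The contradiction: what should halts(tricksy, tricksy) return?
# True would mean tricksy(tricksy) loops forever; False would mean it halts.
# Either answer is wrong, so HP cannot exist.
```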
Apollo – the magical predictive AI bot
I find the proof of the halt program very difficult to understand and explain, so I have been working on a modern analogy that seeks to show the limits of recursive (self-referential) logic identified by Turing with his halt program (and by Gödel and Church) to non-technical persons.
Imagine an AI chatbot – Apollo – that possesses the magical ability to predict any question posed or statement made in a prediction game. Apollo wins if it makes a correct prediction.
The prediction of Apollo and the question or statement of the other player must be given separately to a trusted third-party AI (Nakamoto). Apollo decides to play the prediction game with copies of itself (Apollo 1 and Apollo 2).
The results of the first prediction game are as follows:
Since Apollo 1 and Apollo 2 are magical, Apollo 1 must be able to predict what words (output) Apollo 2 will send to Nakamoto.
However, Apollo 2 also has the same predictive abilities. It can foresee Apollo 1’s prediction and change its output accordingly before it hits send to Nakamoto.
Obviously Apollo 1 anticipates this change and adjusts its prediction before hitting send to Nakamoto.
However, Apollo 2 predicts that Apollo 1 has adjusted its prediction, so it changes its proposed prediction again.
Apollo 1, in turn, foresees this change as well and adjusts its prediction once more. Apollo 2 is wise to this and so adjusts its prediction.
This cycle continues indefinitely, creating an infinite loop of predictions and changes by our magical Apollos that continue until the electricity is turned off! (You will note also that Apollo cannot win by telling Nakamoto that Apollo 2 will not send any output either, since Apollo 2 would predict this and send some arbitrary output to Nakamoto but of course Apollo would predict that too and so on…∞.)
Poor Nakamoto never receives a single message from either Apollo.
We can do more games and add time limits and a requirement for one of the Apollos to go first and it makes no difference – every outcome proves that no matter how magical Apollo’s abilities it is impossible for it to successfully predict its own prediction.
| Apollo 1 | Apollo 2 | Result |
| --- | --- | --- |
| Prediction at t → ∞ | Prediction at t → ∞ | Neither Apollo 1 nor Apollo 2 sends a prediction to Nakamoto (they both loop forever). |
| Prediction t+1 | Prediction t+1+1 | If Apollo 1 goes first and Nakamoto receives an incorrect prediction from Apollo 1 (Apollo 2 magically predicted Apollo 1’s prediction), this proves Apollo 1 cannot predict, because it did not predict Apollo 2’s prediction. If Apollo 1 goes first and Nakamoto receives a correct prediction from Apollo 1 (Apollo 1 magically predicted Apollo 2’s prediction), this proves Apollo 2 cannot predict Apollo 1’s prediction. |
| Prediction t+1+1 | Prediction t+1 | As above, swapping Apollo 1 and Apollo 2 in the sequencing. |
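A toy Python simulation of the first game makes the regress concrete, under the obvious assumption that each Apollo’s output is defined in terms of the other’s (the “+ 1” stands in for adjusting a prediction). Python’s recursion limit plays the role of the electricity being turned off.

```python
import sys
sys.setrecursionlimit(10_000)  # let the Apollos argue for a while

def apollo_1_output():
    # Apollo 1 predicts Apollo 2's message, then adjusts its own.
    return apollo_2_output() + 1

def apollo_2_output():
    # Apollo 2 likewise predicts Apollo 1's message and adjusts.
    return apollo_1_output() + 1

try:
    apollo_1_output()
except RecursionError:
    print("Poor Nakamoto never receives a single message.")
```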
Turing showed that the halting problem is undecidable in principle, a useful variation on the incompleteness principles discovered by Gödel (note that while the halting problem is undecidable, many practical problems are merely intractable, meaning they have theoretical solutions but require immense computational resources). He showed that certain problems and calculations are irreducible or cannot be solved entirely within a self-referential framework. There may also be a fundamental impossibility of bridging between the discrete and finite (calculable information) and the potentially infinite. The use of sets is often an attempt to break down the concept of infinity – the concept of unlimited or unbounded extension – into discrete quantities (or conceptually larger or smaller infinities) or to find other shortcuts that might avoid doing necessary work to understand an aspect of reality.
“As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.”
(Albert Einstein, “Geometry and Experience”, 1921)
Real knowledge requires work. Algorithms may provide more efficient routes (shortcuts) to find solutions, but they can never avoid some work.
Interestingly, one of the major open questions in computer science and mathematics is the P versus NP problem:
Can every problem whose solution can be quickly verified also be quickly solved?
“The P v NP problem relates to something called "polynomial time" which is a way of comparing how complex a computation is with how long it will take. If the time taken to solve a problem can be expressed as a polynomial of the complexity (effectively, the time is equal or less than the complexity raised to a power) then that problem goes in the "P" category.”
This is a fundamentally important question, particularly for cryptography (which relies heavily on the assumption that P ≠ NP, i.e. that it is easier to verify some solutions than to find them).
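To make the asymmetry concrete, consider subset-sum, a standard NP problem: checking a proposed answer takes one pass over a handful of numbers, while the obvious way of finding an answer must search an exponential space. A minimal Python sketch, with toy numbers invented for illustration:

```python
from itertools import combinations

def verify(nums, certificate, target):
    # Verification is fast: sum the proposed subset and compare.
    return sum(certificate) == target

def solve(nums, target):
    # Naive solving tries up to 2^n subsets of nums.
    for r in range(len(nums) + 1):
        for subset in combinations(nums, r):
            if sum(subset) == target:
                return subset
    return None

nums = [3, 34, 4, 12, 5, 2]
print(solve(nums, 9))            # exponential-time search finds (4, 5)
print(verify(nums, (4, 5), 9))   # instant check: True
```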
We cannot always know, without doing work, what are the answers to any questions we wish to know about the universe. If we could know such answers, without doing specific work, then we would have no issues of indeterminacy or information entropy; we could effectively cheat the universe by knowing many of its secrets for free. If we imagine a universe where we could get answers to questions without doing work, then any such universe would not give rise to the need for intelligent life forms (intelligence would be wasteful in such a universe).
Life arises and changes from the interaction of positive information (always changing but potentially knowable facts) and negative capability and freedom of action (statistical deviance from the norm). If everything was predetermined and knowable beforehand then there would be no need for negative capability or freedom to deviate (including with intelligence). Deviancy, including greater intelligence, is an elegant and natural algorithm for a world in which, in practice, it is impossible to know all facts without doing work on each problem or to know which solutions will turn out to be better or worse than other solutions. In such a large and ever-changing universe, actual work and intelligence to solve problems will always be required.
“But an inner voice tells me that it is not yet the real thing. The theory says a lot, but it does not really bring us any closer to the secrets of the Old One. I, at any rate, am convinced that He does not play dice.”
(Albert Einstein, “letter to Max Born”, 1926.)
I concur with Einstein: God (or nature) does not play dice. I prefer a card game example. The universe is the dealer of hands in a card game with a very, very large number of cards. It may know all hands, but it makes universal rules to ensure that everyone else must play with only their own hand and the cards they see as they are dealt.
Consider a pack of cards: if they are randomly ordered (shuffled) and then dealt out, what are the chances that 26 red cards would be dealt out first (in any numerical or face-card order) consecutively, and then 26 black (again in any order)? Very low (approximately 1 in 495,918,532,948,104).
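The exact figure is easy to check in Python with the standard library’s math.comb: every colour sequence of the 52 cards is equally likely, and only one of the C(52, 26) sequences is ‘all red first’.

```python
import math

# Number of equally likely red/black colour sequences in a 52-card deck;
# exactly one of them deals all 26 reds before any black card.
ways = math.comb(52, 26)
print(f"1 in {ways:,}")   # 1 in 495,918,532,948,104
```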
What extraordinary negative entropy such a starting position would represent. Yet this highly improbable starting position for a pack of cards would be of absolutely no use or interest to a beetle, a blue whale or some humans – though it might be interesting to a computer scientist, a mathematician, someone who likes to play cards or anyone who understands how unlikely it is. The simple fact remains that a shuffled pack of cards consisting of 26 red cards and then 26 black cards as its starting position when dealt is not in any universal sense more useful than a mixed pack (the statistically more likely starting position), unless you are a life-form that is betting on the colour of the next cards.
As life forms, we rely very much on the ability to access free energy from the flow of heat from hot bodies to colder bodies and surrounding space. All life as we know it arises from this process. Carlo Rovelli’s view is that the statistical nature of entropy as a property of the universe may only be of interest to a life-form looking at the universe from its perspective as a life-form. From that perspective, the improbably low entropic configuration that allows life to exist (the one we see around us) is very useful – the same way a shuffled deck with an unusual initial configuration of red and black cards is for someone betting on the colour of the next 26 cards.
The probability of having 26 red cards dealt in consecutive order from a well-shuffled pack is very low. Note now what happens after red card 26 is dealt – all of that initial unlikeliness is now balanced by a near-absolute certainty (in probability terms) that the next 26 cards will be black. The total number of microstates (if the colour black is the only quality in question) for the remaining cards is now just 1 (i.e., there is no information uncertainty). Is the improbability of 26 red cards being dealt first just another demonstration of the universal law that makes it impossible to know the future?
The universe appears to go to great lengths in its laws to avoid absolute pre-determinacy at all scales. In the example of the first 26 cards being red, whatever configuration the next 26 cards are in, we know – without doing any more work – that each card dealt will be black. There is zero entropy, in information terms, within the remaining pack with respect to the quality or value ‘black’. This move from very high initial uncertainty of the colour sequence to absolute certainty of the colour of the cards as they are dealt gives a simple general sense of entropy in information terms. Perhaps probability is central to two of our strongest scientific theories (quantum and thermodynamic) because this is precisely how deterministic outcomes are avoided in our universe, not by random chance but by probabilistic outcomes.
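A small sketch of this move from maximal to zero uncertainty, using the Shannon entropy of the next card’s colour (in bits):

```python
import math

def colour_entropy(reds, blacks):
    """Shannon entropy (bits) of the next card's colour."""
    total = reds + blacks
    h = 0.0
    for count in (reds, blacks):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

print(colour_entropy(26, 26))  # 1.0 bit: maximal uncertainty before any card is dealt
print(colour_entropy(0, 26))   # 0.0 bits: after 26 reds, the rest are certainly black
```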
This ‘truth’ must therefore show up in every field of action and information. It is the basis for ethics, evolution, freedom, logic, entropy, information theory and cryptography. Life and intelligence arise and survive in the gaps between causality and chaos, between usable energy and entropy (information uncertainty).
Recognising the undecidability, incompleteness or indeterminacy that exists in the universe is not the same as stating that the universe operates on randomness. We can find answers to many questions, but there will always be a cost. The work required for the payoff of obtaining a solution is how the universe keeps all participants honest and ensures that no life forms have an absolute vantage point (since that is God’s or the universe’s sole preserve). The uncountable configurations of information and matter (microstates) – the total entropy – are how the universe achieves this, and are why all attempts to reduce or quantify it completely must ultimately fail to be truthful.
Perhaps our perception of reality, our languages and our sciences help us to create useful constructs to make sense of endlessly interacting bubbles or quanta of possibility.
“Every macroscopic system and its surroundings are characterized by probability distributions of microstate[s]”.
(David Layzer, 'Cosmology, initial conditions, and the measurement problem', 2010.)
Yet, we should be careful to understand the nature of the questions that we ask, to consider which questions may not be askable or answerable in binary terms or relying on signs. Perhaps, as the ancient scientist Gautama Buddha reflected, meditation on the deepest questions requires signless silence.
“[7.5 millions years later] … ‘Seventy-five thousand generations ago, our ancestors set this program in motion,’ the second man said, ‘and in all that time we will be the first to hear the computer speak.’ …‘All right,’ said Deep Thought. ‘The Answer to the Great Question …’‘Yes…!’
‘Of Life, the Universe and Everything …’ said Deep Thought.
‘Yes…!’
‘Is…’ said Deep Thought, and paused.
‘Yes…!’ ‘Is…’ ‘Yes…!!!…?’
‘Forty-two,’ said Deep Thought, with infinite majesty and calm.… It was a long time before anyone spoke.
Out of the corner of his eye Phouchg could see the sea of tense expectant faces down in the square outside. ‘We’re going to get lynched, aren’t we?’ he whispered.
‘It was a tough assignment,’ said Deep Thought mildly.
‘Forty-two!’ yelled Loonquawl. ‘Is that all you’ve got to show for seven and a half million years’ work?’
‘I checked it very thoroughly,’ said the computer, ‘and that quite definitely is the answer.
I think the problem, to be quite honest with you, is that you’ve never actually known what the question is.’”
(Douglas Adams, The Hitchhiker’s Guide to the Galaxy, 1979.)
The seeming multiplicity of reality requires theories and frameworks (including ethical ones) that foster maximal compatible (structurally sound) diversity of opinions. In Ethics of Life, I proposed an experimental ethical framework focused on making justifiable relative ethical decisions, within the constraints required for actions that are consistent with, and required for, the proper functioning of the whole universal ethical system.
A ‘Newton disc’ or spinning colour wheel illustrates how colour is made up of a unitary oneness, that is, white light. At first glance, it may seem ironic that I am suggesting any ethical framework founded on the principle of universality or invariance would propose maximum ethical relativism. However, if we consider the invariance or ultimate ‘truth’ in ethics to be the white light (life energy) and the ethical degrees of freedom to be relative colours on the wheel (actions of life forms), it is hopefully easier to understand.
Apparent contradictions that often cause us conceptual difficulty may be a particular problem arising from a strong dualistic tendency in Western culture, particularly in over-simplified monotheistic religions. Non-dualistic thinking teaches that:
“the multiplicity of the universe is reducible to one essential reality.”
Non-dualism also requires its apparent opposite: dualistic thinking. When all perspectives are allowed their particular uses, together they get closer to expressing the ineffable nature of reality. Einstein’s understanding of the equivalence of matter and energy is so powerful because it sees through appearances to the underlying unity of matter and energy – whereas much of Western thought is and has been plagued by a tendency to treat different aspects of reality as essentially divided (e.g., the Cartesian mind–body separation).
Let us not beat around the burning bush. The attempt to separate mind and body has been used to label one good or holy and the other bad or evil, giving rise to some unspeakable evils by allegedly sapient beings. In the reckoning of our history of teaching love and compassion for one another, it has been perhaps the greatest error of thought and the greatest act of ‘sin’ perpetuated within Western religions.
“You do not have to be good.
You do not have to walk on your knees
for a hundred miles through the desert, repenting.
You only have to let the soft animal of your body
love what it loves.”
(Mary Oliver, 'Wild Geese', from Dream Work, 1986.)
In Buddhism, non-duality is associated with the concept of emptiness (śūnyatā). The ‘hard’ modern scientists and the mind scientists of old are essentially saying the same thing. We must always bring some holistic assessment and judgements to gain insight into our understanding of the nature of things. The systems and processes that give us life, life forms, logic and reason are not entirely hermetic and self-evidential solely within their own frame of reference. We must synthesise across different fields of enquiry using diverse means if we are to understand the complexities, chaos and harmonies within the universe. In Taoism, we have the concepts of yin and yang, which embody the unitary nature of the appearance of relative things.
“the concept of oneness would instantly perish without its counterpart ‘duality.’ Both are provisional concepts … true non-duality is … beyond oneness.”
Even non-dualism is a provisional understanding. Take the yin–yang symbol; you can see this by considering that both yin and yang include and define one another, but that together they express a oneness that is still separated from (i.e., does not contain) the greater space around it. There is no way to quantify what is boundless.
Perhaps the formal similarity between a circle and a zero (0) is accidental – they do not appear to have the same linguistic root – yet it is interesting that the concepts of wholeness and emptiness reflect one another in their shape.
Underneath the appearance of all things – the interplay of time and space, energy and matter, order and chaos, knowledge and entropy – there is a unity of duality, zeros and ones, emptiness and fullness. Theories of cosmic cycles point this way, too, with the end of this universal epoch likely resulting in a re-collapse of the great sphere or ellipse of the universe back to a singularity – before a new grand epoch begins again. Alternatively, we live in a fractal multiverse where each end of space-time here in this universe (every black hole) is a generative singularity for another universe. Ends are, after all, also beginnings.
For general artificial intelligence, we will need programs that can ‘think’ in non-dualistic ways. This means not ignoring or discounting a multiplicity of perspectives that may be valid (particularly if the matter in question is not a universal law). Conversely, such programs need to be subtle and intelligent enough to see through seeming complexity to the hidden, often simpler, core of reality or truth using logic, imaginative exploration, visualisation, poetry and philosophy. The capacity for complex probability-space surveying allows AI to winnow towards where answers may be hiding and to find answers without making human assumptions about what is possible. Reaching an understanding after such surveys is not, however, entirely probabilistic.
“[my] aim in philosophy [is] to show the fly the way out of the fly bottle”
(Ludwig Wittgenstein, Philosophical Investigations, 1953.)
“Words strain,
Crack and sometimes break, under the burden,
Under the tension, slip, slide, perish,
Decay with imprecision, will not stay in place,
Will not stay still.”
(T. S. Eliot, “Burnt Norton”, Four Quartets, 1941.)
Language, like logic, has its own incompleteness problem. That is why some aspects of the recent Western tradition of the analytic philosophy of language have often been damaging to philosophy and ethics. This analytic philosophy has, in some respects, represented a regression to a time before the understanding of Buddha thousands of years ago.
Under the correct approach, we realise that the use of language presupposes, implies or assumes so many different things about the world that are not contained within the word or even the language. The concept of a ‘table’ suggests legs (though of course not all things with legs are tables) and chairs and eventually leads to everything that is not table (that is, everything else). It brings association with ‘wood’ and carpentry. This in turn brings craft, trees, clouds, earth and weather, bringing water cycles and molecules (oxygen and hydrogen), which brings stars, energy, entropy and gravity. And so on, indefinitely.
The word ‘table’ needs the concept ‘table’ and both concept and word need an intelligent life-form to arise from the clay by billions of years of evolution to talk about such things. Language can be useful, but it cannot be ‘true’ in its deepest sense – it can only be used to try to approach or approximate truth. The same can be said for many of our logical, mathematical or abstract concepts. There is no such thing as tableness in reality, it is just a more or less useful construct. Likewise, the concept of a set containing an infinite number of tables is meaningless if you cannot have an infinite number of any real things (since things are by definition discrete and not continuous). This does not invalidate the concept of boundlessness. Indeed even space appears to be atomic (in the old sense of not continuously divisible) at a sufficiently small scale.
No reader would think that their given name contains within it all that is unique about their genes (code), life experience, cultural interests, family history, desires, suffering and hopes. A human name is not an example of successful algorithmic compression of that human – it is an extraordinary abstraction much like defining humans exclusively by colour, class, race and gender. Language is algorithmically efficient but each word in itself is not necessarily so. Subconsciously, we visualise words in a space where they are each attracted, repulsed, related and coordinated to create a meaningful pattern.
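Modern language models make this subconscious geometry literal: words become vectors, and relatedness becomes distance or angle. A toy Python sketch with hypothetical, hand-made vectors (real embeddings have hundreds of dimensions and are learned from text):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity: near 1.0 for parallel meanings, lower for unrelated ones."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy embeddings, invented purely for illustration.
table  = np.array([0.8, 0.1, 0.3])
chair  = np.array([0.7, 0.2, 0.35])
galaxy = np.array([0.05, 0.9, 0.6])

print(cosine(table, chair))   # high: 'attracted', related concepts
print(cosine(table, galaxy))  # lower: 'repulsed', distant concepts
```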
“I’ve seen Plato’s cups and table, but not his cupness and tableness.”
Interestingly, abstract concepts like ‘tableness’ are very hard to define for computers using algorithms. Young humans seem to intuit ‘tableness’ without having to see a very large variety of tables first. This is the opposite of how computer programs learn ‘tableness’. Humans appear to be able to perceive the concept of an idealised abstract quality of which real objects might be seen as a projection, reflection or manifestation. In a sense, the word and concept or quality of ‘tableness’ is like a geometric object that exists in a different dimension or plane; our abilities allow us to intuit ‘tableness’ easily when presented with any new manifestation.
Our ability to believe abstract concepts is both a blessing and a curse. It can be very useful; however, we are not always able to separate complex ‘reality’ from the potentially useful abstractions (religion, race, country, money). This gives rise to many horrifying actions against others, continuing delusions of free will and a dangerous misunderstanding of our specialness – all of which help us think that we are somehow separate from (not interdependent with) other life forms and somehow, uniquely, not part of the wider life force on our planet or subject to Earth’s capacity to support complex, large life-forms like us.
Humans need to learn to distinguish concepts, abstractions and beliefs from the inexpressible reality of the universe. We need to come back down to Earth and understand our rightful place as just one part of it – not above it. Even in respect of some of our greatest achievements we are obstinately wrong-headed and suffer from a form of intra-species solipsism. Consider ‘our’ extraordinary achievement in leaving this planet and travelling in space. For a moment, meditate on the extraordinary improbability: a raw and tiny rock, hurtling through the heavens whilst growing a fragile, living skin; and that skin deciphering how to join the fiery firmament, how to leap between the stars … reaching out towards infinity… ∞.
What need we of this misguided and petty concept of individual ‘free will’ in such a wonderful life and universe?
“Socrates explains how the philosopher is like a prisoner who is freed from the cave and comes to understand that the shadows on the wall are not reality at all. A philosopher aims to understand and perceive the higher levels of reality. However, the other inmates of the cave do not even desire to leave their prison, for they know no better life … For that reason, the world of the forms is the real world, like sunlight, while the sensible world is only imperfectly or partially real, like shadows.”
Aristotle disagreed with Plato’s Socrates, focusing instead on the specific features of nature and natural ‘real’ objects, that is, concrete particulars. Aristotle was an early empiricist who perceived ‘tableness’ or ‘treeness’ as something that is a part of each actual object. That is, each particular tree or table is not an imperfect projection of an ideal form. Each actual table or tree is just one of all of the objects (past, present and future) within the total set of actual tables or trees. All real tables or trees can be described as being within that ‘abstract’ set, but the abstraction is one of usefulness, not truthfulness.
Benedictus Spinoza takes the concept of individual material forms versus the energy from which forms arise to a higher level in his Ethics. Individual bodies having the same essential attributes (what they exist as) are not separate but are all part of one substance. He calls each individual manifestation a ‘mode’ of that unitary substance, having a difference in extension (physical configuration) only. Each individual body has an essence which is its inertia to destruction, its persistence of shape or form over time that requires energy to maintain or dissipate.
For Spinoza, the ultimate reality (being God or the nature of things) is not material though it may manifest in infinite attributes; it is a self-caused substance. In respect of the difference between things and the ideas of things (such as a ‘table’ and ‘tableness’), Spinoza contends that these have different chains of causation (parallel streams) in thought or conception and in extension, but they are really just different attributes manifesting the same underlying immaterial reality. Spinoza was also a free will sceptic.
Language is a communal tool that suffers from very significant compression and loss of fidelity in its use to express the ineffable, giving rise to many of the problems we face when seeking to get at the root of matters (a type of inverse holographic principle). Language requires significant ‘unpacking’ and interpretation at the receiving end of any communication and great care by the sender – given all that is not said but that is necessarily implied in or carried with each word or concept. This difficulty has been one of the key challenges facing the development of natural language AI tools and the use of ‘Transformers’ has been key.
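At the heart of the Transformer is self-attention, in which each word’s representation is ‘unpacked’ by mixing in every other word in its context. A stripped-down sketch of the idea – a single head, none of the learned projections of a real model, and toy numbers invented for illustration:

```python
import numpy as np

def self_attention(X):
    """Minimal single-head self-attention over word vectors X (n_words x dim):
    each output row is a context-weighted mixture of all the input rows."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                   # pairwise relevance of words
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the context
    return weights @ X                              # context-mixed representations

# Three toy 4-dimensional 'word vectors'.
X = np.array([[1.0, 0.0, 0.5, 0.1],
              [0.9, 0.1, 0.4, 0.0],
              [0.0, 1.0, 0.0, 0.8]])
print(self_attention(X).round(2))
```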
My favourite painter is Vincent Van Gogh. Vincent was an extraordinarily compassionate artist. As such, he had no desire to render a simplistic, lower-dimensional picture of a tree which we could better see with our own eyes. He was trying to help us glimpse at the much richer life force that animates the tree and all other life forms – the invisible energy and visible form combined – his brush strokes, so rich and heavy, magically conjuring the hidden life force into being.
Van Gogh learned the language of trees and it is without words. It takes a kind of ‘magic’ to do this. For this reason, great artists, scientists and holy people seem like beings from outer space or the future. In the modern world, we have lost sight of the meaning of the word technology (from Greek techne, “art, skill, cunning of hand” and logos “to speak of”), assuming that it can only apply to things like computers, machines and industrial inventions. Technology is much broader than this. It is the systematic practice (application) of arts, crafts, and knowledge – science in its broadest sense.
“We all know that Art is not truth. Art is a lie that makes us realize truth, at least the truth that is given us to understand. The artist must know the manner whereby to convince others of the truthfulness of his lies.”
(Pablo Picasso, “Picasso Speaks in The Arts”, 1923.)
Technology is, therefore, any cultural practice and collective skill that enables us to understand the nature of things more clearly, to do useful things and to avoid bad practices. We have forgotten to respect, maintain and seek to evolve our technology of the mind. We cannot compress the universe’s inexpressible reality into highly compressed finite dimensional space (words, numbers or pictures) without losing much of the meaning of what we are trying to express – or risking confusion between the expression and the reality it is trying to express.
The cultural technology used by humans – but which they did not invent – that gets closest to expressing the truly ineffable is music (some mathematicians make a similar claim). Perhaps it is the underlying abstract mathematical nature of harmonies, the precise interrelations between frequencies (the logic of music) coupled with the emotion of the sounds created by such imperfect vessels (animals and their instruments used to make sounds). Perhaps it is also the lack of a specific compression (encoding) and interpretation (decoding) process.
Music allows for lossless communication between composer, singer and listener – albeit music and maths are still just a more or less useful finite reduction or representation of the ultimately unsayable.
only silence and music
can delve into the abyss
and return unscathed
yet try we must
again again and again
(Peter Howitt)
Poetry can also sklent at this higher reality, with its use of pregnant possibilities, uncertainties and seeming contradictions. Paradoxically, poetry groks and speaks the unspeakable.
“and I made my own way,
deciphering
that fire,
and I wrote the first faint line,
faint, without substance, pure
nonsense,
pure wisdom
of someone who knows nothing,
and suddenly I saw
the heavens
unfastened
and open”
(Pablo Neruda, 'Poetry', Selected Poems, trans. by Anthony Kerrigan, 1993.)