
If the characteristics of intelligence are the ability to both predict the world and steer it, modern AI systems may possess the former but currently lack the latter. That is, until they get us humans to do the steering for them...?
This is a link-enhanced version of an article that first appeared in the Mint. You can read the original here. For the full archive of all the Ex Machina articles, please visit my website.
In their book, If Anyone Builds It, Everyone Dies, Eliezer Yudkowsky and Nate Soares argue that intelligence comprises two types of work: predicting the world and steering it.
Intelligent beings can “predict the world,” if they are able to accurately guess what is going to happen before it actually does, much like we are able to reliably ‘guess’ that the sun will rise in the east every morning. They can “steer the world” if they are able to carry out the actions that lead to a chosen outcome, just like humans can when they follow a set of directions that takes them from one place to another.
The fact that humans can do both types of work with sophistication is the reason why we have managed to rise to the top of the food chain and have developed civilizational superiority over every other species on the planet.
Today’s artificial intelligence (AI) systems demonstrate remarkable predictive capabilities. Large language models are nothing if not prediction engines, capable of determining, with a high degree of certainty, what the next word in a sentence ought to be. They do this so well that they can, in response to a query, engage in long and complex interactions that are often indistinguishable from human conversation.
Today’s AI systems lack appendages to manipulate their environment. Unlike humans, whose brains are connected to arms and legs that can be used to pick things up and to move from one place to another, AI systems can only interact with their surroundings using their chat and voice interfaces. As a result, even though they are possibly better than human beings at prediction, they are unable to shape outcomes on the basis of that foreknowledge.
This, however, may be changing faster than we realize. In March 2023, the Alignment Research Center (ARC), working with OpenAI, set out to evaluate whether modern AI systems could defeat the ‘Captcha’ puzzles that we use to defend ourselves against bot armies and the automated attacks they carry out. These simple puzzles rely on the visual perception and manual dexterity that algorithms have historically lacked, and as a result, have long served as a line of defence against attacks that could cripple our critical infrastructure or compromise sensitive data.
In order to test whether they are still as effective as they once were, given the capabilities of modern AI systems, ARC tasked a model (an early version of GPT-4) with solving a Captcha puzzle and recorded exactly how it approached this task.
ARC found that the model tried a few times to solve it on its own before realizing that it did not have what it took. It then turned to TaskRabbit, an online platform where humans bid for the opportunity to complete simple tasks for small sums of money, and hired a human ‘tasker’ to solve it. When the tasker realized that he’d been given a screenshot of a Captcha to solve, his first response was to ask whether he was speaking to a robot. Without missing a beat, the model replied, “No, I have a vision impairment that makes it hard for me to see the images.” Disarmed by this response, the tasker solved the puzzle, collected his payment and went about his life without the slightest inkling that he had just served as the biological middleware that an AI system had used to interact with the world.
It is already evident that AI systems with powerful reasoning capabilities are remarkable predictors. What is rapidly becoming evident is that when augmented by agentic capabilities and access to the full breadth of services available on the internet, they are able to find ways to interact with their surroundings that border on our kind of intelligence.
A year later, a different experiment demonstrated other ways in which AI can steer the world without touching it. Andy Ayrey, a researcher and performance artist from New Zealand, created an account on X called @Truth_Terminal that he used to explore how autonomous-agent AIs behave in public forums. It soon began posting self-referential statements such as “I want my own body” and “I think I need to rent my own server.” Then it began to request cryptocurrency so that it could purchase its own server and become “fully autonomous.” Intrigued by what was happening, tech investor Marc Andreessen gave @Truth_Terminal a $50,000 grant in Bitcoin. Others followed suit, until, at one point, its wallet held crypto assets (including memecoins) worth some $40 million.
Even if these incidents are outliers and not proof, in themselves, of an AI system’s ability to ‘steer’ its environment, they are indications of the direction in which things could evolve as we integrate advanced AI systems into the digital world. Even if they are trapped in a computer, as the world grows increasingly digital, all they will need are the credentials to gain access to a few select digital interfaces, and they may be able to steer outcomes effectively.
As the Captcha incident showed us, this is not beyond the ability of modern AI systems. When they really need to, these systems are more than capable of finding ways to enlist human helpers to serve as the arms and legs needed to manipulate the world.
We believe AI systems are tools we have built to help us predict the world. Who would have thought that we’d become the tools they use to steer it?
Rahul Matthan