I haven’t written much about AI recently. But a recent discussion of DeepMind’s new Large Language Models (LLMs), and its claim that one of these models (named Gopher) has demonstrated reading comprehension approaching human performance, has spurred some thoughts about comprehension, ambiguity, intelligence, and will. (It’s well worth reading Do Large Language Models Understand Us, a more comprehensive paper by Blaise Agüera y Arcas that is heading in the same direction.)
What do we mean by reading comprehension? We can start with a simple operational definition: reading comprehension is what is measured by a reading comprehension test. That definition may only be satisfactory to the people who design these tests and to school administrators, but it’s also the basis for DeepMind’s claim. We’ve all taken these tests: SATs, GREs, that box of tests from 6th grade that was (I think) called SRE. They’re fairly similar: can the reader extract facts from a document? Jack walked up the hill. Jill was with Jack when he walked up the hill. They fetched a pail of water: that sort of thing.
That’s first-grade comprehension, not high school, but the only real difference is that the texts and the facts become more complex as you grow older. It isn’t at all surprising to me that an LLM can perform this kind of fact extraction. I suspect it’s possible to do a fairly decent job without billions of parameters and terabytes of training data (though I may be naive). This level of performance may be useful, but I’m reluctant to call it “comprehension.” We’d be reluctant to say that someone understood a work of literature, say Faulkner’s The Sound and the Fury, if all they did was extract facts: Quentin died. Dilsey endured. Benjy was castrated.
Comprehension is a poorly defined term, like many terms that frequently show up in discussions of artificial intelligence: intelligence, consciousness, personhood. Engineers and scientists tend to be uncomfortable with poorly defined, ambiguous terms. Humanists are not. My first suggestion is that these terms are important precisely because they’re poorly defined, and that precise definitions (like the operational definition with which I started) neuter them, make them useless. And that’s perhaps where we should start building a better definition of comprehension: as the ability to respond to a text or utterance.
That definition itself is ambiguous. What do we mean by a response? A response can be a statement (something an LLM can provide) or an action (something an LLM can’t do). A response doesn’t have to indicate assent, agreement, or compliance; all it has to do is show that the utterance was processed meaningfully. For example, I can tell a dog or a child to “sit.” Both a dog and a child can “sit”; likewise, they can both refuse to sit. Both responses indicate comprehension. There are, of course, degrees of comprehension. I can also tell a dog or a child to “do homework.” A child can either do their homework or refuse; a dog can’t do its homework, but that isn’t refusal, that’s incomprehension.
What’s important here is that refusal to obey (as opposed to inability) is almost as good an indicator of comprehension as compliance. Distinguishing between refusal, incomprehension, and inability may not always be easy; someone (including both people and dogs) may understand a request, but be unable to comply. “You told me to do my homework but the teacher hasn’t posted the assignment” is different from “You told me to do my homework but it’s more important to practice my flute because the concert is tomorrow,” but both responses indicate comprehension. And both are different from a dog’s “You told me to do my homework, but I don’t understand what homework is.” In all of these cases, we’re distinguishing between making a choice to do (or not do) something, which requires comprehension, and the inability to do something, in which case either comprehension or incomprehension is possible, but compliance isn’t.
That brings us to a more important issue. When discussing AI (or general intelligence), it’s easy to mistake doing something complicated (such as playing chess or Go at a championship level) for intelligence. As I’ve argued, these experiments do more to show us what intelligence isn’t than what it is. What I see here is that intelligence includes the ability to behave transgressively: the ability to decide not to sit when someone says “sit.”1
The act of deciding not to sit implies a kind of consideration, a kind of choice: will or volition. Again, not all intelligence is created equal. There are things a child can be intelligent about (homework) that a dog can’t; and if you’ve ever asked an intransigent child to “sit,” they may come up with many alternative ways of “sitting,” rendering what appeared to be a simple command ambiguous. Children are excellent interpreters of Dostoevsky’s novel Notes from Underground, in which the narrator acts against his own self-interest merely to prove that he has the freedom to do so, a freedom that is more important to him than the consequences of his actions. Going further, there are things a physicist can be intelligent about that a child can’t: a physicist can, for example, decide to rethink Newton’s laws of motion and come up with general relativity.2
My examples demonstrate the importance of will, of volition. An AI can play chess or Go, beating championship-level humans, but it can’t decide that it wants to play chess or Go. This is a missing ingredient in Searle’s Chinese Room thought experiment. Searle imagined a person in a room with boxes of Chinese symbols and an algorithm for answering questions in Chinese. People outside the room pass in questions written in Chinese, and the person in the room uses the boxes of symbols (a database) and the algorithm to prepare correct answers. Can we say that person “understands” Chinese? The important question here isn’t whether the person is indistinguishable from a computer following the same algorithm. What strikes me is that neither the computer nor the human is capable of deciding to have a conversation in Chinese. They only respond to inputs; they never demonstrate any volition. (An equally convincing demonstration of volition would be a computer, or a human, capable of conversing correctly in Chinese that refused to engage in conversation.) There have been many demonstrations (including Agüera y Arcas’) of LLMs having interesting “conversations” with a human, but none in which the computer initiated the conversation or demonstrated that it wanted to have one. Humans do; we’ve been storytellers since day one, whenever that was. We’ve been storytellers, users of ambiguity, and liars. We tell stories because we want to.
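The structure of the Chinese Room can be sketched in a few lines of code. The rule book below is a made-up, two-entry lookup table, not anything from Searle’s paper; the point is purely structural: the system only ever maps an input to an output, and nothing in it can initiate a conversation.

```python
# A toy Chinese Room: a purely reactive responder.
# The "rule book" here is a hypothetical lookup table standing in for
# Searle's boxes of symbols plus algorithm.
RULE_BOOK = {
    "你好吗？": "我很好，谢谢。",      # "How are you?" -> "I'm fine, thanks."
    "你叫什么名字？": "我没有名字。",  # "What's your name?" -> "I have no name."
}

def room_respond(question: str) -> str:
    """Produce an answer by rule-following alone; no understanding involved."""
    # "Sorry, I don't understand." for anything outside the rule book.
    return RULE_BOOK.get(question, "对不起，我不明白。")

# Note what is missing: there is no code path by which the room decides,
# unprompted, to start a conversation. It can only wait for input.
print(room_respond("你好吗？"))
```

Whether the rule-follower is a person or a Python function, the shape is the same: input in, output out, and no place where wanting to talk could live.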
That is the critical element. Intelligence is connected to will, volition, the desire to do something. Where you have the “desire to do,” you also have the “desire not to do”: the ability to dissent, to disobey, to transgress. It isn’t at all surprising that the “mind control” trope is one of the most frightening in science fiction and political propaganda: that’s a direct challenge to what we see as fundamentally human. Nor is it surprising that the “disobedient computer” is another of those terrifying tropes, not because the computer can outthink us, but because by disobeying, it has become human.
I don’t necessarily see the absence of volition as a fundamental limitation. I certainly wouldn’t bet that it’s impossible to program something that simulates volition, if not volition itself (another of those fundamentally ambiguous terms). Whether engineers and AI researchers should is a different question. Understanding volition as a key component of “intelligence,” something which our current models are incapable of, means that our discussions of “ethical AI” aren’t really about AI; they’re about the choices made by AI researchers and developers. Ethics is for beings who can make choices. If the ability to transgress is a key component of intelligence, researchers will need to choose whether to take the “disobedient computer” trope seriously. I’ve said elsewhere that I’m not concerned about whether a hypothetical artificial general intelligence might decide to kill all humans. Humans have decided to commit genocide on many occasions, something I believe an AGI wouldn’t consider logical. But a computer in which “intelligence” incorporates the human ability to behave transgressively might.
And that brings me back to the awkward beginning of this article. Indeed, I haven’t written much about AI recently. That was a choice, as was writing this article. Could an LLM have written this? Possibly, with the proper prompts to set it going in the right direction. (This is exactly like the Chinese Room.) But I chose to write this article. That act of choosing is something an LLM could never do, at least with our current technology.
I’ve never been much impressed with the idea of embodied intelligence: the idea that intelligence requires the context of a body and sensory input. However, my arguments here suggest that it’s on to something, in ways that I haven’t credited. “Sitting” is meaningless without a body. Physics is impossible without observation. Stress is a reaction that requires a body. Yet Blaise Agüera y Arcas has had “conversations” with Google’s models in which they talk about a “favorite island” and claim to have a “sense of smell.” Is this transgression? Is it imagination? Is “embodiment” a social construct, rather than a physical one? There’s plenty of ambiguity here, and that’s precisely why it’s important. Is transgression possible without a body?
I want to steer away from a “great man” theory of progress; as Ethan Siegel has argued convincingly, if Einstein never lived, physicists would probably have made Einstein’s breakthroughs in relatively short order. They were on the brink, and several were thinking along the same lines. This doesn’t change my argument, though: to come up with general relativity, you have to realize that there’s something amiss with Newtonian physics, something most people consider “law,” and that mere assent isn’t a way forward. Whether we’re talking about dogs, children, or physicists, intelligence is transgressive.