Elan Barenholtz and Nikita Storozhenko
“The sun-kissed waves dance beneath the boat, their turquoise and deep blue hues intermingling to create an ever-changing tapestry of color. The salty sea air fills your lungs, invigorating your senses and bringing with it the aroma of sun-drenched earth and the tang of saltwater. The rhythmic splash of the oars against the water is the only sound that breaks the peaceful silence, creating a soothing and meditative ambiance. The sun beats down upon you, warming your skin, as you glide across the sparkling surface of the sea, feeling the power and grace of your muscles as you propel the boat forward. Rowing in the wine-dark Mediterranean Sea is a timeless experience that connects you with the beauty of nature and fills you with a sense of wonder and serenity.”
—Gaullius Pablo Titanius, Third of his Name, in Liminal Voyage Reflections (circa 253 B.C.)
What goes through your mind as you read the above passage? Most likely, it conjures up another world, filled with vibrant colors, feelings, and sounds. Perhaps you can even smell the ocean air and feel the cool breeze on your skin. Titanius has transported you, the reader, to a different time and place. Or, looked at another way, he has transported you into his own mind. This is the magic of human language.
Except, the above language is not human. Gaullius Pablo Titanius and his quote are the creations of GPT-3.5—a large language model, or LLM—generated in response to a carefully worded prompt. Demonstrations such as this show that the ability to generate rich and meaningful sensory language does not depend on access to any actual sensory data. Many LLMs, including all versions of GPT prior to the most recent, GPT-4, are trained exclusively on text. They cannot see or hear or feel but, with the right prompting, they can speak as if they do.
GPT’s lack of sensory data likely precludes it from having the kind of subjective experience (i.e., qualia) typically associated with such data. That is, GPT has never seen turquoise or tasted salty air, and therefore has presumably never experienced what it ‘is like’ (to use Thomas Nagel’s phrase) to be in those sensory states. (Of course, GPT may have other kinds of non-sensory qualia, though these would likely be fundamentally unlike our own, which are inherently sensory.) But, despite lacking sensory qualia, GPT is able to talk as if it has them. Here is its response when asked to describe, from a human standpoint, the experience of encountering a rose:
The experience of interacting with a rose is truly a feast for the senses. The first thing that strikes me is the sight of its velvety petals in various shades of red, pink, and white, which seem to glow in the sunlight. The delicate curves and folds of the petals are so intricate and mesmerizing that it's hard to take my eyes off of them. As I reach out to touch the rose, I am hit with the softness of its petals against my fingertips. They feel almost weightless and yet so gentle and delicate that it's as if I am touching something ethereal. The texture is silky smooth, and the sensation of the petals brushing against my skin is so soothing that I feel a sense of calm washing over me. As I bring the rose closer to my face, I am enveloped by its sweet fragrance. The scent is so alluring and intoxicating that it's hard not to be completely captivated by it. The aroma fills my senses, and I feel as though I am being transported to another world. As I take a deep breath, I am hit by the complexity of the rose's aroma. The sweet floral notes mingle with a hint of spice, and there's even a subtle hint of citrus that adds a refreshing tang. It's as if the rose is not just a flower, but an entire universe of scents and sensations.
As recently as several months ago, before the release of ChatGPT, virtually anyone on the planet would have assumed that the author of those words must have full subjective phenomenology, as rich as any human’s. But this is an illusion: GPT is using the internal structure of language itself to perform the trick of generating phenomenological language—without the phenomenology. Here is what GPT itself says:
My understanding of words is based on the patterns, relationships, and contexts I have learned from the vast amount of text data during my training. I can recognize and generate coherent responses based on these learned patterns, but my "understanding" is limited to the statistical associations between words and their contexts.
In other words, GPT’s understanding of language, including sensory language, is grounded in the patterns among the words themselves, not in anything external to the words.
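To make ‘patterns among the words themselves’ concrete, consider the following minimal, self-contained sketch of distributional semantics in Python. This is a deliberately toy version of the idea, not GPT’s actual architecture; the corpus, function names, and window size are all our own illustrative assumptions. The point is that two color words can come out as ‘similar’ purely from the company they keep in text, with no sensory grounding whatsoever.

```python
# A toy illustration of purely distributional "meaning": a word's vector is
# built from the words that co-occur with it, with no sensory grounding.
# (Illustrative only; actual LLMs learn far richer contextual representations.)
from collections import Counter
from math import sqrt

corpus = [
    "the red rose has soft petals",
    "the red sunset glows over the sea",
    "the green leaf has soft edges",
    "the green meadow glows in spring",
]

def cooccurrence_vector(target: str, window: int = 2) -> Counter:
    """Count the words appearing within `window` positions of `target`."""
    vec = Counter()
    for sentence in corpus:
        words = sentence.split()
        for i, w in enumerate(words):
            if w == target:
                for j in range(max(0, i - window), min(len(words), i + window + 1)):
                    if j != i:
                        vec[words[j]] += 1
    return vec

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 'red' and 'green' end up with similar vectors because they occur in similar
# textual contexts -- the system treats them as alike without ever seeing a color.
print(cosine(cooccurrence_vector("red"), cooccurrence_vector("green")))  # ~0.75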
We term an entity with the capacity to generate phenomenal language without phenomenology a ‘linguistic zombie’, or ‘L-zombie’ for short; an L-zombie can produce all of the linguistic behaviors typically associated with sensory qualia in humans while lacking any such qualia. L-zombies are a variation on philosophical zombies, or ‘P-zombies’—popularized by David Chalmers—which are hypothetical beings that generate all of the behaviors of a fully conscious human without any phenomenal consciousness. Prior to the arrival of LLMs, the existence of an L-zombie (or P-zombie) was entirely theoretical. The only beings that could produce behaviors consistent with sensory qualia (i.e., us humans) all had—as far as we know—actual qualia. But GPT and other LLMs provide direct evidence that L-zombies are not only possible but fully realized.
The demonstrated existence of L-zombies via LLMs increases the likelihood that others may exist as well—in particular, us. Could the human language system also be an L-zombie? To be clear, we are not denying that human language is functionally coupled with sensory processes. The use of sensory language is often caused by the sensory perception of certain properties of the world; when we see a rose, we reliably call it ‘red’. Nor are we denying the existence of subjective phenomena associated with that perception: when we see a rose and call it ‘red’, we also experience subjective redness. What we are proposing is that the perceptual system, which generates qualia, is distinct from the linguistic system, which does not.
Consider an approach to incorporating LLMs into robots in which the robot’s sensors detect the properties of a scene, such as object names, colors, and textures. This sensory system outputs a text description based on the visual information (e.g., ‘there is a rose in front of you’), which is then fed to an LLM ‘brain’ that incorporates it into its processing. In this case, it is clear that the LLM itself is amodal: it is fed linguistic information, generated independently by the sensory system, but it does not engage with any sensory data directly. We propose that human language operates in just this kind of system of coupled modules. Our senses, in addition to their other functions, ‘tip off’ our linguistic system about the properties of the environment around us. The rich sensory processing at the front end—the kind that gives rise to qualia—is invisible to the linguistic system, which simply receives the perceptual end product.
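As a concrete (and purely hypothetical) sketch of such a coupled-modules system, the following Python code illustrates the information flow described above. The class names, the hard-coded scene description, and the placeholder response logic are all our own illustrative assumptions; a real system would run vision models and an actual LLM in their place. What matters is the interface: the language module receives only text.

```python
# A hypothetical sketch of the coupled-modules architecture described above.
# All names here (SceneProperties, PerceptionModule, LanguageModule) are
# illustrative inventions, not an actual robotics or LLM API.

from dataclasses import dataclass


@dataclass
class SceneProperties:
    """The end product of sensory processing: discrete, nameable properties."""
    object_name: str
    color: str
    texture: str


class PerceptionModule:
    """Stands in for the sensory system. Its internal processing (the kind
    that, in us, gives rise to qualia) is opaque; it emits only text."""

    def describe(self, raw_sensor_data: bytes) -> str:
        # A real robot would run vision models over the sensor data here.
        # We hard-code the end product purely for illustration.
        props = SceneProperties(object_name="rose", color="red", texture="velvety")
        return f"There is a {props.color}, {props.texture} {props.object_name} in front of you."


class LanguageModule:
    """Stands in for the amodal LLM 'brain': it receives only text and never
    touches the sensory data directly."""

    def respond(self, description: str) -> str:
        # A real system would call an LLM; this placeholder just echoes the
        # input to make the one-way, text-only information flow explicit.
        return f"Noted: {description} Planning my next action accordingly."


# The coupling: the sensory system 'tips off' the linguistic system.
perception = PerceptionModule()
language = LanguageModule()
description = perception.describe(raw_sensor_data=b"<camera frame>")
print(language.respond(description))
```

Note that nothing about the rose’s redness, beyond the token ‘red’, crosses the boundary between the two modules; the language module’s competence with that token rests entirely on text-internal associations of the kind sketched earlier.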
According to this view, our linguistic system does not know what ‘red’ means from a phenomenological standpoint at all. It can’t, because such knowledge is the exclusive domain of the sensory system. To use the quote from GPT above, ‘red’, like all other words, derives its meaning from “the statistical associations between words and their contexts”, not from the sense data. To the extent that ‘red’ means phenomenological redness to us, this is because our sensory system has experienced frequent conjunctions of red visual qualia and the word ‘red’ as auditory qualia; but this is simply not what ‘red’, as a language token, means to our internal, LLM-like zombie. The linguistic term, as a public language token, communicates certain properties of the environment to others. It does not communicate our subjective phenomenology.
While perhaps counterintuitive at first, this modular approach is highly consistent with many neuropsychological dissociations between perception and perceptually derived language. For example, ‘split-brain’ patients are individuals whose corpus callosum—the tissue that connects the two brain hemispheres—has been severed. Empirical research shows that language is usually largely centered in the left hemisphere. When a split-brain patient is shown an image in their left hemifield (which sends information to the ‘mute’ right hemisphere), they are typically unable to verbally report what they have been shown; at the same time, they are able to reproduce it by drawing it with their left hand (which is controlled by the right hemisphere). When asked why they drew the correct object, such patients may say they don’t know, or they may confabulate an explanation.
Another perceptual-linguistic dissociation occurs in visual agnosia, in which an individual’s perceptual processing is intact while their ability to identify or recognize what they perceive is severely compromised. An even more extreme case is cortical blindness, which typically occurs when a region of the brain tasked with visual processing is damaged. These patients usually report complete blindness in the affected spatial region; however, careful behavioral tests demonstrate that some perceptual function is nonetheless preserved, a phenomenon known as blindsight.
These neuropsychological conditions demonstrate that the linguistic system of the brain is distinct from the sensory system. In general, these two systems operate in sync, producing the constant conjunction of words and qualia. But the fact that these systems are dissociated in some patients suggests that they are distinct and modular—although coupled—even in intact brains. In other words, we are all ‘split brains’, containing separate linguistic and sensory systems that operate cooperatively but also somewhat autonomously.
The division between linguistic and sensory systems may underlie many of the historical aporias we collectively group under the label of the ‘mind-body problem’. The gap between conscious experience and our explanatory mechanisms may be due to the fact that our explanatory mechanisms are linguistic in nature while our experiential systems are sensory in nature. Thinking and reasoning about things like physics and brains and objective reality is a linguistic (i.e., zombie) process. Seeing and hearing and feeling is a non-linguistic (and non-zombie) process. Hence, the metaphysical problems boil down to problems of incommunicability between the sensory and the linguistic systems—they are of different orders and kinds. Or, put another way: what communicates is not what feels, and what feels is not what communicates.
For example, consider the puzzle of inverted color qualia. Is it possible that my ‘red’ is your ‘green’ and vice versa? According to our view, this question is unresolvable because our language contains no information about the quality of the subjective experience. The word ‘red’ is simply the end-product of a perceptual process whose internal structure is completely unavailable to the linguistic system. While we can agree that a given object is red for the purposes of publicly naming it, the associated qualia cannot be expressed, at all, in language: what communicates is not what feels and what feels is not what communicates.
A similar approach can help demystify the so-called Mary’s Room puzzle posed by Frank Jackson. Mary is a scientist who has studied all there is to know about color and color perception while spending the first several decades of her life in a black-and-white environment. Eventually she is released and sees colors for the first time. Does she learn anything new that she didn’t already know from her scientific study? According to our view, the resolution to this question is straightforward. Mary’s sensory/perceptual system has a new experience of color qualia that it has never had before. However, Mary’s linguistic system, which houses her scientific knowledge, has not learned anything new within its own self-defined system. When Mary is released from the room, she is able to couple her linguistic knowledge with the sensory phenomena for the first time. But the linguistic knowledge itself is untouched and unchanged by this coupling.
More broadly, this approach may help us to understand (although not solve) the so-called hard problem of consciousness: how is it that a quantitative physical state is able to instantiate a qualitative mental one—in other words, how is it that our brain is able to instantiate our phenomenal consciousness, the what-it-is-like to be us? We maintain that the hard problem arises because our conceptual framework of a physical universe is a linguistic construct. As such, it is inherently unequipped to reason about the sensory system’s experiential properties, which language does not capture. This is not to say that the emergence of consciousness is not a mystery; but it is only a mystery to our linguistic selves. Because that self is linguistic, it attempts to explain phenomenal consciousness in terms of its own internal logic. But this is a doomed enterprise. Wittgenstein said, ‘Whereof one cannot speak, thereof one must be silent.’ If our linguistic system is, as we have proposed, an L-zombie, it can only be silent about the subjective properties we experience through our senses.
Of course, we may not be linguistic zombies after all; we may use language very differently than LLMs do, in a way that inherently grounds it in subjective phenomena. The computational equivalence between neural networks and humans is—and likely will remain—a contentious issue. However, the human-like performance of these models, coupled with their biologically inspired computational structure, means that we must consider the possibility that the human and artificial systems are computationally similar or equivalent. If, as we now know, linguistic competence is possible based on learning the patterns within language itself, Occam’s razor would seem to dictate that humans have converged on this same solution. If so, and we are indeed L-zombies, the philosophical implications may be too profound to speak of.
Copyright © 2023 Elan Barenholtz for the Center for the Future Mind - All Rights Reserved.