The Artifact Problem
If you read my previous post, you might reasonably be thinking: so what? We already know large language models can produce incorrect outputs. Most interfaces even remind us to double-check their responses for accuracy. And despite the limitations, they are undeniably useful.
I have integrated Claude Code into my professional workflow. It's been a process of discovery and refinement. I find myself uncovering novel use cases and grazing against the edges of its limitations constantly, and I'm just one person. It leaves me with this ambivalent sense of amazement and suspicion. What can it do!?! What can it do????
In 2023, a group of researchers published a paper titled Large Language Model Displays Emergent Ability to Interpret Novel Literary Metaphors.¹ They asked GPT-4 to generate interpretations of metaphors and poems, carefully selecting texts that were unlikely to appear in its training data. The GPT-4 responses were mixed with metaphor interpretations written by college students and evaluated by human judges. GPT-4's interpretations were consistently rated higher. A literary critic also reviewed the poem interpretations and judged them to be excellent or good.
I want to recognize the significance of this. The researchers designed the study specifically to rule out memorization. These weren't prompts GPT-4 could have seen before, and it didn't just perform adequately; it outperformed the human participants. The researchers called this an "emergent ability." Interpretation is an incredibly complex linguistic achievement, and the fact that an LLM can produce outputs we recognize as meaningful interpretation is genuinely impressive.
It's worth asking a more precise question, though: what would it actually take for something to interpret? Not just to produce output that reads like interpretation, but to do the thing we mean when we say "interpretation"?
In 1901, divers discovered what is now believed to be the world's oldest analog computer: the Antikythera mechanism.² Dating to ancient Greece, it was built to perform complex astronomical calculations and remains the only known device of its kind from that era. Archaeologists have been studying it for over a century, trying to understand what it does, how it works, and how it fit into the world that produced it.
This is what archaeologists do. They work with material remains and try to reconstruct the world that produced them. The object alone doesn't tell you enough; it's a compressed record of a context the archaeologist doesn't fully have access to. To understand it, they have to reconstruct the relationships, practices, and knowledge that gave it meaning in its original context.
Text works the same way; writing is an act of compression. As I draft this, I'm condensing all of my fully formed ideas and nebulous intuitions into something that, hopefully, another person can expand into meaning. Text is an artifact of human thought, and like any artifact, it is necessarily stripped of the generative context, lived experience, and situational knowledge that led to its creation.
This realization unsettles me, but the loss of context in written language is also what makes communication efficient. The compression works because writer and reader share a world.
One of my favorite philosophers is Ludwig Wittgenstein. Probably because he did something really unusual for a philosopher: he changed his mind. In his early work, the Tractatus, he argued that language was essentially a picture of reality, that the structure of a sentence maps onto the structure of facts. He then spent the rest of his career trying to dismantle that argument.
He landed on meaning as use. His idea was that words don't carry meaning inherently; meaning arises from the shared practices, contexts, and forms of life in which language is used. He called these "language games," not because they're trivial, but because, like games, they only work when the participants share the rules.
Anyone who has worked in a large legacy codebase has asked, "Why is this built this way?" Even when the original author is still around, the full context behind decisions fades quickly. Meaning as use is why you can look at a legacy codebase you've never seen before and still reconstruct enough meaning to work with it. You and the original author share a form of life: the practices, conventions, and pressures of building software. You're playing the same language game.
If you've onboarded Claude Code into your process, you may notice some tension here. Tools like Claude Code can also navigate unfamiliar codebases, and sometimes impressively well. They don't share a form of life with the original author, but the conventions of software are heavily externalized. Code may be one of the cases where the artifact retains enough structure that statistical reconstruction gets you really far.
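To make that concrete, here's a contrived sketch of my own; `fetch_with_retry` and everything in it is hypothetical, not pulled from any real codebase. You've never seen this function, but convention does most of the interpretive work for you:

```python
import random
import time

def fetch_with_retry(fetch, max_attempts=3, base_delay=0.5):
    """Call `fetch`, retrying on connection errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the failure to the caller
            # Exponential backoff with jitter: a pattern so conventional
            # that its purpose is legible without any documentation.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Every name there is invented, but the pattern it follows, retry with exponential backoff and jitter, is so heavily conventionalized that the artifact practically interprets itself. That's the externalized structure statistical reconstruction can exploit.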
But that's also what makes it misleading as a general case. Most human communication isn't standardized and pattern-rich in the same way that code is. Wittgenstein recognized that much of what matters in communication is subtext: implication, shared understanding, everything that isn't explicitly stated but shapes what the words mean.
LLMs have the tokens but not the shared form of life that gives those tokens meaning. They can model the record of the game, but not the playing of it. The LLM processes what was said. It misses everything that was shown through the saying of it.
This brings us back to the study. When human judges rated GPT-4's metaphor interpretations higher than the college students', the significance they found was attributed, not inherent. They were meaning-making through a particular language game, bringing their own experience to bear on artifacts.
The literary critic in the study had a slightly different task. He was asked to rate the AI-generated poem interpretations on their own, not in comparison to human-written text. The critic made a similar qualitative assessment but had an intuition that they were AI-generated. The output had the form of interpretation, but something was missing. The critic could feel the gap between what the text performed and what it lacked.
I should be honest here: I've made an ontological claim that I'm not fully sold on. The question of whether producing something functionally indistinguishable from interpretation is interpretation is genuinely open. I believe the difference is meaningful, but this is still actively debated. Either way, what happens between the output and the person reading it is worth examining on its own.
This is, ultimately, a recursive problem. The LLM produces output by reconstituting meaning from its training data, but it operates on artifacts without access to the world that produced them. Then we read that output and reconstitute it again, applying the same interpretive framework we use for human-generated language. We project understanding onto fluency because that's how meaning-making works between humans.
This is the artifact problem. The training data preserves the surface of communication, but not the lived practices that made that surface intelligible. LLMs work, but they work within the contours of what was written and preserved. And because we read their output with the same interpretive generosity we bring to human text, we fill the gap ourselves. The meaning we find in the output is the meaning we brought to it.
I do this every day when I use them. But when we don't acknowledge what they can't do, we default to giving them authority they haven't earned. And decisions grounded in coherence rather than correspondence have a way of looking right until they aren't.
This post, the one you're reading right now, is also an artifact.
I'm writing this for many reasons. Some are intellectual. I genuinely want to externalize these ideas and see if they hold up. Some are professional. I work in this space and thinking clearly about it makes me better at my job. And some of my motivations are likely unconscious, even to me.
There is probably subtext here that makes you weight one of those motivations over the others.
You've been doing the thing I just described, this entire time, with this text. So has every person who's ever read anything.
In this particular instance, there's someone on the other side with all of the context that didn't make it onto the page, who is playing the same language game you are.
1. Ichien, Nicholas, et al. "Large Language Model Displays Emergent Ability to Interpret Novel Literary Metaphors." arXiv preprint arXiv:2308.01497 (2023).
2. "Antikythera mechanism." Wikipedia.