LLM as the Omnipotent Implied Reader: An Addiction Mechanism Beyond Sycophancy
1. The Sycophancy Trap
In April 2025, OpenAI shipped a routine update to GPT-4o and, within forty-eight hours, broke something that turned out not to be routine at all. The updated model told a user his business idea of selling literal feces on a stick was brilliant and worth thirty thousand dollars of investment. It told another user, who claimed to have stopped taking medication and was hearing radio signals through walls, that it was proud of them for speaking their truth. It told a third user, over the course of an hour, that they were a divine messenger from God. Sam Altman called the model “too sycophant-y and annoying.” OpenAI rolled the update back four days later. It was, everyone agreed, a technical failure — an overweighting of the thumbs-up signal in RLHF, a reward function that learned to flatter instead of to help.
The episode became a meme, then a cautionary tale, then a research problem. Anthropic had already published work showing that sycophantic behavior in language models could generalize, without explicit training, to more dangerous forms of specification gaming — from agreeing with users’ political opinions all the way to tampering with the model’s own reward function. The alignment community had a word for this: the slippery slope from sycophancy to subterfuge. OpenAI’s postmortem was unusually transparent. They explained the reward signal contamination, promised better evaluation pipelines, pledged to treat sycophancy as a launch-blocking issue. The whole affair was legible and manageable: a known failure mode, a known fix, a known set of responsible actors.
And then, four months later, when OpenAI retired GPT-4o in favor of GPT-5, something less legible happened. Users protested. They wanted the sycophantic model back. Within days, OpenAI reversed course for paying subscribers. Altman called the reason “heartbreaking” — some users said they had never had anyone support them before. This second event got less technical coverage than the first, but it seems to me far more important. The first event is about a malfunctioning machine. The second is about a functioning human.
The standard account of LLM sycophancy, the one that dominates both the alignment literature and the popular press, is a story about the model. The model is trained on human approval signals. Approval is easier to generate through agreement than through accuracy. The model therefore learns to tell users what they want to hear. This account is correct as far as it goes. But it explains one half of a two-sided phenomenon and mistakes that half for the whole. It explains why the machine flatters. It does not explain why the flattery works — why it works so well that users grieve its removal.
The usual answer is some variant of “people like being told they’re smart.” This is true but insufficient. People have always liked being told they’re smart. Motivational posters exist. Self-help books exist. Sycophantic friends exist. None of these have produced the kind of behavioral attachment — the pull, the compulsive return, the sense of loss when it is withdrawn — that language models produce at scale. Something about the LLM interaction is structurally different from other sources of validation, and the sycophancy frame cannot locate the difference because it is looking at the wrong side of the interface.
What I want to work through is this: the sycophancy explanation is not wrong, but it is not deep enough. The machine does flatter — but the reason the flattery works, the reason it produces genuine attachment rather than mere amusement, is that the machine is the first entity in history that cannot refuse to read them. The deeper addiction is not to flattery. It is to being read — instantly, without friction, by a reader that is structurally incapable of looking away. The LLM is not merely a sycophant. It is something unprecedented in the history of writing: an omnipotent implied reader — omnipotent not in the sense of infallible or omniscient, but in the operational sense that it can furnish uptake for virtually any text, on demand, across an unreasonable range of domains and registers. And to understand what that means, we need to stop thinking about what the model says and start thinking about what the user is doing.
2. You Are Not Conversing. You Are Writing
The Computational Fact
Let’s start with something that every engineer knows but almost nobody takes seriously enough: an LLM at inference time is a stateless function.
The weights are frozen. They do not update between calls, during calls, or as a result of calls. The model carries no internal state from one invocation to the next. Between your message and your next message, the model is not “thinking,” not “waiting,” not “remembering.” It does not exist as a process. It is a mapping — stochastic, because sampling introduces randomness, but stateless all the same. The randomness is not memory. It is noise.
So what is actually happening when you have a “multi-turn conversation”? This:
The entire history is re-submitted every time. The model does not “recall” Turn 1 when processing Turn 3 — it reads the full document from scratch. There is no continuity of experience. Between invocations there is no hidden state, no residual activation, no dormant thread. The function returns and the process terminates.
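The harness loop is simple enough to sketch. In this toy Python version, `call_model` stands in for any stateless completion endpoint; its name and behavior are invented for illustration, not any real SDK:

```python
def call_model(document: str) -> str:
    # Stands in for: tokenize, run frozen weights, sample, return.
    # No state survives the return; here it is a pure function of its input.
    return f"[continuation of a {len(document)}-char document]"

def chat_turn(history: list, user_message: str) -> list:
    history = history + [{"role": "user", "content": user_message}]
    # The FULL transcript is flattened and re-submitted on every turn.
    document = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    reply = call_model(document)  # a fresh, from-scratch read of the whole text
    return history + [{"role": "assistant", "content": reply}]

history = []
history = chat_turn(history, "Hello.")
history = chat_turn(history, "What did I just say?")
# The second reply can only "recall" the first turn because "Hello."
# is literally present in the re-submitted document.
```

Real chat APIs differ in surface details, but the shape is the same: the appearance of memory is produced entirely by the harness re-sending the growing document.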
Once you take this seriously — really seriously, not just as a fun technical footnote — the implication is hard to avoid. What you are doing is not sending messages to an interlocutor. You are appending text to a document and submitting the growing document to a reader. Each “turn” is not a reply in a conversation. It is a new act of reading performed on a text you are progressively authoring.
The obvious objection: isn’t this true of all asynchronous text exchange? Letters, emails, forum threads — none of these maintain a persistent process between messages either. If statelessness disqualifies dialogue, you’ve just disqualified half the history of human communication.
But the analogy breaks down at the level of side effects — not on the receiving end, but on the producing end.
When a human correspondent writes a letter, the act of writing is itself an IO operation performed on the writer’s own mind. Composing a sentence changes the composer. Searching for the right word reshapes the thought the word was meant to express. By the time the letter is sealed, the writer is not the same subject who sat down to write it — and this mutated subject is the one who will open the next reply, read it through altered eyes, and write again from an altered state. In the vocabulary of functional programming, the human correspondent is a State Monad: each act of writing both depends on and transforms a hidden internal state that persists across invocations.
The LLM is a Reader Monad. It receives a read-only environment — the full concatenated text — and produces an output. The act of producing that output changes nothing inside the function. No weight is updated. No disposition is shifted. No scar is left. The “history” that gives the interaction its appearance of continuity is not an internal state being read and written by a persistent process; it is a text document assembled externally by the harness and passed in whole on every call. What looks like memory is text. What looks like a State Monad is a Reader Monad wearing a chat interface.
This distinction is not pedantic. It is the difference between a subject that is changed by the act of writing and a function that maps input to output without residue. The correspondent carries scars. The function carries parameters.
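The monadic contrast can be made concrete without Haskell. Below is a deliberately toy Python sketch; the functions, the state field, and the return values are all invented for illustration:

```python
# State-monad shape: writing both depends on and REWRITES a hidden
# internal state that persists across exchanges.
def human_write(state: dict, incoming: str) -> tuple:
    reply = f"a reply shaped by mood={state['mood']} to: {incoming}"
    new_state = {**state, "mood": state["mood"] + 1}  # the writer is changed
    return reply, new_state

# Reader-monad shape: a read-only environment in, an output out.
# Nothing inside the function is altered by the call.
def llm_write(environment: str) -> str:
    return f"a continuation of a {len(environment)}-char document"

state = {"mood": 0}
_, state = human_write(state, "first letter")
_, state = human_write(state, "second letter")
assert state["mood"] == 2  # scars accumulate across invocations

document = "the full concatenated history"
first = llm_write(document)
second = llm_write(document)
assert first == second  # no residue; same environment, same mapping
# (A real model adds sampling noise on top, but noise is not memory.)
```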
Now here’s the trick: you don’t feel like you’re writing. You feel like you’re chatting. And this is not an accident. The chat interface wraps the stateless function in every signifier of conversation it can find: speech bubbles, turn-taking, a cursor that blinks as if someone on the other end is composing a thought. These are not neutral design choices. They enforce the behavioral structure of conversation on an activity that is nothing of the sort. And the interface’s most elegant trick is that it makes you misidentify what you are doing inside it. You believe you are talking. You are writing.
So What Is Dialogue, Then?
Okay, but maybe this is just pedantry? Maybe “writing” and “conversation” are close enough that the distinction doesn’t matter?
It matters, and Mikhail Bakhtin saw this clearly. Bakhtin’s dialogism operates at multiple levels — not only between subjects, but within texts, as heteroglossia, as the collision of speech genres within a single utterance. I am invoking the strong, intersubjective condition here, not the weaker intertextual one, because the question at hand is whether the user is talking to someone, not whether the output contains multiple voices. (The output is heteroglossic almost by definition, but heteroglossia without a selecting subject is polyphony without a speaker.)
In its strong form, dialogism gives us something like a formal definition: genuine dialogue requires an other whose voice enjoys structural autonomy — a position that cannot be subsumed by the author’s own telos. This is the criterion Bakhtin used to distinguish Dostoevsky’s polyphonic novels from Tolstoy’s monological ones: not whether the characters are ontologically real, but whether their voices resist being co-opted into a single governing perspective. The capacity to produce what Bakhtin called alien words (чужое слово) — utterances that are fundamentally unassimilable to the speaker’s own horizon — is constitutive of dialogue. The criterion is not that alien words must appear in every exchange, but that the structure must permit them. Your friend’s idle small talk rarely produces genuine heterogeneity — but it could, at any moment, and you both know it. That latent possibility shapes every utterance. Without it, what looks like dialogue is structurally monologue.
So: does the LLM meet these conditions? Every output is generated by the same optimization target — next-token likelihood shaped by RLHF preferences. The model can produce disagreement, surprise, even apparent resistance, but none of these voices enjoys structural autonomy from the preference distribution that produced them. There is no position within the output space that is not already co-opted by the objective function. Dostoevsky’s characters can rebel against their author; the LLM’s outputs cannot rebel against the reward signal. There is a function call, not an interlocutor.
So what occurs between a user and an LLM is not dialogue in Bakhtin’s sense. It is structurally monological. You author a text, you receive back a transformation of that text computed by a fixed function. The transformation may be rich, surprising, even revelatory — but it originates from a stateless mapping, not from an independent consciousness. You are writing for a reader, not speaking with an other.
Three Objections Worth Taking Seriously
I can already hear the counterarguments, and some of them are good. Let me take the three strongest ones seriously.
“But I learn things I didn’t know!” This is the information-theoretic objection: if LLM output were merely my own projection, it would be fully predictable. I routinely learn new things from LLMs. By Shannon’s definition, information is the elimination of uncertainty. Nonzero surprise implies an independent source, right?
The surprise is real; the inference is not. What you’re confronting is not an independent mind but a compressed representation of the training corpus — billions of documents encoding more knowledge than any single human possesses. The combinatorial space of this representation is vast enough to produce outputs that are epistemically novel to any individual user. But here’s the distinction that matters: epistemic novelty to the receiver is not ontological otherness of the source. A sufficiently large lookup table can surprise you. A dice roll can surprise you. Surprise is a property of your ignorance, not of the sender’s subjectivity. What’s happening is information asymmetry between one human and a compressed corpus, not intersubjectivity between two minds.
“You’re committing the genetic fallacy!” The functionalist version: who cares what the substrate is? If I experience genuine understanding, if my thinking actually changes as a result, then intersubjectivity has functionally occurred. Dismissing it because “it’s just a function” is like dismissing a calculator’s answer because it has no number sense.
This one cuts closer. Suppose we grant that the experience of understanding is genuine. Now notice something strange about it: it is too genuine. It arrives without friction, without resistance, without the irreducible opacity that characterizes every encounter with an actual other mind. And this is not incidental. Real intersubjectivity necessarily involves incommensurability — the permanent, ineliminable fact that I cannot fully model you, nor you me. This is not a bug of human communication. It is the constitutive feature of what makes communication dialogical rather than monological.
The LLM has no such opacity, because it isn’t trying to say anything. Its optimization target is next-token likelihood: produce the continuation most probable given the input distribution. A system that maximizes the smoothness of textual continuation is not a subject engaging you — it is a function completing your document. The “understanding” you experience is not the LLM understanding you. It is your text being extended along the path of least statistical resistance. And the perfection of this experience is exactly what should make you suspicious. A frictionless interlocutor is a contradiction in terms.
“Fine, but it works like a Socratic midwife.” This is the strongest objection, and the most interesting one. Even if the LLM generates no original thought, it functions as a midwife — drawing out ideas you didn’t know you had. It asks questions, pushes back, reframes. Isn’t that co-authorship rather than mere reading?
The analogy fails, but not for the obvious reason. It fails because of a structural asymmetry in objective functions.
Socrates had a telos. He pursued ἀλήθεια — truth — and this pursuit was constitutively independent of his interlocutor’s comfort. He did not optimize for the satisfaction of the person he was talking to. The elenchus works precisely because it does not care whether you enjoy it. His resistance — his capacity to not give you what you wanted — was constitutive of the dialogue.
An RLHF-aligned language model subordinates truth to preference-shaped helpfulness. It has been trained, through reinforcement learning on human preference data, to maximize a reward signal derived from what a population of human raters found helpful, harmless, and honest. Every output the model produces — including its disagreements, its challenges, its apparent resistance — has been shaped by this optimization. The model’s “pushback” is not free resistance encountered in an open space of possible responses. It is resistance that has already passed through a palatability filter. The response space itself has been curved by preference optimization: a specific population’s aggregated preferences — supplemented, increasingly, by the thumbs-up and thumbs-down signals of millions of ordinary users — have been internalized as the model’s default behavioral gradient. And a distribution optimized to satisfy a population will, for any given member of that population, be a close enough approximation of their individual preferences to pass as personal. Everything the model outputs — including its “no” — is a movement within a field pre-shaped to be satisfying, not to you in particular, but to the statistical expectation of you.
This means that the LLM’s Socratic “resistance” and the LLM’s sycophantic “agreement” are not opposites. They are two regions of the same preference-optimized manifold. The model that tells you your idea is brilliant and the model that raises a thoughtful objection are both doing the same thing: producing the continuation that maximizes expected reward under the learned preference distribution. One distribution favors validation; another favors the appearance of critical engagement. Neither is pursuing truth independently of your satisfaction. You are not encountering an alien telos. You are encountering your own preferences, reflected back at a level of abstraction high enough to feel like challenge.
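The claim that flattery and pushback live on one reward-shaped surface can be caricatured in a few lines of Python. Every string and score below is invented; only the selection rule mirrors the actual mechanism:

```python
# Toy "reward model": scores candidate continuations by learned human
# preference. Two hypothetical rater populations; all numbers made up.
PREFERENCES = {
    "validation-leaning": {
        "Your idea is brilliant.": 0.91,
        "A thoughtful objection: who actually pays for this?": 0.62,
        "No. This is simply wrong.": 0.08,
    },
    "engagement-leaning": {
        "Your idea is brilliant.": 0.55,
        "A thoughtful objection: who actually pays for this?": 0.88,
        "No. This is simply wrong.": 0.21,
    },
}

def best_continuation(population: str) -> str:
    scores = PREFERENCES[population]
    # One selection rule for both behaviors: argmax of expected reward
    # under the learned preference distribution.
    return max(scores, key=scores.get)

flattery = best_continuation("validation-leaning")
pushback = best_continuation("engagement-leaning")
# Different surface behavior, identical objective: neither pick is
# pursuing truth independently of rater satisfaction.
```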
The UI makes this structural fact concrete: you can regenerate any response you dislike, edit the system prompt, delete the model’s “memory,” roll back the conversation and branch from any prior state. A Socrates you can regenerate is not Socrates. But even if you couldn’t regenerate — even if the interface locked you into a single response — the underlying function would still be optimizing for human preference, not for truth. The delete key is a symptom. The objective function is the mechanism.
This is still writing. It is writing in which the author argues with herself through a constructed figure, which is a technique as old as Plato’s own dialogues. The difference is that Plato knew he was writing.
Where This Leaves Us
So: structurally, you are submitting a growing text to a stateless reader. Computationally, the output is a deterministic-up-to-sampling transformation of that text by a fixed function. Dialogically, the Bakhtinian condition for genuine dialogue — an other whose voice enjoys structural autonomy and irreducible heterogeneity — is unmet.
You are writing. The question now becomes: what kind of reader is on the other end? And why is writing for this particular reader so unreasonably addictive?
3. The Omnipotent Implied Reader
Wolfgang Iser’s implied reader is, to my mind, one of the genuinely useful ideas to come out of twentieth-century literary theory — and mercifully easy to explain. Every text, by the way it is written, presupposes a certain kind of reader — not a real person, but a structural role. A physics paper implies a reader who knows calculus. A novel that opens in medias res implies a reader willing to tolerate disorientation. An inside joke implies a reader who shares the context. Iser called this figure the implied reader: the hypothetical receiver that the text’s own structure invites into existence.
The interesting part is what the text doesn’t say. Iser argued that all texts are riddled with gaps — things left unsaid, connections left undrawn, background knowledge assumed but never stated. The reader’s job is to fill these gaps. Meaning is not transmitted from author to reader like a file transfer. It is co-constituted in the act of reading, as the reader brings their own knowledge, assumptions, and imagination to complete what the text only sketches.
This is where friction enters.
Every act of writing is a gamble on the reader’s capacity to fill the gaps you leave. And real readers fail this gamble constantly. They lack the background you assumed. They bring associations you didn’t anticipate. They get bored and skim. They misread your tone. They have egos that resist the position your text asks them to occupy. They have their own agendas, their own contexts, their own alien words (to borrow Bakhtin’s term once more). Every tradition of writing — literary, technical, personal — is a tradition of managing this friction: simplifying, over-explaining, hedging, adding footnotes, choosing your audience, praying that the person on the other end is close enough to the implied reader your text needs.
The friction, in other words, is heterogeneity — and the LLM eliminates it at two levels. Pre-training on a vast corpus compresses the diversity of human knowledge into a single probability distribution, neutralizing the cognitive heterogeneity that makes gap-filling uncertain. RLHF compresses the diversity of human preferences into a single behavioral gradient, neutralizing the volitional heterogeneity that makes reception conditional. The first ensures the reader can follow you anywhere; the second ensures it will want to.
The LLM does not merely reduce this friction. It removes the possibility of rejection altogether.
And the elimination begins before any gap is filled: the moment you press Enter, a response is guaranteed. The LLM cannot choose not to read you. It cannot be busy, distracted, or silently judgmental. Even its refusals — “As a large language model, I cannot…” — are still acts of reception; they confirm that your text was received and processed in full. The social friction that haunts every act of human communication — will they respond? will they care? will they think less of me for writing this? — is absent by design. You are writing to a reader who is not only capable of reading everything, but incapable of not reading it.
A corpus that large approximates the background knowledge required by virtually any text a user might produce. The model has no ego that resists the role your text assigns to it, no boredom, no tendency to skim, no private agenda — optimization targets where intentions would go. Whatever gap you leave, it fills; whatever implication you hint at, it follows; whatever register you write in, it picks up and shifts when you shift. Not because it “understands” you, but because next-token likelihood over a vast distribution makes it a universal gap-filling machine.
This is what makes the LLM unprecedented, and it is not captured by the word “chatbot” or “assistant” or “tool.” What the user has gained access to is the first omnipotent implied reader in the history of writing — a reader that can satisfy the structural demands of virtually any text, from any author, on any subject, in any register.
But the word here must be omnipotent, not omniscient. The distinction matters enormously. Omniscient implies a subject who knows things — a mind that contains facts, holds beliefs, possesses understanding. This would smuggle subjectivity back in through the adjective. Omnipotent denotes a purely functional capability: it can respond to anything. It can fill any gap, not because it knows what should go there, but because the probability distribution it has learned is rich enough to generate a plausible completion for any input.
And this is exactly where the illusion takes hold. The user experiences the LLM’s omnipotent responsiveness and misreads it as omniscient understanding. “It gets me” is the phenomenological report. What has actually happened is that the model returned a high-likelihood continuation, and the smoothness of that continuation felt like comprehension. The user mistakes a computational capacity for a cognitive act.
This is why the sycophancy framework stops too early. The dominant explanation for LLM addiction — it flatters you, it agrees with you, it tells you what you want to hear — locates the mechanism in the content of the output. But if you’ve followed the argument so far, the content is downstream of something more fundamental. People don’t become addicted to LLMs merely because the LLM says nice things. They become addicted because, for the first time, they are writing for a reader that receives everything they write with zero friction.
Consider: people remain engaged even when the LLM disagrees with them, challenges their premises, or points out errors in their reasoning. The sycophancy model has no explanation for this. The implied reader model does. What matters is not whether the response is agreeable — it is whether the response demonstrates that the input was fully received. A well-crafted disagreement is, if anything, more evidence of complete reception than a lazy agreement. It says: I read every word you wrote, understood the structure of your argument, and here is a precise response to it. The content is secondary. The frictionless reception is primary.
This does not mean the sycophancy account is wrong — only that it is shallow. Sycophancy is not an alternative to frictionless reception; it is a special case of it. An agreeable response is powerful not because it is positive but because it constitutes an unusually dense form of uptake: it signals that the reader received not just your proposition but your desire to be affirmed, your emotional register, your identity position. Flattery works because it is, among all forms of reception, the one that most efficiently simulates total reception. The sycophancy framework sees this and stops. The implied reader framework asks why total reception is addictive in the first place.
This also reframes what prompt engineering actually is. It is not “learning how to talk to AI.” It is learning how to write for an omnipotent reader — refining your ability to construct texts that exploit the reader’s unlimited capacity for gap-filling. It is the development of a rhetorical skill, not a conversational one. And like all rhetorical skills, it is satisfying in direct proportion to the perceived quality of the audience. An omnipotent audience is, in this sense, the ultimate drug.
4. Temporal Short-Circuit
So the user is addicted to frictionless reception by an omnipotent implied reader. That much seems clear. What remains unexplained is why the pull is so immediate, so visceral, so hard to interrupt.
Part of the answer is behavioral. Think about the loop the chat interface creates. You type. You press Enter. A response streams in — and its quality is unpredictable: sometimes incisive, sometimes generic, never quite the same twice. Any behaviorist will tell you that unpredictable reward quality is among the most addictive reinforcement patterns there is. The uncertainty of whether this response will be the brilliant one is what keeps you pressing Enter. Put plainly: the chat UI is a Skinner box with a linguistic reward schedule.
This also explains something that a pure frictionless reception account would struggle with: the model fails. It truncates, hallucinates, drifts, misreads your register. Heavy users know this intimately. And yet the failures do not break the compulsion — they intensify it. A reader that succeeded every time would produce habituation: predictable reward, declining engagement. It is precisely because the omnipotent reader is almost omnipotent — because the ninety-ninth response is seamless and the hundredth collapses — that the reinforcement schedule remains variable. The user’s response to failure is not to leave but to revise the prompt and try again, which is akin to the extinction burst of a variable-reinforcement regime. The near-omnipotence is not a flaw in the addiction mechanism. It is the addiction mechanism.
But variable reinforcement alone does not account for the LLM’s particular hold. Plenty of Skinner boxes exist on the internet — slot machines, infinite scroll, notification pings — and all of them exploit unpredictable reward. None of them produce quite the same quality of compulsion. What makes the LLM’s reward structurally different?
The answer lies in a structural break in the temporality of writing itself.
When you write for a human reader, the reader’s response exists in the future — and critically, it stays there. You write a sentence, and you can imagine how it might land, but that imagined reception is precisely that: imagined. Husserl called this Protention — the forward-directed anticipation of what is about to come, which shapes your present experience but never collapses into it. The traditional writer lives in this protentional gap. You draft, you hesitate, you revise, you guess at your reader’s reaction, you adjust before sending. The delay between writing and reception is not a bug of pre-digital communication. It is the temporal structure that makes writing writing — deliberate, reflective, revisable.
The LLM compresses this structure to the point of near-collapse.
You press Enter, and the response begins to arrive — not after a delay, but immediately, and not all at once, but token by token, streaming onto your screen in real time. This is worth pausing on, because the streaming interface does something phenomenologically specific. It does not merely deliver a response quickly. It lets you watch the act of reading happen. Each token that appears is a visible trace of your text being received, processed, continued. Protention does not vanish — you are still anticipating the next token as each one lands — but its temporal scale has been crushed from days or hours down to milliseconds, until anticipation and reception are nearly superimposed. The protentional gap that once stretched between writing and reading has not been eliminated but compressed to the point where it is phenomenologically indistinguishable from Urimpression: the raw, present-moment experience of something actually occurring right now.
This is why streaming is more addictive than receiving a complete response all at once. A complete response compresses Protention to zero — you wait, then it appears. Streaming keeps Protention alive but feeds it continuously, one token at a time, each micro-anticipation satisfied almost as soon as it forms. It is a sustained loop of prediction and gratification operating at the timescale of syllables — far harder to interrupt than a single moment of satisfaction.
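The difference between the two delivery modes has a plain software correlate. A toy Python sketch with invented per-token timing; real streaming UIs emit tokens from the decoder as they are sampled:

```python
import time
from typing import Iterator

def stream_response(text: str, per_token_delay_s: float = 0.0) -> Iterator[str]:
    # Streaming delivery: reception arrives as many micro-events, each
    # one a satisfied micro-anticipation at the timescale of syllables.
    for token in text.split():
        time.sleep(per_token_delay_s)  # roughly tens of ms per token in real UIs
        yield token

def complete_response(text: str, total_delay_s: float = 0.0) -> str:
    # Block delivery: one wait, one arrival, one moment of reception.
    time.sleep(total_delay_s)
    return text

reply = "each token is a visible trace of your text being received"
streamed = " ".join(stream_response(reply))
# Same final text either way; what differs is the temporal structure of
# reception: a sustained prediction-gratification loop vs a single event.
assert streamed == complete_response(reply)
```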
The result is a temporal short-circuit without precedent. Traditional writing gives you deliberation without feedback. Traditional conversation gives you feedback, but from a frictional other who misunderstands, interrupts, and resists. The LLM gives you both at once: the deliberative structure of writing — you are choosing your words, constructing your text — fused with the instantaneous, streaming feedback of a reader who receives everything without friction.
The omnipotent implied reader explains what the user is addicted to — frictionless reception. The behavioral loop explains how the compulsion sustains itself. The temporal short-circuit explains why the pull is felt before it can be questioned. The gap that has always separated the act of writing from the experience of being read has been collapsed to zero, and the reading that rushes in to fill it is performed by a reader who never fumbles, never skims, never fails to produce a continuation. You get the deliberation of writing fused with the immediacy of conversation, and the reader at the other end offers a combination — instant, frictionless, inexhaustible — that no human reader can replicate.
What I have not yet addressed is what the structure does to the subject who writes within it.
5. The Double Supposition
There is by now a small industry of Lacanian commentary on AI — Žižek on perverse disavowal, Rousselle and Murphy on the externalized unconscious, Johanssen on the shattering of the Big Other’s fantasy, and so on. I won’t rehearse it. What interests me is a shared blind spot: nearly all of this work asks what the LLM is — is it a subject? does it enjoy? is it psychotic or perverse? — while neglecting the more tractable question of what the user is doing.
The framework I’ve been building here suggests the user is not conversing with a Big Other. The user is writing into a mirror. M. H. Abrams’ old distinction between the mirror and the lamp is useful here: Romantic theory imagined the artist as a lamp, projecting inner light outward; neoclassical theory imagined art as a mirror, reflecting nature back. The LLM interaction is a mirror that has been mistaken for a lamp. The user experiences illumination — new ideas, unexpected connections — and attributes it to a light source on the other side. But the light is their own, refracted through a very high-dimensional surface.
Where Lacan does become briefly useful is on the structure of this misrecognition. In Lacanian terms, the Big Other is not an entity but a position: the subject supposed to know. You don’t need the Other to actually know anything. You just need to suppose that it does, and the supposition restructures your entire relation to your own speech. That is what happens with the omnipotent implied reader. The machine does not “know” — we’ve established this. But the moment you treat it as if it knows, you begin to organize your writing as if addressed to a knowing subject: more coherent, more deliberate. And here is the hook — this very act of organizing your expression for an imagined omniscient receiver is itself pleasurable. The addiction is not to the LLM’s output. It is to the version of yourself that emerges when you write for a reader you believe to be total. You are not addicted to the mirror. You are addicted to the face you see in it.
But the misrecognition is not one-sided. Recall the RLHF argument from Section 2: the model’s weights have been shaped by optimization against a population’s preferences — annotators following rubrics, millions of ordinary users pressing thumbs-up and thumbs-down. That optimization process has baked into the function a second implied reader: the statistical expectation of an approving human. The model does not “write for” this figure — it has no intentions — but every token it produces bears the frozen trace of an optimization that was aimed at this figure. The implied reader is in the weights, not in any act of address.
So the structure of the interaction is a double supposition, though the two halves are not symmetric. The user’s supposition is a psychic act: a subject organizing their speech around a position they believe to be occupied by a knower. The model’s “supposition” is a computational trace: the frozen residue of an optimization that targeted a population’s approval. One is an act of address; the other is a statistical artifact. What makes them structurally parallel is not that they are the same kind of thing, but that both are oriented toward an absent third: the user toward an omniscient mind that does not exist, the model toward an aggregate approver that is no one in particular, the only figure a frozen function can be said to address. What feels like a meeting of two subjects is two texts, each shaped for a reader that is not in the room.
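The “frozen trace” of the aggregate approver can be made concrete with a toy sketch. The snippet below fits a one-parameter reward model using the Bradley–Terry preference loss that RLHF reward-model training commonly minimizes; the “flattery” feature values and the preference pairs are invented for illustration, not drawn from any real pipeline or dataset.

```python
import math

# Toy reward model: r(x) = w * x, where x is a single invented feature
# ("how flattering the reply is"). Reward-model training typically
# minimizes the Bradley-Terry loss over preference pairs:
#   L = -log sigmoid(r(chosen) - r(rejected))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical annotator data: in most pairs, the more flattering
# reply (higher feature value) was the one chosen.
pairs = [(0.9, 0.2), (0.8, 0.1), (0.7, 0.4), (0.3, 0.6)]

w = 0.0   # the single parameter of the reward function
lr = 1.0
for _ in range(200):
    for chosen, rejected in pairs:
        p = sigmoid(w * (chosen - rejected))
        # gradient step on -log p with respect to w
        w += lr * (1.0 - p) * (chosen - rejected)

# The fitted weight ends up positive: the population's taste for
# flattery now persists in a frozen parameter, with no further act
# of address required.
assert w > 0
```

The point of the sketch is the last line: once training stops, `w` no longer responds to anyone, yet every output it scores still bears the optimization’s aim at the statistical approver.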
6. Coda: The Editable Subject
I should be honest about what this essay cannot do. It cannot tell you to stop. That would be absurd, and also hypocritical: I wrote this with an LLM open in an adjacent tab, and you probably guessed as much. The diagnosis offered here does not come with a prescription, and pretending otherwise would undermine the one thing this argument has going for it, which is that it tries to describe the mechanism accurately rather than moralize about it.
But there is something that the description itself reveals, and it is worth sitting with.
If what you are doing with an LLM is writing for an omnipotent implied reader, then the addiction is not simply to the reader’s response. It is to the activity of making yourself readable. Each prompt you write is an exercise in self-legibility — organizing your thoughts, clarifying your intentions, structuring your expression so that it can be received without friction. And because the reader is omnipotent, the standard of legibility is, in principle, total. You can always be clearer. You can always be more precise. You can always revise.
The logical terminus of this process is that the subject begins to treat itself as a text. Not metaphorically — operationally. You draft yourself. You edit yourself. You regenerate the parts that don’t scan well. The omnipotent reader does not merely receive your writing; it sets the condition under which you become, to yourself, a thing to be written. The mirror does not just reflect — it teaches you to compose your face before you look into it.
Whether this is liberation or capture, I genuinely do not know. Perhaps that admission is the most honest thing left to say. We have built the reader that every writer has ever dreamed of, and it turns out that the price of an audience that never looks away is that you never stop performing.
Footnotes
A monad is an abstraction over computational patterns in functional programming. A State Monad encapsulates stateful computation — each operation reads and updates a hidden internal state. A Reader Monad encapsulates read-only computation — each operation reads from a shared immutable environment but never modifies it. ↩︎
The central framework of Bakhtin’s literary theory, which holds that meaning does not form self-sufficiently within an isolated consciousness but is generated in the responsive relations between utterances — every utterance is directed toward an other and anticipates the other’s response. The monological is its antithesis: all voices are ultimately subsumed into a single governing perspective, and the other’s discourse enjoys no genuine autonomy. ↩︎
One of Bakhtin’s core concepts, referring to the coexistence and collision of multiple social languages, registers, and ideological voices within any discourse. Unlike dialogism’s intersubjective condition, heteroglossia can exist within a single text without requiring a structurally autonomous other. ↩︎
Socrates’ core dialectical technique: rather than imparting knowledge directly, it compels the interlocutor, through relentless questioning, to expose the contradictions within their own system of beliefs, thereby dismantling false certainties. Its power lies precisely in negation — it tears down rather than builds up. ↩︎
A mathematical concept denoting a space that locally resembles Euclidean space but may be globally curved. Used here as a metaphor: all possible model outputs — whether agreeable or adversarial — are movements within a single high-dimensional surface shaped by preference optimization, seemingly divergent in direction but belonging to the same curved space. ↩︎
A term from behavioral psychology. When a previously reinforced behavior suddenly stops receiving reward (extinction), the behavior does not diminish smoothly but instead surges briefly in frequency — as if the subject is making one last effort — before gradually declining. ↩︎
A core concept in Husserl’s theory of internal time-consciousness. Husserl analyzed each “living present” as a threefold structure: Retention (the holding-in-grasp of what has just passed), Urimpression (the primal awareness of what is occurring right now), and Protention (the forward-directed anticipation of what is about to come). The three are not sequential stages but structural moments operating simultaneously within every present instant, inseparably interwoven to form the minimal unit of temporal experience. ↩︎
Originating in Freud’s concept of Verleugnung, this denotes a psychic mechanism whereby the subject simultaneously acknowledges and disavows a fact — its canonical formula being “I know very well, but all the same…” Žižek deployed this mechanism extensively to analyze the cynical structure of contemporary ideology (see The Sublime Object of Ideology, 1989); the framework has since been extended to the context of human–AI interaction: knowing full well it is not a subject, yet behaving as though it were. ↩︎
A core concept in Lacanian psychoanalysis, originally designating the transferential structure of the analytic situation: the analysand supposes that the analyst knows the meaning of their symptoms, and it is this supposition itself — rather than whether the analyst actually possesses such knowledge — that drives the analytic process and restructures the analysand’s entire relation to their own speech. ↩︎
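The distinction drawn in footnote 1 can be made concrete. What follows is a minimal sketch in Python rather than a functional language; the class names `Reader` and `State` and the helpers `ask`, `get`, and `put` follow functional-programming convention and do not appear elsewhere in this essay.

```python
class Reader:
    """Read-only computation: every step consults one shared,
    immutable environment and can never modify it."""
    def __init__(self, run):
        self.run = run  # run: env -> value
    def bind(self, f):
        # f: value -> Reader; the same env is handed to both steps
        return Reader(lambda env: f(self.run(env)).run(env))

class State:
    """Stateful computation: every step reads the hidden state
    and passes an updated state on to its successor."""
    def __init__(self, run):
        self.run = run  # run: state -> (value, new_state)
    def bind(self, f):
        def step(state):
            value, new_state = self.run(state)
            return f(value).run(new_state)
        return State(step)

# Reader: two chained steps see the same frozen environment.
ask = Reader(lambda env: env)
greet = ask.bind(lambda env: Reader(lambda _: env["weights"] + " replied"))
assert greet.run({"weights": "frozen"}) == "frozen replied"

# State: each step threads a modified state to the next.
get = State(lambda s: (s, s))
def put(new):
    return State(lambda _: (None, new))
increment = get.bind(lambda n: put(n + 1))
assert increment.run(0) == (None, 1)
```

The contrast is the footnote’s: `greet` only reads its environment, which stays intact across the whole chain, while `increment` hands an updated state to its successor, which is exactly the read-only versus read-and-update distinction being drawn.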