Your Work Trained the Model. The Model Replaced You. Philip K. Dick Wrote This Story in 1968.

Written by thegeneralist | Published 2026/04/06
Tech Story Tags: ai | futurism | business | technology | science-fiction | cyberpunk | philip-k.-dick | hackernoon-top-story

TL;DR: The first workers displaced by generative AI weren't software engineers. They were translators and $1.32/hr data labelers. Philip K. Dick predicted why.

How Philip K. Dick predicted the moral logic of AI training data and labor theft 50 years before the dataset existed

Part 4 of a six-part series using science fiction as a lens for understanding AI, work, and power in 2026 and beyond.

The basic tool for the manipulation of reality is the manipulation of words. If you can control the meaning of words, you can control the people who must use them.

Philip K. Dick, How to Build a Universe That Doesn't Fall Apart Two Days Later, 1978

Start with the workers nobody is talking about.

Between 2019 and 2023, translator incomes fell by roughly a third as neural machine translation scaled and agencies stopped paying human rates for work the software could approximate. Illustrators watched commissions dry up as generative image tools produced in seconds what had taken hours, a collapse that by 2023 had become the subject of the Andersen v. Stability AI class-action lawsuit, the first major litigation explicitly naming training data scraping as copyright infringement. Entry-level copywriters, transcriptionists, junior creatives, the people who occupied the first rung, found that rung quietly removed.

And underneath all of them: the data labelers. In January 2023, TIME journalist Billy Perrigo revealed that OpenAI had contracted Sama, a Nairobi-based firm, to label toxic content for as little as $1.32 to $3.74 an hour. Workers processed graphic violence, child abuse, and suicide content to make ChatGPT safe for public deployment. Several reported lasting psychological harm. Their work was essential. It has never been reflected in how those models are priced, valued, or discussed.

Their labor is inside the models. The models are now replacing the tier of workers above them.

That is the data. It has a shape, and the shape is not new. The moral logic underneath it is what Philip K. Dick spent his career trying to show us.

The Voigt-Kampff Test Was Never About Androids

Is the android human? No. The question you should ask is: who designed the test that says it isn't?

Paraphrase of the central tension of Do Androids Dream of Electric Sheep?, 1968

Philip K. Dick wrote the same story in many forms across thirty years. It was a story about the reasoning a system deploys when it needs to use something and doesn't want to pay for it.

In Do Androids Dream of Electric Sheep? (1968), the thing being used is an android indistinguishable from a human. The system has a test: the Voigt-Kampff empathy assessment. Androids can't feel genuine empathy, the reasoning goes, placing them outside the category of beings entitled to full moral consideration. The test exists not to determine whether androids suffer. It exists to maintain a boundary that makes their exploitation feel principled rather than convenient.

Dick's horror isn't the android. It's the pattern. The characteristics chosen as the threshold for full humanity are always the characteristics the most exploited humans have historically been said to lack. The enslaved were said to feel differently. The colonized were said to reason differently. Each claim was made by the people who benefited from it, assessed using tools they designed, and revised only under sustained external pressure.

The same structure runs through A Scanner Darkly (1977), where an undercover narcotics officer surveils himself without knowing it. The system uses his own observations as evidence against him. His labor produces the instrument of his own erasure. That mechanism, rewritten for 2026, is a photographer whose portfolio trained the model that now undercuts her rates.

And through We Can Remember It for You Wholesale (1966), where manufactured memories become indistinguishable from real ones. Ownership of experience becomes unstable. If a model can produce text indistinguishable from a writer's, and the writer's original text trained the model, what exactly belongs to whom? Dick was asking that question about memory. In 2026, it applies to creative output and the accumulated knowledge of entire professions.

The android is a mirror. In 2026, it's pointed at a dataset.

Dick showed us the system's logic. Le Guin, as we'll see, showed us how good people live inside it.

The Five-Hundred-Year-Old Argument

The past is never dead. It's not even past.

William Faulkner, Requiem for a Nun, 1951

The argument for training AI on creative work without consent or compensation has five components:

1. The material was publicly available. The common lands of 16th-century England were publicly available, used by communities for generations. The Enclosure Acts converted them to private property with the argument that productive use required concentrated ownership. Their loss was categorized as the cost of modernization.

2. Individual contributions are too small to measure. Traditional ecological knowledge embedded in pharmaceutical plants was developed over generations, each contribution genuinely untraceable in isolation. The aggregate value, once extracted and patented, was not small. The communities received nothing.

3. Something genuinely transformative is being built. The gig economy reclassified employees as independent contractors on the basis that the platform was a new category of business. The reclassification was real. So was the removal of employment protections. The transformation and the extraction were not in tension. The transformation was the mechanism of the extraction.

4. The benefits will be broadly shared. This clause has appeared in every prior instance. It has never, by itself, resulted in the foundation being compensated.

5. The practice is standard across the industry. Meaning: the outcome has no single author and therefore no accountability.

Le Guin's The Word for World is Forest (1972) runs this logic as a novel. A corporation harvests an inhabited planet's resources on the basis that the indigenous population's relationship to the land doesn't constitute productive use. Their presence registers as a cost to manage, not a claim to honor. The five clauses appear in the colonizers' internal communications almost verbatim. Le Guin wasn't writing about land. She was writing about the reasoning structure that makes any extraction feel inevitable to the people conducting it.

The novelty is the dataset. The logic is five hundred years old.

Extraction always starts where leverage is lowest. The pattern isn't incidental; it's the whole point.

Paragraph 46, Sub-Section C

You did consent, actually. Paragraph 46, sub-section C.

Joan Is Awful, Black Mirror S6E1, 2023

The Black Mirror episode Joan Is Awful is one of the most directly useful pieces of fiction yet produced about the AI transition.

A streaming service uses AI to generate a dramatized, unflattering version of an ordinary woman's life without her knowledge. When Joan objects, she is shown the terms of service she agreed to when she signed up. Paragraph 46, sub-section C. She consented. Her life was a licensable asset from the moment she clicked agree.

The mechanism is the point: the terms of service as an instrument that converts a person's existence into acquirable material. The consent is technically present. It was designed to be unreadable and irreversible.

Brooker was not writing satire. He was writing a documentary that arrived slightly ahead of its subject.

In June 2024, a year after Joan Is Awful aired, Adobe quietly updated its terms of service with language permitting it to access user content through automated and manual methods, triggering a wave of professional backlash from creatives who read it as a license to train AI on their work. The same year, Meta began opting European users into AI training on their posts and photos, through a settings update most users never saw. In both cases: paragraph 46, sub-section C. You consented. The document said so.

This is where the push/pull distinction, which has run through this series since Part 1, becomes most viscerally clear. A demand-driven technology, one where users had genuine choice and leverage, would create negotiating pressure that extraction logic can't survive. The push removes that pressure. The adoption isn't a negotiation. It's an installation.

The writers, illustrators, and musicians whose work trained the models posted it publicly to share it, not to donate it as raw material for a commercial product they had no stake in. The distinction was never explained, because explaining it would have complicated the acquisition.

Dick's android didn't consent to be a bounty hunter's target. The system had a test that said it wasn't the kind of thing that needed to consent. The training data didn't consent to become a product. The terms of service say it did.

The Lithium, the Water, the Land, and the Child in the Basement

They all know it is there, all the people of Omelas. Some of them have come to see it, others are content merely to know it is there. They all know that it has to be there.

Ursula K. Le Guin, The Ones Who Walk Away from Omelas, 1973

The training data scrape is the extraction everyone can see, once they look. But as Kate Crawford documented in Atlas of AI (2021), it sits on top of another extraction that is older and more literal: the physical one.

The rare earth minerals mined under conditions that would be illegal in the countries where the models are deployed. The water consumed by data centers at a scale that draws down aquifers in drought-prone regions. The land converted from communities to server farms.

Crawford's argument is precise: every layer of the AI stack, from the lithium in the hardware to the creative work in the training set, is an extraction from a population that didn't set the terms and doesn't share the returns.

The safety discourse running in parallel (alignment papers, responsible scaling policies, congressional testimony about existential risk) rarely addresses any of this. Future, speculative, distributed harm is legible to those frameworks. Present, documented, concentrated harm is categorized as a labor issue, a separate conversation held in other rooms.

Le Guin's Omelas is the precise frame. Everyone in the city knows about the child in the basement. They visit, feel the horror, and most return. Not because they're evil, but because the framework they live inside has no mechanism for addressing what they've seen without dismantling what's above it. The present harm is visible. It's been categorized as a different kind of problem.

Most people return to the city.

Invisible by Design

Calling it a cloud is an act of erasure. The cloud is made of rocks, copper, rare earth minerals, labor, and energy. It just doesn't look like it.

Kate Crawford, Atlas of AI, 2021

Dick predicted the justification. But there is something he couldn't have predicted, the thing that makes 2026 different from every previous instance: the extraction is almost invisible by design.

The Enclosure Acts were visible: you could see the fences. The factory system was visible: you could see the workers. Even the gig economy reclassification was visible enough to trigger legal challenges.

The training data scrape happened in silence. The $1.32/hour labeler processed content behind an NDA. The creative work was ingested at scale through automated crawlers that left no trace, filed no request, and offered no receipt. The terms of service that converted public sharing into commercial licensing were designed to be unreadable, and the people who signed them were not the people who wrote them.
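
To make the asymmetry concrete, here is a minimal sketch, not any lab's actual pipeline, of what ingestion amounts to: one ordinary HTTP request and some text extraction. Everything in it (the crawler's user-agent string, the ingest function) is hypothetical; the point is that no step in the transaction notifies, asks, or pays the page's author.

```python
# Minimal sketch of silent ingestion (hypothetical, not any real pipeline).
# From the author's side this is indistinguishable from a reader visiting:
# one anonymous GET in a server log nobody reads.
from urllib.request import Request, urlopen
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Keeps visible text; drops tags, scripts, and styles."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def ingest(url: str) -> str:
    # No consent check, no notification, no receipt. The page is public,
    # so the protocol's default answer to "may I?" is silence.
    req = Request(url, headers={"User-Agent": "research-crawler/0.1"})
    html = urlopen(req, timeout=10).read().decode("utf-8", errors="ignore")
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)  # ready to append to a training corpus
```

Multiply that function across a few billion URLs and you have a dataset. Nowhere in the loop is there a place where consent could even be recorded.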

The five clauses aren't cynicism; the people deploying them largely believe them. Sincere belief in all five is entirely compatible with the people bearing the cost receiving nothing, which is precisely what Dick's mirror shows. It doesn't require bad actors. It requires a system that makes costs invisible and precedents that have never resolved in favor of the foundation.

That's what makes the push/pull distinction structural rather than just political: a demand-driven adoption, one where users and workers had genuine leverage, would produce visibility. Negotiation requires disclosure.

The push removes the negotiation, and with it, the need to make the extraction visible at all.

The Pattern Is Not Destiny

The pattern has been interrupted before. Translators organizing through professional guilds. Illustrators pursuing class-action suits. Writers whose 2023 strike explicitly named AI training as a labor issue for the first time in a major industrial negotiation.

Slow, partial, several steps behind the technology, but present. The leverage has never come from the system. It comes from outside it.

What remains is the question underneath all of it: what happens to the people who aren't being extracted from and aren't yet displaced, who are simply living inside the transition, trying to hold onto a sense of who they are while the economic architecture that gave them that sense is quietly rewritten?

The answer to that question isn't a strategy. It's something older.


Next: what Marx saw on the factory floor in 1844, what Becky Chambers imagined when the robots voluntarily left, and why the identity crisis is already here, before the displacement has fully arrived.


Written by thegeneralist | Product Manager, Founder, Indie hacker, you name it! I've done or am about to! Stay tuned :)