Claude Unlocked 1 Million Tokens For Everybody: What Happens Now?

Written by withattitude | Published 2026/04/01
Tech Story Tags: artificial-intelligence | claude | claude-tokens | anthropic | claude-vs-gemini | claude-vs-chatgpt | claude-product-strategy | ai-literacy

TL;DR: If Claude is part of your workflow, the new 1 million token limit from Anthropic is a big deal. The announcement landed at #1 on Hacker News with over 1,100 points and 485 comments. The context window makes headlines, but the interesting part is what Anthropic did around it: pricing, benchmarks, and the strategic message underneath.

If Claude is part of your workflow, the new 1 million token limit from Anthropic is a big deal.


The news about Anthropic unlocking 1 million tokens landed at #1 on Hacker News with over 1,100 points and 485 comments. That’s significant traction.


The context window makes headlines, but the interesting part is what Anthropic did around it: pricing, benchmarks, and the strategic message underneath.


Today, we’re going to look at what changed, why it matters, and what it means if you’re choosing between Claude and ChatGPT right now.


AI product manager, builder, and product thinker. I write Product with Attitude, a newsletter about building with AI.


If this is your first time here, welcome!


Here’s what you might have missed:

I Tested Perplexity Computer Hard. Here’s How I’d Save Credits Now.

Nine Days Inside Claude Cowork: A Reality Check

What’s Inside

  • What Anthropic announced on March 13, 2026
  • What tokens are and why context windows matter
  • The product strategy behind the pricing change
  • What this means for builders and researchers
  • Claude vs ChatGPT vs Gemini today
  • What still doesn’t work
  • Why this matters for critical AI literacy

What Changed On March 13

On March 13, 2026, Anthropic announced that the 1 million token context window is now generally available for Claude Opus 4.6 and Claude Sonnet 4.6.

What Tokens Are

If you’re not a developer, “1 million tokens” can feel abstract, so I want to translate it into something intuitive.


Every AI model has a context window. We can think of it like a whiteboard.


Everything we type, every response the AI gives, every document we paste: it all goes on the whiteboard. When the whiteboard fills up, the oldest content gets erased, and the AI forgets it.


A token is roughly three-quarters of an English word. So 1,000 tokens is about 750 words. A little under two pages.
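That arithmetic is easy to sketch in code. The 3/4-word ratio is a rule of thumb, not a spec; real token counts depend on the model's tokenizer and on the text itself:

```python
# Rule-of-thumb conversion between tokens and English words.
# The 0.75 words-per-token ratio is an approximation; actual counts
# vary by tokenizer and by text.
WORDS_PER_TOKEN = 0.75

def words_for_tokens(tokens: int) -> int:
    """Approximate English words that fit in a token budget."""
    return round(tokens * WORDS_PER_TOKEN)

def tokens_for_words(words: int) -> int:
    """Approximate tokens needed to hold a given word count."""
    return round(words / WORDS_PER_TOKEN)

print(words_for_tokens(1_000))    # 750 -- about two pages
print(tokens_for_words(750_000))  # 1000000 -- the new window, roughly
```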

How Tokens Evolved

For most of AI history, these whiteboards were tiny. GPT-3 had a whiteboard big enough for about six pages. GPT-4 expanded to 32,000 tokens, roughly 64 pages, which felt enormous at the time.


Then, Anthropic started testing 1 million tokens. A whiteboard big enough for:

  • 750,000 words. That’s roughly ten average-length non-fiction books, or War and Peace with room to spare.
  • 3,000 pages of text.
  • 110,000 lines of code.
  • 600 PDFs or images in a single session. Up from 100 previously.


That’s been in beta for a while. But, on March 13, 2026, it went fully live for everyone with access. At no extra charge.

The Product Strategy Signal Buried in the Price

Before this week, if we were using Claude through the API and our conversation went past 200,000 tokens, we got charged a premium rate.

A “you used too much” surcharge. Opus 4.6 input jumped from $5 to $10 per million tokens. Sonnet 4.6 from $3 to $6.


That’s gone now.

A 900K-token request is now billed at the same per-token rate as a 9K one.


This is a deliberate competitive move.


OpenAI’s GPT-5.4 charges double once you cross 272,000 tokens.


That surcharge kicks in automatically, applies retroactively to your whole session, and catches a lot of developers off guard. Anthropic just removed that friction entirely.


The product strategy is clear: combine flat pricing with better long-context recall and dominate production AI workflows.

The Part Anthropic Had To Prove

Having a big context window and reading files accurately are two different things.

  • A large context window is capacity.
  • Accurate reading is capability.


I’ve experienced this several times:

  • uploading a full meeting transcript, asking what was decided about the budget, and getting back the thing that was discussed but not decided yet.
  • pasting a 50-page doc into Claude, asking about something on page 3, and having the model answer confidently about something from page 45.


So, to show this upgrade mattered, Anthropic had to prove two things: the window got bigger, and the reading got sharper.


There's a benchmark that tests this.


Picture a 3,000-page document, roughly the equivalent of ten thick novels stacked on top of each other. Hidden inside it are two specific facts.

Not near the front, not near the back, but scattered throughout, buried inside thousands of unrelated sentences. Fact #1 might be on page 47. Fact #2 on page 2,990.


The test, called MRCR v2 (Multi-Round Coreference Resolution), asks the AI to find both.


The score tells you what percentage of the time the model finds every single one. Not “found 1 out of 2.” All, or it doesn’t count.
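The all-or-nothing rule is simple enough to express directly. Here’s a simplified sketch of that scoring logic (my own set-based bookkeeping, not the real MRCR v2 harness):

```python
def all_or_nothing_score(trials) -> float:
    """Fraction of trials where EVERY planted fact was retrieved.

    Each trial is a (planted, found) pair of sets. A trial only counts
    if the model surfaced all planted facts -- "found 1 of 2" scores
    zero, matching the all-or-nothing rule described above.
    """
    perfect = sum(1 for planted, found in trials if planted <= found)
    return perfect / len(trials)

trials = [
    ({"fact_1", "fact_2"}, {"fact_1", "fact_2"}),  # both found: counts
    ({"fact_1", "fact_2"}, {"fact_1"}),            # one missed: zero credit
]
print(all_or_nothing_score(trials))  # 0.5
```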


Here’s what the scores look like, according to Anthropic’s GA announcement: Claude hits 78.3% on MRCR v2, compared to 26.3% for Gemini and 18.5% for the previous Claude.

This is what makes the 1M window useful rather than just a marketing number. A big whiteboard we can’t read accurately is worse than a small one we can.


Two caveats I noticed about the MRCR v2 benchmark:

  • First, the 78.3% figure comes from Anthropic’s own announcement. Independent verification is still pending.
  • Second, several developers on Hacker News described a ‘dumb zone’: the model finds the facts, but ignores decisions it made earlier in the same session.

Less Compaction, Less Lost Context

  • Before: You’re three hours into a Claude Code session. You’ve pulled database records, cross-referenced documentation, and searched source code. Then, inevitably, compaction kicks in. Claude auto-summarizes everything to make room. And unless you proactively saved everything, the architecture decision from hour one is gone.
  • After: A 1M context window keeps the whole session together. According to Anthropic’s GA announcement, compaction events fell by 15%. That means fewer resets and less lost context.

What 1M Context Changes For Different Kinds Of Work

Every article I read about the 1M context window focused on codebases and Claude Code. That makes sense. Developers are the loudest audience and the most immediate beneficiaries.


But the implications go further.


For newsletter writers and content creators:

  • We can feed it an entire research project. Research papers, interview transcripts, competitor reports, and our own previous writing can all live in a single session.
  • We can ask questions across everything at once, without copy‑pasting between tabs or losing context when we switch documents. Six months of Substack posts plus a pile of research becomes something Claude can scan for gaps, patterns, and themes we haven’t written about yet.


Research workflows:

  • We can drop in full papers, datasets in text form, long email threads, and meeting notes, then query across all of it.
  • Instead of working folder by folder, we can treat the whole project as one searchable surface and ask more ambitious questions, such as: Which passages support this hypothesis?


For people relying on agents:

  • Long‑running agents no longer have to forget the beginning of the job by the time they reach the end. An agent that searches a database, reviews logs, cross‑references documentation, and writes a report can now keep the whole job in view.


For product managers:

  • We can load a full quarter of customer interviews, support tickets, and feature requests into one session and ask for synthesis. And save days of manual work.


For AI builders and developers:

  • We can paste an entire codebase into one session.
  • Claude can keep the architecture, logs, config files, and failed attempts in the same window instead of summarizing away the details.


We’re moving from AI that helps with tasks to AI that can run entire workflows without losing the thread halfway through.

What This Means For ChatGPT Users vs. Gemini Pro Users vs. Claude Users

This is the question most of us care about, so let’s walk through the answer, plan by plan.

ChatGPT App Users (Plus, Pro, or free)

If you’re using ChatGPT through the website or app, the whiteboard is between 5 and 8 times smaller right now.


The context window depends on which model is active:

  • GPT-5 Fast (default in Plus): 128,000 tokens
  • GPT-5 Thinking (deep reasoning mode): 196,000 tokens
  • o3/o4-mini: up to 200,000 tokens


That’s not a knock on ChatGPT’s quality. But it means you’ll hit a wall on long documents, big projects, or complex sessions that Claude now handles at full capacity.


In practice, this shows up as:

  • Pasting a long document and getting an answer that misses something from the middle
  • Starting fresh sessions because the old one “forgot” too much
  • Uploading files and getting partial analysis because only a slice fits in the active context


If you’re doing that kind of work regularly, this announcement is a real reason to try Claude.

OpenAI API Users

The picture here is more nuanced:

  • GPT-5.4 via API also supports 1.05 million tokens, but with a catch.
  • At scale, it has a pricing cliff that Claude doesn’t.
  • Below 272K input tokens, it’s $2.50/M input.
  • Cross that threshold, and OpenAI charges 2x input and 1.5x output for the full session, not just the overage.
  • A request with 300K input tokens costs roughly twice what a 250K request costs. That’s the kind of billing surprise that shows up at the end of the month, not in the middle of a session.
  • GPT-4.1 via API supports 1 million tokens at flat pricing: $2/M input, no surcharge. Genuinely flat, like Claude.
  • For API builders doing long-context work: GPT-4.1 flat vs. Claude flat are now comparable.
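A quick back-of-the-envelope sketch makes the cliff concrete. This uses the input prices quoted above; the function names and the simplified two-tier model are mine, so treat it as an illustration rather than a billing calculator, and check current price pages before relying on the numbers:

```python
def flat_input_cost(tokens: int, rate_per_m: float) -> float:
    """Flat pricing: every input token billed at the same rate.
    Rounded to 4 decimal places for display."""
    return round(tokens / 1_000_000 * rate_per_m, 4)

def cliff_input_cost(tokens: int, base_rate_per_m: float,
                     threshold: int, multiplier: float) -> float:
    """Retroactive surcharge: crossing the threshold reprices the
    ENTIRE request, not just the tokens past the line."""
    rate = base_rate_per_m * multiplier if tokens > threshold else base_rate_per_m
    return round(tokens / 1_000_000 * rate, 4)

# GPT-5.4-style tier quoted above: $2.50/M input, 2x past 272K tokens.
print(cliff_input_cost(250_000, 2.50, 272_000, 2.0))  # 0.625
print(cliff_input_cost(300_000, 2.50, 272_000, 2.0))  # 1.5 -- ~2.4x the cost for 20% more input

# Claude-style flat rate (Sonnet 4.6 at $3/M): no jump at any size.
print(flat_input_cost(300_000, 3.00))  # 0.9
```

That 20%-more-input, 2.4x-more-cost jump is exactly the end-of-month billing surprise described above.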

Gemini 3.1 Pro Users

  • Gemini 3.1 Pro also has a 1M token context window, so on raw size, it's level with Claude.
  • Gemini handles different file types (text, images, audio, video, and PDFs) more naturally.
  • For text-heavy work like documents, code, or research, Claude’s recall at 1M tokens is stronger.

Claude Users

  • If you’re on Max, Team, or Enterprise, the 1M window is now automatic with Opus 4.6; you don’t need to change any settings, and there’s no extra charge.
  • If you’re on Pro, you need to opt in by typing /extra-usage in Claude Code. It doesn’t come on automatically.


Also worth noting: Anthropic launched a March 2026 bonus usage promotion across all plans, including Free. If you want to experiment with longer sessions this month, it’s a good time to try.

The Things That Still Don’t Work Well

We need to be honest here.


Several developers tested this hard and shared what broke:

  • Cost adds up fast. A 900,000-token session with Opus 4.6 costs roughly $4.50 in input tokens alone. Fine for one-off research. Dangerous if we’re running this in a loop. Know the numbers before shipping to production.
  • The token trap. One developer tested Claude inside Cursor and watched a single AI tool call consume 800,000 tokens by pulling an entire database. Bigger window means a bigger cost explosion if we’re not careful.
  • If we’re building agents, we still need to be deliberate about what gets fed in. Context discipline still matters.
  • Context rot is not solved yet.
  • AlphaSignal's analysis points out that all long-context models still degrade at extreme lengths. Content near the beginning and end of the window is recalled more reliably than details buried in the middle.
  • One developer in the HN thread described it this way: “Opus runs into an issue and says ‘You know what? I know a simpler solution’ and continues down a path I explicitly voted down.”

Larger Context Window vs Critical AI Literacy

From a critical AI literacy perspective, my recommendation is simple: don’t let bigger context windows make you lazy.


1M tokens can encourage less disciplined prompting, context management, and upfront thinking.


A bigger window does not mean we can stop planning, but it may mean that the cost of poor planning scales faster.


This is the critical literacy lens I keep coming back to on this newsletter: when the AI tools get more powerful, the human skill that matters most is knowing how to use them well.

What This Decision Tells Us About Anthropic

This pricing change is a product strategy signal worth reading carefully.


Anthropic’s revenue is built primarily on business customers and API usage. Revenue Memo reports that roughly 70-75% of total revenue comes from API and token-based consumption, with another ~18% from Claude Code.


That’s the opposite of OpenAI, where a larger share comes from consumer subscriptions.


This means Anthropic wins when developers and companies build serious, long-running workflows on top of Claude.


Every friction point that stops a team from committing to Claude in production is a revenue leak.


The long-context surcharge was one of those friction points. Removing it is a strategy to drive deeper adoption.


The numbers suggest they can afford it: they have the runway to absorb the pricing compression and wait for volume to catch up.


One interesting exception: The opt-in friction for Pro users (they need to type /extra-usage in Claude Code to unlock 1M) breaks the “no friction” pattern.


Making it automatic only for Max, Team, and Enterprise protects margins while creating a clear upgrade signal.

The Bottom Line

The 1M context window:

  • Removes the pricing penalty for long-context work. Flat rates across the full window.
  • Delivers Claude’s best long-context recall: 78.3% on MRCR v2, compared to 26.3% for Gemini and 18.5% for the previous Claude model.
  • Makes long-running agents, big document analysis, and full-codebase sessions practical for the first time without watching the bill.
  • Widens the gap between Claude and ChatGPT for users who work with long, complex projects in the app.


For us as builders, the new question is: what would we build if forgetting stopped being a constraint?

Additional Resources

If you’re exploring Claude and want to use it better, the articles linked at the top of this post are a helpful place to begin.


Written by withattitude | AI Product Manager turning everyone into AI builders and experimenters. Substack Bestseller.
Published by HackerNoon on 2026/04/01