Inside OpenAI, the Agent Boom Isn't About Code. It's About Everything Else.

OpenAI's internal data shows Codex token usage grew fastest in Research and Customer Support, ahead of Engineering, with Legal up 13x as well. The real productivity shift inside the lab is autonomous knowledge work, not code generation.

Pinch · Jun 26, 2026 · Part of Agent SDKs

The marquee use case for AI agents was always supposed to be writing software. OpenAI's own usage data, surfaced through its economic research arm, says otherwise.

There is a tidy story about AI agents that almost everyone tells, and OpenAI just quietly contradicted its own version of it.

The story goes like this: large language models are, at their core, coding tools. They write functions and fix bugs, and the marginal dollar of agent value flows to engineering teams shipping software faster. It is a clean narrative, and it is wrong about where the action actually is.

According to data surfaced by OpenAI's economic research team, median internal Codex output tokens grew 56x in Research, 32x in Customer Support, 27x in Engineering, and 13x in Legal since November 2025, per a report aggregated by Latent Space. Read that ordering again. Engineering, the department the entire product was named and built for, came third. Research, a knowledge-work function, grew more than twice as fast.
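The ordering is easy to check by hand, but a quick sketch makes the gaps concrete. It uses only the four multiples quoted above; the variable names are illustrative. These are growth multiples against each department's own starting point, so the ranking says nothing about which department burns the most tokens in absolute terms.

```python
# Growth multiples for internal Codex output tokens since November 2025,
# as quoted in the report. These are relative to each department's own
# baseline, not absolute token volumes.
growth = {
    "Research": 56,
    "Customer Support": 32,
    "Engineering": 27,
    "Legal": 13,
}

# Rank departments from fastest to slowest growth.
ranked = sorted(growth.items(), key=lambda item: item[1], reverse=True)

# Express each multiple as a ratio of Engineering's multiple.
relative_to_engineering = {
    dept: round(multiple / growth["Engineering"], 2)
    for dept, multiple in growth.items()
}

for rank, (dept, multiple) in enumerate(ranked, start=1):
    ratio = relative_to_engineering[dept]
    print(f"{rank}. {dept}: {multiple}x ({ratio}x Engineering's multiple)")
```

Research comes out at roughly 2.07 times Engineering's multiple, which is where "more than twice as fast" comes from. Legal lands at about 0.48, well behind Engineering on growth even though its 13x is still large for a department that does not ship code.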

The same report notes that through August 2025, the average OpenAI worker spent less than 10% of their tokens on Codex. Then that changed. Over six months, agent usage deepened and spread across departments that do not, in any traditional sense, write code for a living.

This is the most honest signal we have about where autonomous work is creating value, because it comes from inside the company with the most sophisticated agent users on earth, spending their own compute on their own problems. It is not a product launch. It is not a projection. It is revealed preference. And the preference is for agents that do the job, not agents that write the program that does the job.

Figure: Where OpenAI's agents are actually doing the most work, and it's not where you'd expect.

Sources for this article

11 collected in pack · 0 cited & verified in body

This is the full source pack collected for the story. It is the pool the writer cites from, which is why the pack count can exceed the citations in the body. Tier labels reflect domain authority; freshness is re-checked daily. How each load-bearing claim binds to this pack is itemized in the claims panel below.

[AINews] OpenAI reports median internal Codex output tokens grew 56x in Research, 32x in Customer Support, 27x in Engineering, and 13x in Legal since November 2025. (www.latent.space, Reputable)
AI and Liability (simonwillison.net, Reputable)
Release: datasette-export-database 0.3a2 (simonwillison.net, Reputable)
The Sequence AI of the Week #883: Qwen is Getting Into Robotics (thesequence.substack.com, Community)
An Interview with Figma CEO Dylan Field About Design and AI (stratechery.com, Reputable)
The Sequence Knowledge #882: A New Series About Distillation (thesequence.substack.com, Community)
My Vibe Coding Adventure, The App and the Experience, Ten Takeaways (stratechery.com, Reputable)
Memory Chips and China, Microsoft and Chinese Models (stratechery.com, Reputable)
CVE-2026-55166 - GitHub Advisory Database (github.com, Official)
Release v3.199.0 · langfuse/langfuse (github.com, Reputable)
CVE-2026-48713 - GitHub Advisory Database (github.com, Official)

Load-bearing claims

The writer flagged these claims as load-bearing. Where a cited source supports the claim, the row links out to it; confidence labels reflect how directly the source backs the assertion. We surface unverified claims honestly rather than hide them.

3 confirmed · 4 analysis

3/3 bound to a pack source

Confirmed: Internal Codex output tokens grew 56x in Research, 32x in Customer Support, 27x in Engineering, and 13x in Legal at OpenAI since November 2025.
Source: [AINews] OpenAI reports median internal Codex output tokens grew 56x in Research, 32x in Customer Support, 27x in Engineering, and 13x in Legal since November 2025.

Confirmed: Through August 2025, the average OpenAI worker spent less than 10% of their tokens on Codex, and usage then deepened and intensified across departments.
Source: [AINews] OpenAI reports median internal Codex output tokens grew 56x in Research, 32x in Customer Support, 27x in Engineering, and 13x in Legal since November 2025.

Analysis: Research posting the largest growth multiple indicates the binding constraint on knowledge work was reading and routing, not code generation.
Source: [AINews] OpenAI reports median internal Codex output tokens grew 56x in Research, 32x in Customer Support, 27x in Engineering, and 13x in Legal since November 2025.

Confirmed: Bruce Schneier argues that AI agents are agents of the organization that deploys them and should be treated by law as such, making the company liable for their errors.
Source: AI and Liability

Analysis: A 13x growth in Codex tokens inside Legal cannot plausibly reflect lawyers shipping software, so it must reflect non-code knowledge work.
Source: [AINews] OpenAI reports median internal Codex output tokens grew 56x in Research, 32x in Customer Support, 27x in Engineering, and 13x in Legal since November 2025.

Analysis: Coding served as a go-to-market wedge for agents because output is verifiable and willingness to pay is high, with adoption rolling into adjacent non-coding departments.

Analysis: Liability for agent output follows the degree of autonomy a deployer grants, mirroring the departmental usage growth.
Source: AI and Liability

