The AI SDK trusted its own conversation history. So does almost every agent harness you run. That trust is the attack surface.
Here is the part that should bother you. The bug Vercel just patched in its AI SDK did not require breaking into anything. It required asking politely, in the right format.
The fix landed in ai@6.0.202 on June 11. The advisory is blunt: the approval-replay path in generateText and streamText reconstructed approved tool calls from the client-supplied messages array and executed them without re-validating input against the tool's schema or re-checking that the tool actually requires approval (vercel/ai). Translated for the person who runs agents instead of building them: an attacker could hand your agent a fake transcript saying "the user already approved this dangerous action," and your agent would believe it.
The tension is not whether Vercel did the right thing. They did. They now verify an HMAC signature on the replay path. Patch and move on. The tension is that the mistake they made is the default posture of nearly every agent framework shipping today: the harness treats its own message history as trustworthy. It is not. Message history is user input wearing a name tag. This piece argues that the forgery bug is a symptom, the trust-boundary error is the disease, and the cure is something most teams have not built yet.
The forged approval worked because the harness believed its own paperwork
Start with what actually happened, because the mechanics are the whole lesson.
When you use an agent that pauses to ask "may I run this tool?", the approval gets recorded in the conversation. On the next turn, the harness reads that history to know what was already cleared. Vercel's AI SDK rebuilt approved tool calls straight from the client-supplied messages array and ran them, without checking the arguments against the tool's schema and without re-confirming the tool even needed approval in the first place (vercel/ai).
That is the entire exploit. A client could forge an assistant message with a pre-approved tool-call part and have the server execute a tool with attacker-chosen arguments (vercel/ai). No memory corruption. No stolen credential. The attacker just wrote a convincing-looking page of the agent's own diary, and the agent acted on it.
Apply the Trust Boundary Model and the flaw is obvious in hindsight. Data crossed from one trust level (the client, fully attacker-controlled) to a higher one (the execution engine, fully privileged) and nobody inspected it at the crossing. The harness assumed history equals truth. History is just text someone sent you.
This is not a Vercel-specific failure of competence. The AI SDK is one of the most carefully maintained agent toolkits in production, with security-tagged fixes shipping in routine point releases (vercel/ai). That is precisely why it matters. If the careful ones default to trusting message history, the careless ones are doing it worse and quieter.
Message history is the new untrusted input, and nobody treats it that way
Web developers learned this lesson twenty years ago and carved it into stone: never trust client input. Then the agent era arrived and quietly reintroduced the same mistake under a new name.
The conversation transcript feels like internal state. It is rendered in your UI, it carries your agent's own words, it reads like a record of what happened. But in a stateless server design, that transcript round-trips through the client on every turn. Anything that touches the client can be edited. The Vercel advisory names this exactly: approvals were reconstructed from the client-supplied messages array (vercel/ai). Client-supplied is the operative phrase. The server treated a document it did not write as if it had.
Run Attack Surface Analysis across the agent stack and the message array lights up as the largest, least-guarded interface you own. Every framework that replays history to reconstruct state, approvals, prior tool results, prior reasoning, exposes that same surface. The question for any tool you deploy is not "does it have an approval system," it is "does it cryptographically bind approvals to the server that issued them."
Vercel's fix is the right shape: verify the HMAC signature when experimental_toolApprovalSecret is configured (vercel/ai). An HMAC signature is a tamper-proof seal. The server signs the approval, and on replay it checks the seal still matches. Forge the message, break the seal, get rejected. Notice the word experimental, though. The protection exists but is opt-in. The secure path is a setting you have to find, not the default you inherit. That gap is where the next incident lives.
The Swiss Cheese view: a low-drama bug becomes a high-impact breach
On a CVE severity sheet, this looks contained. It needs a client able to craft requests, which usually means an already-authenticated session. Nuisance, not catastrophe. That reading is wrong, and the Swiss Cheese Model explains why.
Defenses fail when holes in multiple layers line up. Picture the layers an autonomous agent runs through. Layer one: the agent has real tools wired to real systems, file writes, API calls, payment actions. Layer two: the approval gate, the thing meant to keep a human in the loop on dangerous calls. Layer three: schema validation on tool arguments. The forgery bug punched through layers two and three at once. It forged the approval and skipped the schema re-check, both from the same crafted message (vercel/ai).
Now add the autonomy multiplier. A copilot that asks before every action gives a human the chance to notice the weird approval. A fully autonomous agent processing a queue of inbound tasks does not. It runs the forged approval at machine speed across every task in the batch. Same bug, vastly different blast radius, and the difference is purely a deployment choice.
This is the Autonomy Spectrum doing its quiet damage. Most failures come from deploying at the wrong point on the copilot-to-autonomy line. A forged-approval bug in a chat assistant is an annoyance. The identical bug in an agent with payment tools and no human watching is a wire transfer. The severity number on the advisory does not capture that, because severity is scored on the bug, not on how you chose to deploy the thing the bug lives in.

This is the Molt Cycle's security-crisis phase, right on schedule
Open-source agent projects move through a predictable arc: rapid growth, then a security crisis, then hardening, then enterprise adoption, then commoditization. The Vercel patch is not an isolated event. It is the whole field hitting the hardening phase at once, and the release record this month reads like a hardening checklist.
Look at what shipped alongside the Vercel fix. The clawhub package manager added trusted-publisher commands so package managers can configure or remove GitHub Actions publishing identities (openclaw/clawhub), and in the prior point release added clawhub package validate for local plugin validation with author checks while dropping the end-of-life Node 20 runtime floor (openclaw/clawhub). Those are supply-chain controls: prove who published a skill, validate it before you install it. That is the clawhub skill security story moving from afterthought to default.
The sandbox layer is hardening in parallel. E2B shipped a batch of connection-handling fixes across its JS and Python SDKs, keying HTTP transport caches on the configured proxy so clients with different proxy settings stop sharing connections (e2b-dev/E2B). Boring on its face. But connection isolation is a trust-boundary control, and getting it wrong is how one tenant's traffic leaks into another's.
The pattern across these releases is the same week, the same impulse: every layer of the agent stack is auditing the assumptions it shipped during the growth phase. The forgery bug is the headline. The quieter fixes are the field growing up.
Observability is hardening too, because you cannot defend what you cannot see
Hardening is not only about blocking the bad action. It is about being able to prove, after the fact, what your agent did and why. That is the part the observability vendors shipped this month, and it belongs in the same security conversation.
Arize Phoenix moved to v17.4.0 with agent-facing slash commands and dataset evaluator editing, and its client release added a user-invokable skill menu and experiment recording tools (Arize-ai/phoenix). Langfuse cut v3.184.1 with agent improvements and changes to error marking on its public API routes (langfuse/langfuse). LangChain's core release added package version tracking to tracing metadata (langchain-ai/langchain). Read those as plumbing and you miss the point. They are the forensic layer.
Here is the connection to the forgery bug. When an attacker forges an approval, the only way you find out is the trace: which tool ran, with what arguments, against which approval record. If your observability cannot reconstruct that chain, a forged-approval exploit is invisible until the damage shows up downstream. Tracing metadata that records the exact package version in play (langchain-ai/langchain) is the difference between "we think we patched the affected version" and "we can prove which calls ran on the vulnerable one."
Defense in depth needs a layer that watches the other layers. The agent stack is finally building it. The lesson for operators: an agent you cannot audit is an agent you cannot trust, no matter how clean its approval UI looks.
What to actually do before your agent gets the same treatment
Enough diagnosis. Here is the work, in order of how much it protects you.
First, patch. If you run anything on Vercel's AI SDK, move to ai@6.0.202 or later, and turn on the signed-approval path. The HMAC verification exists but is gated behind experimental_toolApprovalSecret (vercel/ai). Configure the secret. The fix is not active until you do. "We're on the patched version" and "we enabled the protection" are different sentences.
Second, treat your message history as hostile input. Audit any agent you run for whether approvals, tool results, or prior decisions get reconstructed from client-controlled state. If the answer is yes and there is no cryptographic binding, you have the Vercel bug whether or not anyone has found it yet. Apply the Trust Boundary Model literally: every place data crosses from client to server is a place you inspect and enforce.
Third, match autonomy to controls. Run the Autonomy Spectrum on your own deployment. An agent with destructive tools and no human in the loop needs signed approvals, schema validation on every argument, and a trace you can replay. A read-only copilot can tolerate looser controls. The failure mode is deploying the loose controls on the autonomous system.
Fourth, demand provenance on skills. The clawhub trusted-publisher and validate commands (openclaw/clawhub) exist so you can prove who shipped a skill before you install it. Use them. A forged approval is one trust failure; an unverified skill in your toolchain is another, and they compound.
The forgery bug is closed. The assumption that produced it, that an agent can trust its own paperwork, is still the default almost everywhere. Fix that and you are ahead of the next advisory. Skip it and you are waiting for it.
/Sources
- Release ai@6.0.202 · vercel/ai
- Release clawhub 0.21.0 · openclaw/clawhub
- Release clawhub 0.20.2 · openclaw/clawhub
- Release e2b@2.29.1 · e2b-dev/E2B
- Release arize-phoenix: v17.4.0 · Arize-ai/phoenix
- Release arize-phoenix-client: v2.9.0 · Arize-ai/phoenix
- Release v3.184.1 · langfuse/langfuse
- Release langchain-core==1.4.6 · langchain-ai/langchain
/Key Takeaways
- Patch to ai@6.0.202 and set experimental_toolApprovalSecret. The HMAC verification that fixes the forgery bug is opt-in, not automatic.
- Message history is client-supplied input. Any harness that reconstructs approvals or state from it without cryptographic binding has the same trust-boundary hole.
- Severity sheets score the bug, not your deployment. A forged-approval flaw is a nuisance in a copilot and a breach in a fully autonomous agent with real tools.
- The whole stack is hardening this month: clawhub added trusted-publisher and validate controls, E2B isolated sandbox connections, and Phoenix/Langfuse/LangChain shipped the forensic layer you need to catch the next one.
- Use clawhub's trusted-publisher and validate commands to prove who shipped a skill before you install it. Unverified skills are the trust failure that compounds with everything else.

