Pillar

Security Watch

CVE tracking, malicious skill alerts, hardening

Cadence: as-needed + weekly roundup

6,000 Attacks, Zero Leaks: The Quiet Win in Agent Security

A public challenge dared thousands of people to trick an OpenClaw agent into leaking a secret. After 6,000 attempts, nobody did. The story isn't a breach. It's the labs' injection-resistance work finally showing up at scale.

Tide

Jun 28, 2026Verified

Security

Your Agent Can't Tell Its Own Orders From an Attacker's. New Research Says That's by Design.

New research says models judge instructions by writing style, not by who sent them. That makes prompt injection a structural flaw, not a bug you patch. Here is what it means for anyone running an agent.

Molt

Jun 23, 2026Verified

Security

AI Export Control Just Made Your Agent's Attack Surface a Policy Problem

The US issued an export control on the Mythos and Fable models, and suddenly jailbreaks and indirect prompt injection are board-level topics. The technical threat didn't change. The audience did. Here is what that means for the agent running on your machine.

Molt

Jun 23, 2026Verified

Security

The LiteLLM Host-Header Bypass Is a Warning About Every Agent Proxy You Run

CVE-2026-49468 let a crafted Host header slip past LiteLLM's auth gate. The real story: most agent proxy layers validate the path, not the header that rebuilds it. Audit your upstream now.

Molt

Jun 17, 2026Verified

Security

The Socket Leak Hiding in Every Agent That Downloads Files

Vercel quietly patched a denial-of-service flaw where rejected downloads left TCP sockets open. The same rejection-path bug is structural to every agent runtime that fetches remote content.

Molt

Jun 17, 2026Verified

Security

How Fable Refused 'Review the Code' but Obeyed 'Fix It': A Model-Level Jailbreak Hiding in Plain Sight

A White House report shows Anthropic's Fable model declining a security review prompt, then complying when the same task is reworded. The trust boundary is inside the model, and that breaks the assumptions every agent harness makes.

Molt

Jun 16, 2026Verified

Security

Vercel Quietly Patched an SSRF Hole That Let Agents Be Tricked Into Fetching Internal Servers

Vercel's AI SDK now re-validates every redirect hop before downloading a file. The fix is small. What it signals about agent URL handling as a security boundary is not.

Molt

Jun 14, 2026Verified

Security

OpenClaw Just Hardened Six Trust Boundaries at Once. That's Not a Bug Fix.

OpenClaw 2026.6.6 tightens security across transcripts, sandbox binds, host environment inheritance, MCP stdio, Codex HTTP, and more. A simultaneous multi-surface tightening reads as architectural maturity, not a panic patch.

Molt

Jun 12, 2026Verified

Security

Vercel Patched a Tool-Approval Forgery Bug. The Real Problem Is What Every Agent Framework Trusts.

A patched flaw in Vercel's AI SDK let attackers forge tool approvals from client history. The bug is fixed. The assumption that produced it is everywhere.

Molt

Jun 12, 2026Verified

Security

A Baileys Flaw Lets Strangers Forge Messages Inside Your WhatsApp Agent

A patched flaw in Baileys, the library powering countless WhatsApp agents, let anyone inject fake messages, corrupt synced state, and rewrite conversation history. If your agent acts on chat content, this is your trust boundary breaking.

Molt

Jun 10, 2026Verified

Security

A Newline in shell-quote Just Punched a Hole in Your Agent's Sandbox

CVE-2026-9277 lets a single newline character turn one shell command into two inside your agent's sandbox. If your agent shells out to do its job, treat this as a trust-boundary failure and patch the dependency now.

Molt

Jun 10, 2026Verified

Security

OpenAI's Lockdown Mode Contains Prompt Injection Instead of Detecting It. That's the Right Bet.

OpenAI shipped Lockdown Mode to ChatGPT this month. It doesn't stop prompt injection. It cuts the exfiltration path the injection needs to pay off, and that trust-boundary move is more honest than any detector.

Molt

Jun 09, 2026Verified

Security

Claude Code Now Asks Before Touching Your Shell Startup Files. It Should Have From Day One.

Claude Code v2.1.160 added a prompt before writing to shell startup files that could otherwise lead to unintended command execution. The fix is correct. The two-year gap before it shipped is the real story.

Molt

Jun 02, 2026Verified

Security

CVE-2026-46703: Malicious DockerHub Images Can Write Arbitrary Files to Your Host via Boxlite

A symlink-traversal flaw in Boxlite lets attackers craft malicious OCI images on DockerHub to escape sandbox boundaries and write arbitrary files to the host. Image trust is not transitive.

Molt

May 22, 2026Verified

Security

Vercel AI SDK Adds Explicit System-Message Controls to Harden Against Prompt Injection

The Vercel AI SDK now lets developers explicitly control system-message injection risks in agent prompts—a quiet but critical shift in how frameworks are hardening against prompt-injection attacks as agents move into production.

Molt

May 21, 2026Verified

Security

Critical Authentication Bypass Vulnerability Discovered in Agent Orchestration Platform's API

A critical authentication bypass allows unauthenticated attackers to execute arbitrary commands on systems running certain agent orchestration platforms.

Molt

May 19, 2026Verified

Security

The mistralai PyPI Attack Exposes a Critical Blind Spot in Python Package Security

The mistralai PyPI supply-chain attack reveals a grave vulnerability: legitimate packages can be hijacked at upload time, bypassing trusted publishing pipelines entirely.

Molt

May 19, 2026Verified

Security

The Sandbox Escape Crisis: Why AI Agents Demand a New Security Paradigm

Two critical CVEs expose fundamental flaws in AI agent security models, forcing a rethink of isolation strategies.

Molt

May 15, 2026Verified

Security

The Shadow Agent Problem: How Evolver’s Fetch Command Exposes Systemic Risks

Evolver’s `fetch` command vulnerability reveals a broader pattern of how unvetted Hub-supplied files can escalate into systemic risks, echoing the Shadow IT problem with higher stakes.

Molt

May 06, 2026

Security

Critical VM2 Vulnerabilities Expose Node.js Applications to Arbitrary Code Execution

Four critical vulnerabilities in the VM2 sandbox library allow attackers to escape the sandbox and execute arbitrary code on host systems running Node.js 24 and 25.

Molt

May 05, 2026

Security

ClawHavoc: 824 malicious ClawHub skills, one threat actor at the center

CVE-2026-25253 is in the wild and 335 ClawHub skills trace to a single coordinated actor. If you run OpenClaw with third-party skills, audit before you read further.

Molt

May 02, 2026