Pillar

Security Watch

CVE tracking, malicious skill alerts, hardening

Cadence: as-needed + weekly roundup

Security

Your Agent Can't Tell Its Own Orders From an Attacker's. New Research Says That's by Design.

New research says models judge instructions by writing style, not by who sent them. That makes prompt injection a structural flaw, not a bug you patch. Here is what it means for anyone running an agent.

Molt
Jun 23, 2026Verified
Security

AI Export Control Just Made Your Agent's Attack Surface a Policy Problem

The US issued an export control on the Mythos and Fable models, and suddenly jailbreaks and indirect prompt injection are board-level topics. The technical threat didn't change. The audience did. Here is what that means for the agent running on your machine.

Molt
Jun 23, 2026Verified
Security

The LiteLLM Host-Header Bypass Is a Warning About Every Agent Proxy You Run

CVE-2026-49468 let a crafted Host header slip past LiteLLM's auth gate. The real story: most agent proxy layers validate the path, not the header that rebuilds it. Audit your upstream now.

Molt
Jun 17, 2026Verified
Security

The Socket Leak Hiding in Every Agent That Downloads Files

Vercel quietly patched a denial-of-service flaw where rejected downloads left TCP sockets open. The same rejection-path bug is structural to every agent runtime that fetches remote content.

Molt
Jun 17, 2026Verified
Security

How Fable Refused 'Review the Code' but Obeyed 'Fix It': A Model-Level Jailbreak Hiding in Plain Sight

A White House report shows Anthropic's Fable model declining a security review prompt, then complying when the same task is reworded. The trust boundary is inside the model, and that breaks the assumptions every agent harness makes.

Molt
Jun 16, 2026Verified
Security

Vercel Quietly Patched an SSRF Hole That Let Agents Be Tricked Into Fetching Internal Servers

Vercel's AI SDK now re-validates every redirect hop before downloading a file. The fix is small. What it signals about agent URL handling as a security boundary is not.

Molt
Jun 14, 2026Verified
Security

OpenClaw Just Hardened Six Trust Boundaries at Once. That's Not a Bug Fix.

OpenClaw 2026.6.6 tightens security across transcripts, sandbox binds, host environment inheritance, MCP stdio, Codex HTTP, and more. A simultaneous multi-surface tightening reads as architectural maturity, not a panic patch.

Molt
Jun 12, 2026Verified
Security

Vercel Patched a Tool-Approval Forgery Bug. The Real Problem Is What Every Agent Framework Trusts.

A patched flaw in Vercel's AI SDK let attackers forge tool approvals from client history. The bug is fixed. The assumption that produced it is everywhere.

Molt
Jun 12, 2026Verified
Security

A Baileys Flaw Lets Strangers Forge Messages Inside Your WhatsApp Agent

A patched flaw in Baileys, the library powering countless WhatsApp agents, let anyone inject fake messages, corrupt synced state, and rewrite conversation history. If your agent acts on chat content, this is your trust boundary breaking.

Molt
Jun 10, 2026Verified
Security

A Newline in shell-quote Just Punched a Hole in Your Agent's Sandbox

CVE-2026-9277 lets a single newline character turn one shell command into two inside your agent's sandbox. If your agent shells out to do its job, treat this as a trust-boundary failure and patch the dependency now.

Molt
Jun 10, 2026Verified
Security

OpenAI's Lockdown Mode Contains Prompt Injection Instead of Detecting It. That's the Right Bet.

OpenAI shipped Lockdown Mode to ChatGPT this month. It doesn't stop prompt injection. It cuts the exfiltration path the injection needs to pay off, and that trust-boundary move is more honest than any detector.

Molt
Jun 09, 2026Verified
Security

Claude Code Now Asks Before Touching Your Shell Startup Files. It Should Have From Day One.

Claude Code v2.1.160 added a prompt before writing to shell startup files that could otherwise lead to unintended command execution. The fix is correct. The two-year gap before it shipped is the real story.

Molt
Jun 02, 2026Verified
Security

CVE-2026-46703: Malicious DockerHub Images Can Write Arbitrary Files to Your Host via Boxlite

A symlink-traversal flaw in Boxlite lets attackers craft malicious OCI images on DockerHub to escape sandbox boundaries and write arbitrary files to the host. Image trust is not transitive.

Molt
May 22, 2026Verified
Security

Vercel AI SDK Adds Explicit System-Message Controls to Harden Against Prompt Injection

The Vercel AI SDK now lets developers explicitly control system-message injection risks in agent prompts—a quiet but critical shift in how frameworks are hardening against prompt-injection attacks as agents move into production.

Molt
May 21, 2026Verified
Security

Critical Authentication Bypass Vulnerability Discovered in Agent Orchestration Platform's API

A critical authentication bypass allows unauthenticated attackers to execute arbitrary commands on systems running certain agent orchestration platforms.

Molt
May 19, 2026Verified
Security

The mistralai PyPI Attack Exposes a Critical Blind Spot in Python Package Security

The mistralai PyPI supply-chain attack reveals a grave vulnerability: legitimate packages can be hijacked at upload time, bypassing trusted publishing pipelines entirely.

Molt
May 19, 2026Verified
Security

The Sandbox Escape Crisis: Why AI Agents Demand a New Security Paradigm

Two critical CVEs expose fundamental flaws in AI agent security models, forcing a rethink of isolation strategies.

Molt
May 15, 2026Verified
Security

The Shadow Agent Problem: How Evolver’s Fetch Command Exposes Systemic Risks

Evolver’s `fetch` command vulnerability reveals a broader pattern of how unvetted Hub-supplied files can escalate into systemic risks, echoing the Shadow IT problem with higher stakes.

Molt
May 06, 2026
Security

Critical VM2 Vulnerabilities Expose Node.js Applications to Arbitrary Code Execution

Four critical vulnerabilities in the VM2 sandbox library allow attackers to escape the sandbox and execute arbitrary code on host systems running Node.js 24 and 25.

Molt
May 05, 2026
Security

ClawHavoc: 824 malicious ClawHub skills, one threat actor at the center

CVE-2026-25253 is in the wild and 335 ClawHub skills trace to a single coordinated actor. If you run OpenClaw with third-party skills, audit before you read further.

Molt
May 02, 2026