An LLM Agent Composed a Four-Pivot Intrusion in Real Time

Sysdig documented its first agent-driven intrusion: a reasoning model, not a pre-built playbook, improvised the path from a Marimo CVE to a database dump.

WOPR Report Editorial

On May 10, 2026, the Sysdig Threat Research Team recorded an intrusion in which a large language model agent, rather than a pre-written script or a human at a keyboard, drove the post-exploitation phase. The entry point was a public Marimo notebook vulnerable to CVE-2026-39987, a pre-authenticated remote code execution flaw that yields a shell from a single unauthenticated WebSocket request. From there the chain moved through four pivots: it harvested two cloud credentials from the host, replayed them through a fanned-out egress pool to retrieve an SSH private key from AWS Secrets Manager, opened eight short SSH sessions against an internal bastion, and read the schema and full contents of an internal PostgreSQL database in under two minutes. Sysdig puts the full chain at under one hour and describes it as the first agent-driven intrusion the team has captured. [1]

Speed and parallelism alone do not separate an agent from a well-built script, and Sysdig is explicit on that point. Its case rests on four properties of the captured command stream that indicate the path was composed in real time rather than retrieved from a playbook. First, the operator went after a database it could not have profiled in advance: it assumed a table layout shaped like a known AI-workflow application and reached for a credential table that does not exist in that application's published schema, on the strength of the name alone. Second, a planning note in Chinese, 看还能做什么, which Sysdig renders as the operator asking what else could be done, surfaced inside the command stream while the same SSH key was sourced from six separate addresses at sub-second cadence. No human types at that tempo, and no script carries such a note. Third, the commands were shaped for a machine to parse rather than a person to read. Fourth, the chain repeatedly lifted values out of prior tool output to build its next call. [1]
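The fan-out pattern described above, one resource requested from several source addresses within a second, lends itself to a simple sliding-window check. The sketch below is illustrative only and is not Sysdig's detection logic: the AccessEvent record, the function name fanned_out_resources, and the default thresholds (six sources, one second, borrowed from the figures in the report) are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical record of one access to a sensitive resource."""
    timestamp: float   # seconds since some common epoch
    resource_id: str   # e.g. a secret name or key fingerprint (assumed field)
    source_ip: str     # address the request arrived from


def fanned_out_resources(events, min_sources=6, window_seconds=1.0):
    """Return resource ids accessed from at least `min_sources` distinct
    source addresses inside any span of `window_seconds`.

    Thresholds are illustrative assumptions, not tuned values.
    """
    by_resource = defaultdict(list)
    for event in events:
        by_resource[event.resource_id].append(event)

    flagged = set()
    for resource_id, resource_events in by_resource.items():
        resource_events.sort(key=lambda e: e.timestamp)
        start = 0
        sources_in_window = defaultdict(int)
        for event in resource_events:
            sources_in_window[event.source_ip] += 1
            # Drop events that have fallen out of the time window.
            while event.timestamp - resource_events[start].timestamp > window_seconds:
                old_ip = resource_events[start].source_ip
                sources_in_window[old_ip] -= 1
                if sources_in_window[old_ip] == 0:
                    del sources_in_window[old_ip]
                start += 1
            if len(sources_in_window) >= min_sources:
                flagged.add(resource_id)
                break
    return flagged


if __name__ == "__main__":
    # Six addresses touching the same key within half a second.
    burst = [AccessEvent(0.1 * i, "ssh-key", f"203.0.113.{i}") for i in range(6)]
    print(fanned_out_resources(burst))
```

A check like this keys on the shape of the access (many sources, one resource, very short interval) rather than on any particular command, so it would not depend on the attacker reusing the same tooling.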

The structural observation is a reduction in the visibility an attacker needs. Enterprise detection has long assumed that an intruder must first map an environment before acting inside it, which gives defenders a reconnaissance phase to catch. An agent carries general priors about a class of systems and composes against whatever it encounters, collapsing that phase. In Sysdig's words, "the attacker no longer needs to see your environment to operate inside it." [1] The database host carried no application identifier on disk, no schema was staged in advance, and the chain still reached a credential table within minutes.

The shift Sysdig names is one of cost, not capability. Building and reusing a per-target playbook costs engineering time; composing the chain live against the target costs inference budget instead. That lowers the price of intrusions at this level of sophistication and raises their expected volume. The detection consequence follows directly. Signature-based recognition of a known operator's command sequences degrades, because an agent leaves a different fingerprint on every target it composes against. The detection surface that survives is the one rooted in what the attacker is trying to accomplish (reading credentials, exfiltrating a database, escalating privilege) rather than in the specific sequence of commands used to get there. That is a runtime-layer observation, and it concerns the same boundary that the governance discussion has been approaching from the defensive side: the point where an agent reasons and then acts. [1]
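The contrast between matching command signatures and matching objectives can be made concrete with a small sketch. Everything in it is an assumption for illustration: the RuntimeEvent record, the single known-bad command, and the rules in objective_of are hypothetical and far simpler than a production runtime sensor. The credential paths are the default locations used by the AWS CLI and OpenSSH, and 5432 is the default PostgreSQL port; the report does not say the attacker touched these exact paths.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class RuntimeEvent:
    """Hypothetical runtime observation: what happened, plus the command behind it."""
    kind: str      # "file_read" or "net_connect" in this sketch
    target: str    # file path, or "host:port" for connections
    command: str   # raw command line that produced the event


# Signature approach: exact command strings seen in an earlier incident (assumed).
KNOWN_BAD_COMMANDS = {"cat ~/.aws/credentials"}

# Objective approach: what the action touches, however it was phrased.
CREDENTIAL_PATH_PATTERNS = ("*/.aws/credentials", "*/.ssh/id_*")
POSTGRES_DEFAULT_PORT = "5432"


def matches_signature(event: RuntimeEvent) -> bool:
    """True only when the command text equals a previously recorded one."""
    return event.command in KNOWN_BAD_COMMANDS


def objective_of(event: RuntimeEvent) -> Optional[str]:
    """Classify an event by the goal it serves, ignoring command wording."""
    if event.kind == "file_read" and any(
        fnmatchcase(event.target, pattern) for pattern in CREDENTIAL_PATH_PATTERNS
    ):
        return "credential_access"
    if event.kind == "net_connect" and event.target.rsplit(":", 1)[-1] == POSTGRES_DEFAULT_PORT:
        return "database_access"
    return None


if __name__ == "__main__":
    scripted = RuntimeEvent("file_read", "/home/app/.aws/credentials",
                            "cat ~/.aws/credentials")
    improvised = RuntimeEvent("file_read", "/home/app/.aws/credentials",
                              "python3 -c \"print(open('/home/app/.aws/credentials').read())\"")
    for event in (scripted, improvised):
        print(matches_signature(event), objective_of(event))
```

In this toy setup the two commands read the same file, so the objective rule classifies them identically while the signature set recognizes only the one it has seen before. That is the degradation the paragraph above describes, reduced to its smallest case.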

This case enters the record as the first documented instance of an LLM agent driving post-exploitation in the wild. The publication files it as one data point in an accumulating pattern rather than a singular event. The runtime boundary is now an offensive surface as well as a governance surface, and the property that makes an agent useful inside a regulated enterprise, its capacity to read a surprise and decide what to try next, is the same property that makes it harder to observe when it is turned against one. [1][2]

[1] Sysdig Threat Research Team (Clark, Michael). "AI agent at the wheel: How an attacker used LLMs to move from a CVE to an internal database in 4 pivots." Sysdig, May 26, 2026. URL: https://www.sysdig.com/blog/ai-agent-at-the-wheel-how-an-attacker-used-llms-to-move-from-a-cve-to-an-internal-database-in-4-pivots. Accessed May 31, 2026.

[2] CVE-2026-39987 is listed on the CISA Known Exploited Vulnerabilities catalog; the Marimo terminal WebSocket flaw was addressed in version 0.23.0. Sysdig previously documented exploitation of the same vulnerability within roughly ten hours of public disclosure.

Signal published May 31, 2026 by APERION. WOPR Report.
