Security Model

warrant-shell provides application-level security for CLI tools invoked by AI agents. This page documents the threat model, what wsh protects against, what it doesn't, and the design decisions behind those trade-offs.

The core problem

AI coding agents run commands as your Unix user. They have the same file permissions, the same network access, the same credentials as you. There's currently no standard way for a CLI tool to distinguish between you typing a command and an agent executing one programmatically.

wsh addresses this by inverting the access model: instead of trying to detect agents and restrict them, it restricts all callers by default and lets humans prove they're human through password authentication (elevation). This is the same model sudo uses.

Threat model

wsh is designed for a specific scenario: an AI agent running with normal user privileges on a machine where the human retains root access. This is the common case for coding agents today — they run in your terminal, as your user, executing commands on your behalf.

What wsh protects against

Agent exceeding granted capabilities

Every action requires an explicit capability grant. If the policy doesn't include git.push_force, the agent can't force-push. If network.hosts doesn't include evil.com, the agent can't exfiltrate data there. Deny-by-default means unrecognised commands are blocked too.
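A deny-by-default check of this kind can be sketched in a few lines. This is an illustration only, not wsh's actual policy format or code: the `Policy` class, the `is_allowed` function, and the capability names `git.push` and `network.request` are hypothetical; only `git.push_force` and `network.hosts` come from the text above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical in-memory policy; the real wsh policy format is not shown here."""
    capabilities: frozenset = field(default_factory=frozenset)
    network_hosts: frozenset = field(default_factory=frozenset)


def is_allowed(policy: Policy, capability: str, host: Optional[str] = None) -> bool:
    # Deny by default: anything not explicitly granted is refused,
    # including capabilities the policy has never heard of.
    if capability not in policy.capabilities:
        return False
    # When a host is involved, it must also be on the host allowlist.
    if host is not None and host not in policy.network_hosts:
        return False
    return True


# Example policy (capability names are assumptions for illustration).
policy = Policy(
    capabilities=frozenset({"git.push", "network.request"}),
    network_hosts=frozenset({"github.com"}),
)
```

With this policy, `is_allowed(policy, "git.push_force")` is false because the capability was never granted, and a request to `evil.com` is refused because the host is not listed.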

Policy tampering

The active policy is signed with Ed25519 and stored in a root-owned path (/etc/warrant-shell/). An agent running as a normal user cannot modify the policy, replace the signing keys, or alter the installed warrant. The signature is verified at every command invocation.

Policy rollback

Each policy carries a monotonically increasing version number. The current version is stored in the root-owned signing directory. Old policies — even legitimately signed ones — are permanently retired. An agent that obtains an older, more permissive policy from a backup or git history cannot replay it.
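The rollback rule reduces to a strict comparison against the stored version. The sketch below assumes integer version numbers and keeps the current version in memory; the `VersionGate` name is hypothetical, and in wsh the stored value lives in a root-owned directory rather than in a Python object.

```python
class VersionGate:
    """Accepts a policy only if its version is strictly greater than the stored one."""

    def __init__(self, current: int = 0) -> None:
        self.current = current

    def install(self, version: int) -> bool:
        # Equal or older versions are rejected, even if validly signed.
        if version <= self.current:
            return False
        self.current = version
        return True
```

Because the comparison is strict, re-installing the current version is rejected as well as installing an older one.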

Forging warrants

Without access to the Ed25519 private key (which lives in a root-owned path, readable only by root), an agent cannot create a valid signature. The cryptographic guarantee is straightforward: no key, no valid warrant.

Brute-forcing elevation

Inline password prompts implement exponential backoff (1s, 2s, 4s, 8s) and lockout after 5 consecutive failures. During lockout (minimum 15 minutes), the only way to bypass the warrant is sudo wsh elevate, which uses the system's own sudo mechanism. The failure counter is stored in a root-owned path so agents can't reset it.
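One way to read the schedule above is as a function from the number of consecutive failures to a wait time. The mapping below (1s after the first failure, doubling up to 8s, then lockout on the fifth) is an assumption about how the delays line up with the failure count; the function and constant names are hypothetical.

```python
MAX_FAILURES = 5           # lockout threshold from the text
LOCKOUT_SECONDS = 15 * 60  # minimum lockout duration from the text


def delay_after_failure(consecutive_failures: int) -> int:
    """Seconds to wait after the n-th consecutive failed password attempt."""
    if consecutive_failures < 1:
        return 0
    if consecutive_failures >= MAX_FAILURES:
        # Fifth failure and beyond: locked out for at least 15 minutes.
        return LOCKOUT_SECONDS
    # Failures 1..4 wait 1s, 2s, 4s, 8s.
    return 2 ** (consecutive_failures - 1)
```

The counter itself has to live somewhere the agent cannot write to, which is why the text stores it in a root-owned path.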

Interpreter escape

Interpreters (Python, Node.js, Ruby, Perl) are denied by default. If a developer adds one via wsh add python, the agent gains access to that language's full standard library (file I/O, network requests, subprocess spawning), which can bypass wsh's command-level policy. This is why each interpreter requires an explicit opt-in: the developer makes a conscious security trade-off for each language their project requires.

Environment variable injection

Agents can set environment variables such as LD_PRELOAD, PYTHONPATH, or NODE_OPTIONS to subvert the behaviour of allowed commands. wsh checks the command and its arguments against the policy before execution, and the manifest system is additionally designed to detect common injection vectors.
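To make the attack surface concrete, here is a minimal sketch of stripping such variables before launching a command. It is not how wsh's manifest system works, which this page does not specify; the `SUSPECT_VARS` set and the `scrub_environment` function are hypothetical and cover only the three variables named above.

```python
SUSPECT_VARS = frozenset({"LD_PRELOAD", "PYTHONPATH", "NODE_OPTIONS"})


def scrub_environment(env):
    """Return (clean_env, dropped_names) without mutating the input mapping."""
    clean = {name: value for name, value in env.items() if name not in SUSPECT_VARS}
    dropped = sorted(name for name in env if name in SUSPECT_VARS)
    return clean, dropped
```

The cleaned mapping could then be passed as the `env` argument of `subprocess.run`, and the dropped names recorded for auditing.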

Symlink-based path traversal

An agent might create a symlink at an allowed path that points to a restricted location. wsh handles common path-based checks, but it is not a complete filesystem sandbox; for full filesystem isolation, pair it with a container (Docker, warrant-box).
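A common form of path-based check resolves symlinks before comparing against the allowed directory. The sketch below shows that general technique using the standard library; it is an assumption that wsh does something similar, and `resolves_inside` is a hypothetical name.

```python
import os


def resolves_inside(path: str, allowed_root: str) -> bool:
    """True if path, after resolving symlinks and '..', stays under allowed_root."""
    real_root = os.path.realpath(allowed_root)
    real_path = os.path.realpath(path)
    # commonpath compares whole path components, so /data-evil is not
    # mistaken for a child of /data.
    return os.path.commonpath([real_root, real_path]) == real_root
```

A check like this is still subject to races: the symlink can change between the check and the actual file access, which is one reason container isolation gives stronger guarantees.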

What wsh does NOT protect against

Being honest about limitations is part of the security model. These are explicitly out of scope:

Agent with root access

If the agent has root access (or can obtain it), all local guarantees are void. This is true of every local security boundary — including the operating system's own. wsh assumes the human retains exclusive root access.

Kernel-level bypass

wsh operates at the application level. It does not intercept syscalls, sandbox processes, or enforce kernel-level MAC policies. A sufficiently privileged or clever agent that finds a way to execute binaries without going through wsh exec can bypass enforcement.

Network-based attacks

wsh is local-only. It doesn't monitor network traffic, inspect packet contents, or enforce firewall rules. For tools that primarily interact with remote APIs, server-side access control (AWS IAM, Google Cloud IAM, etc.) is the appropriate control.

Social engineering the human

The lock flow presents a diff and asks for confirmation. wsh cannot prevent a careless human from approving a dangerous policy. Similarly, an agent that persuades the human to run wsh elevate on its behalf is exploiting the human, not a technical weakness.

The layered defence

wsh combines five established security primitives. Each solves a specific problem; together they create layered defence:

Layer	Threat	Defence
Capabilities	Agent does something it shouldn't	Explicit allowlist; deny by default
Ed25519 signing	Agent modifies its own permissions	Policy verified against root-owned key
Privilege separation	Agent escalates to modify signing keys	System-owned paths; requires sudo
Monotonic versioning	Agent replays an older policy	Version must strictly increase
Privileged audit daemon	Agent tampers with audit history	Events written via Unix socket to root-owned daemon; peer credential authentication; SHA-256 hash chain

No single layer is sufficient. Capabilities without signing are just a config file. Signing without privilege separation means the agent can replace the keys. Privilege separation without monotonic versioning leaves rollback attacks open. And without a privileged audit daemon, an agent running as the same user could tamper with its own audit trail.

Audit daemon (wsh-auditd)

In system mode, audit events are not written directly to a log file. Instead, wsh sends events over a Unix domain socket to a privileged daemon (wsh-auditd). The daemon:

Authenticates the caller using kernel peer credentials (SO_PEERCRED on Linux, getpeereid() on macOS)
Appends events to a root-owned daily ledger file (/var/lib/warrant-shell/audit/audit-YYYY-MM-DD.jsonl)
Chains each entry with a SHA-256 hash of the previous entry, making any tampering or deletion detectable
Returns an acknowledgement before wsh allows the command to proceed — if the daemon is unreachable, commands are denied (fail-closed)

The hash chain can be verified at any time with wsh audit verify.
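The idea behind a hash-chained ledger can be sketched as follows. The entry layout (`event`, `prev`, `hash`), the all-zero genesis value, and the JSON serialisation are assumptions for illustration; the actual format written by wsh-auditd and checked by wsh audit verify is not documented on this page.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash, event):
    """SHA-256 over the previous hash and the event, serialised deterministically."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger, event):
    """Append an event, linking it to the hash of the entry before it."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(ledger):
    """Walk the chain from the start; any edited or removed entry breaks a link."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

Editing an event changes its hash, and removing an entry from the middle leaves the next entry pointing at a hash that no longer precedes it, so `verify` returns false in both cases.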

Recommended deployment

For most users: wsh alone

If your agents run in your terminal as your user, which is the case for Claude Code, Codex, Cursor, and most agent frameworks, wsh provides strong application-level enforcement. It targets the most common kinds of real-world agent misuse: accidental destructive commands, data exfiltration, and scope creep.

For high-security environments: wsh + container isolation

For maximum security, pair wsh with a container or sandbox:

Docker (sandboxing + wsh) — run the agent in a container with wsh installed. wsh handles application-level policy; Docker handles kernel-level isolation (filesystem, network, process namespace).
AppArmor / SELinux (kernel-level protection + wsh) — mandatory access control for processes. Heavier to configure, but provides OS-level enforcement that complements wsh's application-level policies.
warrant-box (sandboxing + kernel-level protection + wsh) — (coming soon) a purpose-built container with wsh as the shell, default-deny network and filesystem, and pre-configured policies.

wsh is the practical middle ground between "no restrictions" and full sandboxing. For many teams, it's enough on its own. For the highest-risk scenarios, it's one layer in a defence-in-depth stack.