# Securing LLM Agents: The Security Trifecta in Dokugent Plans

## Why AI Workflow Security Matters
When agents can access private data, process untrusted content, and call external APIs, you have a recipe for chaos. This is what Simon Willison recently called the "lethal trifecta" of agent security risks, and Andrej Karpathy just boosted the signal:
> "I should clarify that the risk is highest if you're running local LLM agents (e.g. Cursor, Claude Code, etc.). If you're just talking to an LLM on a website (e.g. ChatGPT), the risk is much lower unless you start turning on Connectors."
>
> "For example I just saw ChatGPT is adding MCP support. This will combine especially poorly with all the recently added memory features — e.g. imagine ChatGPT telling everything it knows about you to some attacker on the internet just because you checked the wrong box in the Connectors settings."
>
> — Andrej Karpathy
At Dokugent, we treat this as a first-class design problem, not an afterthought.
## Security Metadata at the Plan Level
As of June 2025, every step in a Dokugent plan now supports a security object:
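As a minimal sketch, a step carrying this metadata might look like the following inside `plan.json`. The `untrustedContent` and `externalComms` flag names come from the simulate output shown below; `privateData` and the surrounding step fields are illustrative assumptions, not a confirmed schema:

```json
{
  "id": "web_lookup",
  "description": "Fetch and summarize a page from the open web",
  "security": {
    "privateData": false,
    "untrustedContent": true,
    "externalComms": true
  }
}
```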
This metadata is collected interactively when designing an agent via `dokugent plan`, and is embedded into the resulting `plan.json`. That same `plan.json` also forms the core of the final signed `.cert.json` file used in Dokugent's certification system, enabling cryptographic traceability without requiring a blockchain layer.
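For a rough picture of where the plan ends up, here is one way a signed certificate could wrap it. Every field name below is an assumption made purely for illustration; the only grounded detail is that `.cert.json` embeds the signed `plan.json`:

```json
{
  "plan": { "steps": ["...the plan.json contents, including per-step security objects..."] },
  "planHash": "sha256:<hash of plan.json>",
  "signature": "<detached signature over planHash>",
  "signedAt": "2025-06-01T00:00:00Z"
}
```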
## How It Works in the CLI
During the plan wizard:
```
Involves any of the following? (space to select)
[x] Access to Private Data
[x] External Communication
```
During `dokugent simulate`, this gets evaluated automatically:
```
⚠️ Step "web_lookup" is marked HIGH RISK due to:
  - untrustedContent
  - externalComms
→ Skipping this step unless --force is passed.
```
## Why This Matters
Just like secure coding practices, secure agent design needs to be traceable, auditable, and testable. Dokugent enables that by:
- Making risks explicit
- Logging metadata
- Allowing simulation to enforce policy or warn (sketched after this list)
- Enabling future plugins to gate deployment
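To make the enforcement idea concrete, here is a minimal TypeScript sketch of a simulate-time gate. The flag names `untrustedContent` and `externalComms` come from the output above; `privateData`, the type names, the two-flag threshold, and the function itself are illustrative assumptions, not Dokugent's actual implementation:

```typescript
// Illustrative sketch only; not Dokugent's real source.

interface StepSecurity {
  privateData?: boolean;      // assumed name for "Access to Private Data"
  untrustedContent?: boolean; // name taken from the simulate output
  externalComms?: boolean;    // name taken from the simulate output
}

interface PlanStep {
  id: string;
  security?: StepSecurity;
}

const TRIFECTA = ["privateData", "untrustedContent", "externalComms"] as const;

// Collect the risk flags a step has opted into.
function riskFlags(step: PlanStep): string[] {
  const s = step.security ?? {};
  return TRIFECTA.filter((flag) => s[flag] === true);
}

// Assumed policy: two or more trifecta ingredients marks a step HIGH RISK,
// and it is skipped unless the user explicitly passes --force.
function shouldSkip(step: PlanStep, force: boolean): boolean {
  const flags = riskFlags(step);
  if (flags.length < 2 || force) return false;
  console.warn(`⚠️ Step "${step.id}" is marked HIGH RISK due to:`);
  for (const flag of flags) console.warn(`  - ${flag}`);
  console.warn("→ Skipping this step unless --force is passed.");
  return true;
}

// Example: the web_lookup step from the output above.
shouldSkip(
  { id: "web_lookup", security: { untrustedContent: true, externalComms: true } },
  false,
); // logs the HIGH RISK warning and returns true
```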
## What’s Next
This is just the beginning. We’re currently experimenting with:
- A `--secure` mode in `simulate` to skip or sandbox risky steps
- Security scoring in `dokugent certify`
- A future security `--doctor` command to lint entire agent workflows
## Try It Yourself
```bash
npx dokugent init
dokugent plan
dokugent simulate
```
Agent workflows should be safe by default and traceable when they aren’t.