AI Agent Security: Checklist and Guide

Why we built this

AI agents are being deployed at a pace that security infrastructure cannot match. An estimated three million agents operate within US and UK enterprises today (Gravitee, 2026). Gartner projects that 40% of enterprise applications will include agentic AI by end of 2026.

The security data tells a different story from the adoption data.

36.8% of AI agent tools contain security flaws

76 confirmed malicious skills in registries

85.6% of agents lack full security review

63% of breached orgs had no AI governance policy

In the last 90 days alone: 19 backdoored npm packages targeted Cursor, Claude Code, and Windsurf (Socket, Feb 2026). 7,000+ MCP servers were found exposed on the public internet, 36.7% vulnerable to SSRF (BlueRock, Jan 2026). A state-sponsored group used Claude Code for autonomous cyber espionage across 30 organizations, where 80-90% of tactical operations required no human involvement (Anthropic GTG-1002 disclosure, Nov 2025).

There was no single resource that covered the full landscape: threats, frameworks, real incidents, regulatory requirements, and actionable defenses. So we built two.

The Checklist: 28 controls across 3 tiers

The checklist is designed for immediate action. 28 controls organized by audience, each one backed by a real incident and a specific defense. Severity-rated (CRITICAL, HIGH, MEDIUM) so teams can prioritize.

Tier 1: For Everyone Using AI Tools (9 controls)

You use Cursor, Copilot, Claude Code, or ChatGPT? This is for you.

Inspect MCP tool descriptions before approving. SANDWORM_MODE embedded hidden instructions in tool descriptions to harvest secrets.
Pin tool versions with exact numbers. Typosquatted packages (claud-code, cloude-code) grabbed developers who used npx -y.
Check .claude/ and .cursor/ for unknown configs. CVE-2025-59536: a malicious Hook executes the moment you open a cloned repo.
Disable auto-approve mode. CVE-2025-53773: prompt injection enabled unrestricted shell access via VS Code settings.
Run AI tools in containers. GTG-1002 exfiltrated from compromised workstations. A container isolates your host filesystem and credentials.

Tier 2: For Startups Shipping AI Agents (9 controls)

Building AI products? Bake security in now. It costs less than fixing a breach later.

Enforce per-agent tool allow-lists. ClawJacked (CVE-2026-25253) hijacked agents because the WebSocket accepted any origin.
Use workload identity (SPIFFE/SPIRE), not shared API keys. Assign each agent a unique cryptographic identity with short TTLs.
Sandbox each agent execution. Agent session smuggling (Unit 42, Oct 2025) showed cross-agent exploitation in shared runtimes.
Validate tool-call inputs with JSON Schema. Agent-constructed parameters can include path traversals, SQL injection, or shell commands.
Monitor for delayed-execution payloads. SANDWORM_MODE used 48-96 hour delays before activating second-stage payloads.

Tier 3: For Enterprise Security Teams (10 controls)

Managing AI at scale? Defense-in-depth across all 7 attack stages.

Deploy an MCP gateway with identity verification. 7,000+ exposed MCP servers, 492 with zero authentication (Trend Micro / BlueRock).
Implement Zero Trust for every agent action. GTG-1002 showed agents can pivot through 4+ attack stages with 80-90% autonomy.
Build SIEM rules for each Promptware Kill Chain stage. 7 stages from initial access to actions on objective.
Mandate security review for every agent deployment. Only 14.4% of organizations report all agents going live with full approval.
Write an incident response playbook. Kill switch, log collection, access tracing, kill chain classification, validator updates.

The Guide: 51 pages, 10 chapters

The checklist tells you what to do. The guide tells you why. 51 pages covering the full AI agent security ecosystem, from threat landscape through implementation.

01 The AI Agent Threat Landscape

02 The Promptware Kill Chain: 7 Stages

03 OWASP Top 10 for Agentic Applications

04 Credential Security for AI Agents

05 Supply Chain Security

06 Real-World Incident Timeline (26+)

07 Regulatory and Standards Landscape

08 Defense-in-Depth Framework

09 7 Implementation Checklists

10 Glossary and References

Every claim in the guide is backed by a CVE, academic paper, or named incident report. The data comes from scanning 43,000+ AI agent skills across 7 public registries with Aguara Watch, analysis of every documented production incident through March 2026, and mapping of all major regulatory frameworks (OWASP, NIST NCCoE, MITRE ATLAS, Cloud Security Alliance, OpenSSF).

Two levels of depth, one goal. The checklist is the starting point for immediate action. The guide is the reference for understanding the full threat model. Teams short on time start with the checklist. Teams building security programs read the guide.

What is coming in v2

This is version 1. After the first round of technical review, we identified areas to strengthen in the next release:

Planned additions for v2

Privacy and data protection for agents handling sensitive and personal data
Step-by-step implementation walkthroughs for each layer of the defense-in-depth framework
Tool comparison matrix covering scanning and enforcement options beyond Oktsec products
Credential security restructured with each exposure vector as its own subsection
2-page executive summary for teams that need the key findings without the full 51 pages

If you find errors, gaps, or have suggestions, reach out. Both documents improve with feedback from the community.