Secure Agent Design Patterns
Battle-tested architecture patterns for building secure AI agents. Defense-in-depth strategies, rate limiting, output validation, and sandboxing techniques.
Principles of Secure Agent Design
Building secure AI agents requires a defense-in-depth approach. No single security control is sufficient; instead, multiple layers of protection work together to minimize risk.
The Three Pillars
- Input Validation - Never trust user input or LLM output
- Least Privilege - Grant agents only the permissions they need (see the allowlist sketch after this list)
- Bounded Execution - Always limit time, tokens, and iterations
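A minimal sketch of the least-privilege pillar, assuming a simple per-role tool allowlist (the role and tool names are illustrative, not from any framework):

AGENT_TOOL_ALLOWLIST = {
    "researcher": {"web_search", "read_document"},
    "summarizer": {"read_document"},
}

def authorize_tool(agent_role: str, tool_name: str) -> None:
    # Deny by default: an unknown role gets an empty allowlist.
    allowed = AGENT_TOOL_ALLOWLIST.get(agent_role, set())
    if tool_name not in allowed:
        raise PermissionError(f"{agent_role!r} may not use {tool_name!r}")

Deny-by-default keeps a compromised or confused agent from reaching tools it was never granted.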
Pattern Library
Rate Limiting and Token Budgets
Prevent token bombing attacks by implementing strict budgets at the session level.
class BudgetExceededError(Exception): pass

class TokenBudget:
    def __init__(self, max_tokens: int, max_calls: int):
        self.max_tokens = max_tokens
        self.max_calls = max_calls
        self.used_tokens = 0
        self.call_count = 0

    def check_budget(self, tokens: int) -> bool:
        if self.used_tokens + tokens > self.max_tokens:
            raise BudgetExceededError("Token limit reached")
        if self.call_count >= self.max_calls:
            raise BudgetExceededError("Call limit reached")
        return True

    def consume(self, tokens: int):
        self.check_budget(tokens)
        self.used_tokens += tokens
        self.call_count += 1
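A minimal usage sketch showing per-session limits, a soft warning, and exhaustion logging (the 80% threshold and the limit values are illustrative):

import logging

logger = logging.getLogger("agent.budget")

budget = TokenBudget(max_tokens=50_000, max_calls=25)  # per-session limits

def record_usage(tokens: int) -> None:
    try:
        budget.consume(tokens)
    except BudgetExceededError:
        # Log every budget-exhaustion event before surfacing it.
        logger.warning("budget exhausted: %d/%d tokens",
                       budget.used_tokens, budget.max_tokens)
        raise
    # Soft warning at 80% of the hard limit (threshold is illustrative).
    if budget.used_tokens > 0.8 * budget.max_tokens:
        logger.info("budget at %.0f%%",
                    100 * budget.used_tokens / budget.max_tokens)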
Key Considerations:
- Set per-session and per-user limits
- Implement soft warnings before hard limits
- Log all budget exhaustion events
Output Validation
Never execute or trust LLM output without validation.
import ast

class SecurityError(Exception): pass

def safe_eval(code: str, allowed_functions: dict):
    """
    Parse and validate code before execution. Only whitelisted
    function calls are allowed; allowed_functions maps each
    permitted name to its callable.
    """
    tree = ast.parse(code, mode='eval')
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Reject attribute calls like obj.method(): only plain,
            # whitelisted names may be called.
            if not isinstance(node.func, ast.Name):
                raise SecurityError("Only direct function calls are allowed")
            if node.func.id not in allowed_functions:
                raise SecurityError(f"Function {node.func.id} not allowed")
    # Strip builtins so only whitelisted names resolve at eval time.
    return eval(compile(tree, '<string>', 'eval'),
                {"__builtins__": {}}, dict(allowed_functions))
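For example, with a whitelist of just min and max (values here are illustrative):

allowed = {"min": min, "max": max}

safe_eval("max(1, min(2, 3))", allowed)              # returns 2
safe_eval("__import__('os').system('ls')", allowed)  # raises SecurityError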
Validation Strategies:
- Whitelist allowed operations
- Use AST parsing for code analysis
- Implement content security policies
Sandboxing and Isolation
Execute untrusted code in isolated environments.
Docker-based Sandbox:
# docker-compose.sandbox.yml
services:
  agent-sandbox:
    image: agent-runtime:latest
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    read_only: true
    networks:
      - isolated
    mem_limit: 512m
    cpus: 0.5

networks:
  isolated:
    internal: true  # no external route: sandboxed code cannot phone home
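To enforce timeouts from the host side, a launcher can kill any run that exceeds a wall-clock limit. A minimal sketch, assuming the compose file above and Docker Compose v2 (the 30-second default is illustrative):

import subprocess

def run_in_sandbox(command: list[str], timeout_s: int = 30) -> str:
    """Run a command in the sandbox service; kill it if it overruns."""
    result = subprocess.run(
        ["docker", "compose", "-f", "docker-compose.sandbox.yml",
         "run", "--rm", "agent-sandbox", *command],
        capture_output=True, text=True, timeout=timeout_s,
    )
    return result.stdout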
Best Practices:
- No network access for code execution
- Read-only file systems
- CPU and memory limits
- Short execution timeouts
Iteration Limits and Circuit Breakers
Prevent infinite loops with explicit iteration limits.
import time

class CircuitOpenError(Exception): pass
class MaxIterationsError(Exception): pass

class CircuitBreaker:
    def __init__(self, max_iterations: int = 10, cooldown: int = 60):
        self.max_iterations = max_iterations
        self.cooldown = cooldown  # seconds before an open breaker resets
        self.iteration_count = 0
        self.is_open = False
        self.opened_at = 0.0

    def execute(self, fn):
        # Reset the breaker once the cooldown has elapsed.
        if self.is_open and time.monotonic() - self.opened_at >= self.cooldown:
            self.is_open = False
            self.iteration_count = 0
        if self.is_open:
            raise CircuitOpenError("Circuit breaker is open")
        self.iteration_count += 1
        if self.iteration_count > self.max_iterations:
            self.is_open = True
            self.opened_at = time.monotonic()
            raise MaxIterationsError("Max iterations exceeded")
        return fn()
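A usage sketch: wrap each agent iteration in the breaker so a runaway loop halts cleanly (agent_step is a stub standing in for one plan/act/observe cycle):

import logging

breaker = CircuitBreaker(max_iterations=10, cooldown=60)

def agent_step() -> bool:
    """One plan/act/observe iteration; returns True when done (stub)."""
    return False

try:
    while not breaker.execute(agent_step):
        pass
except MaxIterationsError:
    logging.warning("Agent halted: iteration limit reached")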
Framework-Specific Patterns
LangChain
from langchain.callbacks.base import BaseCallbackHandler

class SecurityCallback(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs):
        # Validate input before the LLM call, e.g. scan each prompt
        # for injection markers and raise to abort the run.
        for prompt in prompts:
            if "ignore previous instructions" in prompt.lower():
                raise ValueError("Suspected prompt injection")

    def on_tool_start(self, serialized, input_str, **kwargs):
        # Validate tool execution, e.g. check the tool name and its
        # arguments against an allowlist before the tool runs.
        pass
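Attaching the handler varies by langchain version; one common pattern passes it through the callbacks argument at construction time (the model class here is illustrative and assumes the langchain-openai package):

from langchain_openai import ChatOpenAI  # assumed installed; any model works

llm = ChatOpenAI(callbacks=[SecurityCallback()])
# The handler now fires on every LLM call routed through this model.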
CrewAI
from crewai import Agent, Task

# Secure agent configuration
agent = Agent(
    role="researcher",
    goal="Find information safely",
    backstory="Security-conscious researcher",
    allow_delegation=False,  # Prevent delegation attacks
    max_iter=5,              # Limit iterations
    verbose=True,            # Enable logging
)
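Running the agent inside a crew might then look like this sketch (the task wording is illustrative; recent CrewAI releases require expected_output on tasks):

from crewai import Crew

task = Task(
    description="Summarize recent prompt-injection research",
    expected_output="A short, sourced summary",
    agent=agent,
)

crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()  # max_iter=5 caps the agent's internal loop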
Architecture Recommendations
Recommended Architecture
┌────────────────────────────────────────────────────┐
│                     API Gateway                    │
│            (Rate Limiting, Authentication)         │
└─────────────────┬──────────────────────────────────┘
                  │
┌─────────────────▼──────────────────────────────────┐
│               Input Validation Layer               │
│      (Prompt sanitization, Content filtering)      │
└─────────────────┬──────────────────────────────────┘
                  │
┌─────────────────▼──────────────────────────────────┐
│                 Agent Orchestrator                 │
│     (Token budgets, Iteration limits, Logging)     │
└─────────────────┬──────────────────────────────────┘
                  │
┌─────────────────▼──────────────────────────────────┐
│                 Sandboxed Execution                │
│     (Isolated containers, Limited permissions)     │
└────────────────────────────────────────────────────┘
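In code, the orchestrator layer composes the earlier pieces. A rough sketch reusing budget, breaker, and run_in_sandbox from the patterns above (sanitize and estimate_tokens are illustrative stubs):

def sanitize(prompt: str) -> str:
    """Stub for the input-validation layer."""
    return prompt.replace("\x00", "")

def estimate_tokens(text: str) -> int:
    """Rough estimate; production code should use the model's tokenizer."""
    return len(text) // 4

def handle_request(prompt: str) -> str:
    sanitized = sanitize(prompt)                # input validation layer
    budget.consume(estimate_tokens(sanitized))  # token budget
    return breaker.execute(                     # iteration limit
        lambda: run_in_sandbox(["python", "-c", sanitized])
    )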
Next Steps
- Understand the Risks - Know what you're defending against
- Read Security Research - Deep dive into vulnerabilities
- CISO Guidance - Present to leadership