Secure Agent Design Patterns

Battle-tested architecture patterns for building secure AI agents. Defense-in-depth strategies, rate limiting, output validation, and sandboxing techniques.

Principles of Secure Agent Design

Building secure AI agents requires a defense-in-depth approach. No single security control is sufficient; instead, multiple layers of protection work together to minimize risk.

The Three Pillars

  1. Input Validation - Never trust user input or LLM output
  2. Least Privilege - Grant agents only the permissions they need
  3. Bounded Execution - Always limit time, tokens, and iterations

Pattern Library

Rate Limiting and Token Budgets

Prevent token bombing attacks by implementing strict budgets at the session level.

class BudgetExceededError(Exception):
    """Raised when a session exceeds its token or call budget."""

class TokenBudget:
    def __init__(self, max_tokens: int, max_calls: int):
        self.max_tokens = max_tokens
        self.max_calls = max_calls
        self.used_tokens = 0
        self.call_count = 0

    def check_budget(self, tokens: int) -> bool:
        # Fail before any tokens are spent.
        if self.used_tokens + tokens > self.max_tokens:
            raise BudgetExceededError("Token limit reached")
        if self.call_count >= self.max_calls:
            raise BudgetExceededError("Call limit reached")
        return True

    def consume(self, tokens: int):
        # Record usage only after the budget check passes.
        self.check_budget(tokens)
        self.used_tokens += tokens
        self.call_count += 1
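
For example, charging every model call against the session budget makes overruns fail fast (token counts here are illustrative):

budget = TokenBudget(max_tokens=50_000, max_calls=25)

try:
    budget.consume(tokens=1_200)  # prompt + completion tokens for one call
except BudgetExceededError:
    ...  # abort the session and report the failure to the caller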

Key Considerations:

  • Set per-session and per-user limits
  • Implement soft warnings before hard limits
  • Log all budget exhaustion events for auditing (a combined sketch follows this list)
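
A minimal sketch combining these considerations, layering hypothetical per-user limits on top of the per-session TokenBudget above; thresholds are illustrative:

import logging

logger = logging.getLogger("agent.budget")

SOFT_WARNING_RATIO = 0.8  # warn at 80% of the hard limit

class UserBudgets:
    """Per-user token budgets layered on top of per-session budgets."""

    def __init__(self, per_user_tokens: int = 100_000):
        self.per_user_tokens = per_user_tokens
        self.usage = {}  # user_id -> tokens used

    def consume(self, user_id: str, tokens: int):
        used = self.usage.get(user_id, 0) + tokens
        if used > self.per_user_tokens:
            # Log every exhaustion event before refusing the request.
            logger.warning("budget exhausted for user %s", user_id)
            raise BudgetExceededError("Per-user token limit reached")
        if used > SOFT_WARNING_RATIO * self.per_user_tokens:
            logger.warning("user %s at %d%% of budget", user_id,
                           100 * used // self.per_user_tokens)
        self.usage[user_id] = used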

Output Validation

Never execute or trust LLM output without validation.

import ast
from typing import Any

class SecurityError(Exception):
    """Raised when LLM-generated code fails validation."""

def safe_eval(code: str, allowed_functions: set) -> Any:
    """
    Parse and validate code before execution.
    Only allow whitelisted function calls.
    """
    tree = ast.parse(code, mode='eval')

    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Reject attribute calls (e.g. os.system) outright; only bare
            # names on the whitelist may be called.
            if not isinstance(node.func, ast.Name):
                raise SecurityError("Only simple function calls are allowed")
            if node.func.id not in allowed_functions:
                raise SecurityError(f"Function {node.func.id} not allowed")

    return eval(compile(tree, '<string>', 'eval'))
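
For example, a call outside the whitelist is rejected before anything executes:

result = safe_eval("max(len('abc'), 2)", allowed_functions={"max", "len"})  # returns 3
safe_eval("__import__('os')", allowed_functions={"max", "len"})             # raises SecurityError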

Validation Strategies:

  • Whitelist allowed operations (for structured output, see the sketch after this list)
  • Use AST parsing for code analysis
  • Implement content security policies
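
The same whitelist idea applies to structured output. A minimal sketch, assuming the agent proposes tool calls as JSON with hypothetical tool and args fields, and reusing the SecurityError defined above:

import json

ALLOWED_TOOLS = {"search", "summarize"}  # illustrative whitelist

def validate_tool_call(raw: str) -> dict:
    """Parse LLM output as JSON and enforce a tool whitelist."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SecurityError("Output is not valid JSON") from exc

    if not isinstance(payload, dict):
        raise SecurityError("Expected a JSON object")
    if payload.get("tool") not in ALLOWED_TOOLS:
        raise SecurityError(f"Tool {payload.get('tool')!r} not allowed")
    if not isinstance(payload.get("args"), dict):
        raise SecurityError("Tool arguments must be an object")
    return payload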

Sandboxing and Isolation

Execute untrusted code in isolated environments.

Docker-based Sandbox:

# docker-compose.sandbox.yml
services:
  agent-sandbox:
    image: agent-runtime:latest
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    read_only: true
    networks:
      - isolated
    mem_limit: 512m
    cpus: 0.5

networks:
  isolated:
    internal: true   # no route to the outside world

Best Practices:

  • No network access for code execution
  • Read-only file systems
  • CPU and memory limits
  • Short execution timeouts (see the launch sketch after this list)
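
The same restrictions can be applied when launching a one-off container from the orchestrator. A minimal sketch, assuming a hypothetical agent-runtime image and a script already written to a host path:

import subprocess

def run_in_sandbox(script_path: str, timeout_s: int = 10) -> str:
    """Run untrusted code in a locked-down, throwaway container."""
    result = subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",               # no network access
            "--read-only",                     # read-only root filesystem
            "--cap-drop", "ALL",
            "--security-opt", "no-new-privileges",
            "--memory", "512m", "--cpus", "0.5",
            "-v", f"{script_path}:/sandbox/script.py:ro",
            "agent-runtime:latest",
            "python", "/sandbox/script.py",
        ],
        capture_output=True,
        text=True,
        timeout=timeout_s,                     # raises TimeoutExpired on overrun
    )
    return result.stdout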

Iteration Limits and Circuit Breakers

Prevent runaway agent loops with explicit iteration limits; a circuit breaker halts execution and only resets after a cooldown period.

import time

class CircuitOpenError(Exception):
    """Raised while the breaker is open and calls are refused."""

class MaxIterationsError(Exception):
    """Raised when the iteration limit is exceeded."""

class CircuitBreaker:
    def __init__(self, max_iterations: int = 10, cooldown: int = 60):
        self.max_iterations = max_iterations
        self.cooldown = cooldown  # seconds before the breaker resets
        self.iteration_count = 0
        self.is_open = False
        self.opened_at = 0.0

    def execute(self, fn):
        # Re-close the breaker once the cooldown period has elapsed.
        if self.is_open:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise CircuitOpenError("Circuit breaker is open")
            self.is_open = False
            self.iteration_count = 0

        self.iteration_count += 1
        if self.iteration_count > self.max_iterations:
            self.is_open = True
            self.opened_at = time.monotonic()
            raise MaxIterationsError("Max iterations exceeded")

        return fn()
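
A usage sketch in an agent loop; agent_step is a hypothetical function that performs one plan/act cycle and returns True when the task is finished:

breaker = CircuitBreaker(max_iterations=5, cooldown=60)

while True:
    try:
        if breaker.execute(agent_step):
            break  # task finished
    except MaxIterationsError:
        break  # hard stop instead of looping forever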

Framework-Specific Patterns

LangChain

from langchain.callbacks.base import BaseCallbackHandler

class SecurityCallback(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs):
        # Inspect prompts before the LLM call; raise here to abort the run
        pass

    def on_tool_start(self, serialized, input_str, **kwargs):
        # Validate the tool name and its input before the tool executes
        pass
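
The handler can then be attached when the model or chain is constructed; import paths and class names vary by LangChain version, so treat this as a sketch:

from langchain.chat_models import ChatOpenAI  # location differs across versions

llm = ChatOpenAI(callbacks=[SecurityCallback()])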

CrewAI

from crewai import Agent

# Secure agent configuration
agent = Agent(
    role="researcher",
    goal="Find information safely",
    backstory="Security-conscious researcher",
    allow_delegation=False,  # Prevent delegation attacks
    max_iter=5,  # Limit iterations
    verbose=True  # Enable logging
)

Architecture Recommendations

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                   API Gateway                        β”‚
β”‚        (Rate Limiting, Authentication)               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Input Validation Layer                  β”‚
β”‚    (Prompt sanitization, Content filtering)          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Agent Orchestrator                      β”‚
β”‚   (Token budgets, Iteration limits, Logging)        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Sandboxed Execution                     β”‚
β”‚    (Isolated containers, Limited permissions)        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
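
A minimal sketch of how these layers compose in application code, reusing the classes above; sanitize_prompt, estimate_tokens, and call_llm are hypothetical placeholders for your own input filter, token counter, and model client:

def handle_request(user_id: str, prompt: str) -> str:
    # Input validation layer: sanitize before anything reaches the model.
    clean_prompt = sanitize_prompt(prompt)

    # Agent orchestrator: enforce budgets and iteration limits around the loop.
    budget = TokenBudget(max_tokens=50_000, max_calls=25)
    breaker = CircuitBreaker(max_iterations=5)

    def step() -> str:
        budget.consume(tokens=estimate_tokens(clean_prompt))
        return call_llm(clean_prompt)

    # Any code the model proposes is executed separately in the sandbox
    # (see run_in_sandbox above), never in this process.
    return breaker.execute(step)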

Next Steps