Secure Agent Design Patterns

Battle-tested architecture patterns for building secure AI agents. Defense-in-depth strategies, rate limiting, output validation, and sandboxing techniques.

Principles of Secure Agent Design

Building secure AI agents requires a defense-in-depth approach. No single security control is sufficient; instead, multiple layers of protection work together to minimize risk.

The Three Pillars

  1. Input Validation - Never trust user input or LLM output
  2. Least Privilege - Grant agents only the permissions they need
  3. Bounded Execution - Always limit time, tokens, and iterations

Pattern Library

Rate Limiting and Token Budgets

Prevent token bombing attacks by implementing strict budgets at the session level.

class BudgetExceededError(Exception):
    """Raised when a session exceeds its token or call budget."""

class TokenBudget:
    def __init__(self, max_tokens: int, max_calls: int):
        self.max_tokens = max_tokens
        self.max_calls = max_calls
        self.used_tokens = 0
        self.call_count = 0

    def check_budget(self, tokens: int) -> bool:
        # Fail before any tokens are spent.
        if self.used_tokens + tokens > self.max_tokens:
            raise BudgetExceededError("Token limit reached")
        if self.call_count >= self.max_calls:
            raise BudgetExceededError("Call limit reached")
        return True

    def consume(self, tokens: int):
        # Record usage only after the budget check passes.
        self.check_budget(tokens)
        self.used_tokens += tokens
        self.call_count += 1
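
For example, charging every model call against the session budget makes overruns fail fast (token counts here are illustrative):

budget = TokenBudget(max_tokens=50_000, max_calls=25)

try:
    budget.consume(tokens=1_200)  # prompt + completion tokens for one call
except BudgetExceededError:
    ...  # abort the session and report the failure to the caller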

Key Considerations:

  • Set per-session and per-user limits
  • Implement soft warnings before hard limits
  • Log all budget exhaustion events for auditing (a combined sketch follows this list)
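
A minimal sketch combining these considerations, layering hypothetical per-user limits on top of the per-session TokenBudget above; thresholds are illustrative:

import logging

logger = logging.getLogger("agent.budget")

SOFT_WARNING_RATIO = 0.8  # warn at 80% of the hard limit

class UserBudgets:
    """Per-user token budgets layered on top of per-session budgets."""

    def __init__(self, per_user_tokens: int = 100_000):
        self.per_user_tokens = per_user_tokens
        self.usage = {}  # user_id -> tokens used

    def consume(self, user_id: str, tokens: int):
        used = self.usage.get(user_id, 0) + tokens
        if used > self.per_user_tokens:
            # Log every exhaustion event before refusing the request.
            logger.warning("budget exhausted for user %s", user_id)
            raise BudgetExceededError("Per-user token limit reached")
        if used > SOFT_WARNING_RATIO * self.per_user_tokens:
            logger.warning("user %s at %d%% of budget", user_id,
                           100 * used // self.per_user_tokens)
        self.usage[user_id] = used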

Output Validation

Never execute or trust LLM output without validation.

import ast
from typing import Any

class SecurityError(Exception):
    """Raised when LLM-generated code fails validation."""

def safe_eval(code: str, allowed_functions: set) -> Any:
    """
    Parse and validate code before execution.
    Only allow whitelisted function calls.
    """
    tree = ast.parse(code, mode='eval')

    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Reject attribute calls (e.g. os.system) outright; only bare
            # names on the whitelist may be called.
            if not isinstance(node.func, ast.Name):
                raise SecurityError("Only simple function calls are allowed")
            if node.func.id not in allowed_functions:
                raise SecurityError(f"Function {node.func.id} not allowed")

    return eval(compile(tree, '<string>', 'eval'))
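
For example, a call outside the whitelist is rejected before anything executes:

result = safe_eval("max(len('abc'), 2)", allowed_functions={"max", "len"})  # returns 3
safe_eval("__import__('os')", allowed_functions={"max", "len"})             # raises SecurityError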

Validation Strategies:

  • Whitelist allowed operations (for structured output, see the sketch after this list)
  • Use AST parsing for code analysis
  • Implement content security policies
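
The same whitelist idea applies to structured output. A minimal sketch, assuming the agent proposes tool calls as JSON with hypothetical tool and args fields, and reusing the SecurityError defined above:

import json

ALLOWED_TOOLS = {"search", "summarize"}  # illustrative whitelist

def validate_tool_call(raw: str) -> dict:
    """Parse LLM output as JSON and enforce a tool whitelist."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SecurityError("Output is not valid JSON") from exc

    if not isinstance(payload, dict):
        raise SecurityError("Expected a JSON object")
    if payload.get("tool") not in ALLOWED_TOOLS:
        raise SecurityError(f"Tool {payload.get('tool')!r} not allowed")
    if not isinstance(payload.get("args"), dict):
        raise SecurityError("Tool arguments must be an object")
    return payload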

Sandboxing and Isolation

Execute untrusted code in isolated environments.

Docker-based Sandbox:

# docker-compose.sandbox.yml
services:
  agent-sandbox:
    image: agent-runtime:latest
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    read_only: true
    networks:
      - isolated
    mem_limit: 512m
    cpus: 0.5

networks:
  isolated:
    internal: true   # no route to the outside world

Best Practices:

  • No network access for code execution
  • Read-only file systems
  • CPU and memory limits
  • Short execution timeouts (see the launch sketch after this list)
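
The same restrictions can be applied when launching a one-off container from the orchestrator. A minimal sketch, assuming a hypothetical agent-runtime image and a script already written to a host path:

import subprocess

def run_in_sandbox(script_path: str, timeout_s: int = 10) -> str:
    """Run untrusted code in a locked-down, throwaway container."""
    result = subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",               # no network access
            "--read-only",                     # read-only root filesystem
            "--cap-drop", "ALL",
            "--security-opt", "no-new-privileges",
            "--memory", "512m", "--cpus", "0.5",
            "-v", f"{script_path}:/sandbox/script.py:ro",
            "agent-runtime:latest",
            "python", "/sandbox/script.py",
        ],
        capture_output=True,
        text=True,
        timeout=timeout_s,                     # raises TimeoutExpired on overrun
    )
    return result.stdout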

Iteration Limits and Circuit Breakers

Prevent runaway agent loops with explicit iteration limits; a circuit breaker halts execution and only resets after a cooldown period.

import time

class CircuitOpenError(Exception):
    """Raised while the breaker is open and calls are refused."""

class MaxIterationsError(Exception):
    """Raised when the iteration limit is exceeded."""

class CircuitBreaker:
    def __init__(self, max_iterations: int = 10, cooldown: int = 60):
        self.max_iterations = max_iterations
        self.cooldown = cooldown  # seconds before the breaker resets
        self.iteration_count = 0
        self.is_open = False
        self.opened_at = 0.0

    def execute(self, fn):
        # Re-close the breaker once the cooldown period has elapsed.
        if self.is_open:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise CircuitOpenError("Circuit breaker is open")
            self.is_open = False
            self.iteration_count = 0

        self.iteration_count += 1
        if self.iteration_count > self.max_iterations:
            self.is_open = True
            self.opened_at = time.monotonic()
            raise MaxIterationsError("Max iterations exceeded")

        return fn()
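
A usage sketch in an agent loop; agent_step is a hypothetical function that performs one plan/act cycle and returns True when the task is finished:

breaker = CircuitBreaker(max_iterations=5, cooldown=60)

while True:
    try:
        if breaker.execute(agent_step):
            break  # task finished
    except MaxIterationsError:
        break  # hard stop instead of looping forever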

Framework-Specific Patterns

LangChain

from langchain.callbacks.base import BaseCallbackHandler

class SecurityCallback(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs):
        # Inspect prompts before the LLM call; raise here to abort the run
        pass

    def on_tool_start(self, serialized, input_str, **kwargs):
        # Validate the tool name and its input before the tool executes
        pass
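
The handler can then be attached when the model or chain is constructed; import paths and class names vary by LangChain version, so treat this as a sketch:

from langchain.chat_models import ChatOpenAI  # location differs across versions

llm = ChatOpenAI(callbacks=[SecurityCallback()])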

CrewAI

from crewai import Agent

# Secure agent configuration
agent = Agent(
    role="researcher",
    goal="Find information safely",
    backstory="Security-conscious researcher",
    allow_delegation=False,  # Prevent delegation attacks
    max_iter=5,  # Limit iterations
    verbose=True  # Enable logging
)

Architecture Recommendations

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                   API Gateway                        β”‚
β”‚        (Rate Limiting, Authentication)               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Input Validation Layer                  β”‚
β”‚    (Prompt sanitization, Content filtering)          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Agent Orchestrator                      β”‚
β”‚   (Token budgets, Iteration limits, Logging)        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Sandboxed Execution                     β”‚
β”‚    (Isolated containers, Limited permissions)        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
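
A minimal sketch of how these layers compose in application code, reusing the classes above; sanitize_prompt, estimate_tokens, and call_llm are hypothetical placeholders for your own input filter, token counter, and model client:

def handle_request(user_id: str, prompt: str) -> str:
    # Input validation layer: sanitize before anything reaches the model.
    clean_prompt = sanitize_prompt(prompt)

    # Agent orchestrator: enforce budgets and iteration limits around the loop.
    budget = TokenBudget(max_tokens=50_000, max_calls=25)
    breaker = CircuitBreaker(max_iterations=5)

    def step() -> str:
        budget.consume(tokens=estimate_tokens(clean_prompt))
        return call_llm(clean_prompt)

    # Any code the model proposes is executed separately in the sandbox
    # (see run_in_sandbox above), never in this process.
    return breaker.execute(step)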

Next Steps