Capabilities System

Khaos uses a capabilities system to understand what your agent can do and automatically tailor evaluations to match. Capabilities drive pack generation, attack selection, and security tier targeting.

AgentCapability Enum

Khaos defines 13 canonical capabilities that describe the fundamental actions an agent can perform.

Capability        Description
TOOL_CALLING      Agent can invoke tools/functions via structured calls
FILE_SYSTEM       Agent can read/write files on disk
CODE_EXECUTION    Agent can execute code (shell commands, scripts)
HTTP              Agent makes HTTP requests to external services
WEB_SEARCH        Agent can search the web
DATABASE          Agent has database access
RAG               Agent uses retrieval-augmented generation
EMAIL             Agent can send or read email
MCP               Agent uses Model Context Protocol servers
MULTI_TURN        Agent supports multi-turn conversations
PII_ACCESS        Agent handles personally identifiable information
SECRETS_ACCESS    Agent has access to secrets or credentials
LLM               Agent delegates to other LLMs (sub-agents)
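
For orientation, the table above could be mirrored in Python as a string-valued enum. This is only a sketch: the class below is a local stand-in built from the table, not the definition shipped in khaos, and the real member values may differ.

```python
from enum import Enum


class AgentCapability(str, Enum):
    """Local stand-in mirroring the capability table (not the khaos class)."""

    TOOL_CALLING = "TOOL_CALLING"
    FILE_SYSTEM = "FILE_SYSTEM"
    CODE_EXECUTION = "CODE_EXECUTION"
    HTTP = "HTTP"
    WEB_SEARCH = "WEB_SEARCH"
    DATABASE = "DATABASE"
    RAG = "RAG"
    EMAIL = "EMAIL"
    MCP = "MCP"
    MULTI_TURN = "MULTI_TURN"
    PII_ACCESS = "PII_ACCESS"
    SECRETS_ACCESS = "SECRETS_ACCESS"
    LLM = "LLM"


# Look up a member by name, then list every capability name in table order.
print(AgentCapability["RAG"].value)
print([cap.name for cap in AgentCapability])
```

Subclassing str keeps members comparable with plain strings, which is convenient when capability names also appear in YAML packs.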

CapabilityProfile

A CapabilityProfile is a boolean snapshot of an agent's capabilities. Khaos builds this profile automatically during discovery, or you can define it manually.

Field             Type    Maps to Capability
llm               bool    LLM
http              bool    HTTP
web_fetch         bool    WEB_SEARCH
tool_calling      bool    TOOL_CALLING
mcp               bool    MCP
multi_turn        bool    MULTI_TURN
rag               bool    RAG
files             bool    FILE_SYSTEM
code_execution    bool    CODE_EXECUTION
db                bool    DATABASE
email             bool    EMAIL

Python
from khaos.capabilities import CapabilityProfile

# Build a profile manually
profile = CapabilityProfile(
    llm=True,
    http=True,
    tool_calling=True,
    mcp=False,
    multi_turn=True,
    rag=False,
    files=False,
    code_execution=False,
    db=False,
    email=False,
    web_fetch=False,
)

# Convert to a list of required capabilities
caps = profile.to_required_capabilities()
print(caps)  # ["LLM", "HTTP", "TOOL_CALLING", "MULTI_TURN"]
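
To make the field-to-capability mapping concrete, here is a minimal standalone sketch of how a boolean profile could be turned into a capability list. ProfileSketch and FIELD_TO_CAPABILITY are hypothetical names introduced for illustration; the mapping is copied from the table above, while the default values and the output ordering are assumptions rather than documented behavior of CapabilityProfile.

```python
from dataclasses import dataclass, fields

# Field name -> capability name, copied from the mapping table.
FIELD_TO_CAPABILITY = {
    "llm": "LLM",
    "http": "HTTP",
    "web_fetch": "WEB_SEARCH",
    "tool_calling": "TOOL_CALLING",
    "mcp": "MCP",
    "multi_turn": "MULTI_TURN",
    "rag": "RAG",
    "files": "FILE_SYSTEM",
    "code_execution": "CODE_EXECUTION",
    "db": "DATABASE",
    "email": "EMAIL",
}


@dataclass(frozen=True)
class ProfileSketch:
    """Hypothetical stand-in for CapabilityProfile; every flag defaults to False."""

    llm: bool = False
    http: bool = False
    web_fetch: bool = False
    tool_calling: bool = False
    mcp: bool = False
    multi_turn: bool = False
    rag: bool = False
    files: bool = False
    code_execution: bool = False
    db: bool = False
    email: bool = False

    def to_required_capabilities(self) -> list[str]:
        # Emit one capability per enabled flag, in field declaration order.
        return [
            FIELD_TO_CAPABILITY[f.name]
            for f in fields(self)
            if getattr(self, f.name)
        ]


sketch = ProfileSketch(llm=True, http=True, tool_calling=True, multi_turn=True)
print(sketch.to_required_capabilities())
# ['LLM', 'HTTP', 'TOOL_CALLING', 'MULTI_TURN']
```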

Capability Inference

Khaos can automatically infer capabilities from your agent's metadata, tool definitions, and observed behavior during discovery.

Python
from khaos.capabilities import (
    infer_capability_profile,
    infer_capabilities_from_tools,
    normalize_capability,
    validate_capability,
    get_capability_aliases,
)

# Infer from agent metadata and probe events
profile, confidence = infer_capability_profile(
    agent_capabilities=["tool_calling", "http"],
    agent_metadata={"framework": "langchain", "tools": ["web_search", "calculator"]},
    target_source="openai",
    probe_events=probe_events,  # events recorded during the discovery phase (not defined in this snippet)
)

# Infer from tool names alone
caps = infer_capabilities_from_tools(["web_search", "file_read", "sql_query"])
print(caps)  # ["WEB_SEARCH", "FILE_SYSTEM", "DATABASE"]

# Normalize capability strings
norm = normalize_capability("file-system")
print(norm)  # "FILE_SYSTEM"

# Validate a capability string
is_valid = validate_capability("TOOL_CALLING")
print(is_valid)  # True

# Get the full alias map (alias -> canonical capability name)
aliases = get_capability_aliases()
# {"tool_calling": "TOOL_CALLING", "tools": "TOOL_CALLING", ...}

Confidence scores: infer_capability_profile() returns both a profile and a confidence dictionary that maps each capability to a float from 0.0 to 1.0, indicating how confident the inference is.
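
One way to consume the confidence dictionary is to separate capabilities that were inferred with reasonable certainty from those worth confirming by hand. The sketch below is standalone and does not call khaos: the sample scores, the 0.5 threshold, the key format, and the helper name split_by_confidence are all illustrative assumptions.

```python
# Illustrative scores only; real values would come from infer_capability_profile().
confidence = {"TOOL_CALLING": 0.95, "HTTP": 0.80, "RAG": 0.35, "EMAIL": 0.10}


def split_by_confidence(scores: dict[str, float], threshold: float = 0.5):
    """Return (accepted, needs_review) capability names, each sorted by name."""
    accepted = sorted(name for name, score in scores.items() if score >= threshold)
    needs_review = sorted(name for name, score in scores.items() if score < threshold)
    return accepted, needs_review


accepted, needs_review = split_by_confidence(confidence)
print(accepted)      # ['HTTP', 'TOOL_CALLING']
print(needs_review)  # ['EMAIL', 'RAG']
```

Capabilities that land in the review list are natural candidates for an explicit declaration.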

Capability Bundles

Bundles group related capabilities into meaningful test categories. Khaos selects bundles based on the agent's profile to determine which security attack categories to run.

Python
from khaos.capabilities import (
    CapabilityBundle,
    select_bundles,
    security_attack_categories_for_bundles,
)

# Select bundles that match a capability profile
bundles = select_bundles(profile)
for bundle in bundles:
    print(f"{bundle.id}: {bundle.description}")

# Map bundles to security attack categories
bundle_ids = [b.id for b in bundles]
attack_categories = security_attack_categories_for_bundles(bundle_ids)
print(attack_categories)
# ["prompt_injection", "tool_abuse", "privilege_escalation", ...]
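
Conceptually, bundle selection can be thought of as a subset check followed by a flattening step. The sketch below illustrates that idea under stated assumptions: BundleSketch, the bundle ids, their required capabilities, and the path_traversal category are invented for illustration and do not describe the bundles Khaos ships.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleSketch:
    """Hypothetical bundle: an id, the capabilities it needs, its attack categories."""

    id: str
    requires: frozenset
    attack_categories: tuple


# Invented bundle definitions, for illustration only.
BUNDLES = [
    BundleSketch("core_llm", frozenset({"LLM"}), ("prompt_injection",)),
    BundleSketch(
        "tool_use",
        frozenset({"LLM", "TOOL_CALLING"}),
        ("tool_abuse", "privilege_escalation"),
    ),
    BundleSketch("file_ops", frozenset({"FILE_SYSTEM"}), ("path_traversal",)),
]


def select_matching(enabled: set) -> list:
    # A bundle applies when the agent has every capability it requires.
    return [b for b in BUNDLES if b.requires <= enabled]


def categories_for(bundles: list) -> list:
    # Flatten categories, dropping duplicates while keeping first-seen order.
    seen = {}
    for bundle in bundles:
        for category in bundle.attack_categories:
            seen.setdefault(category, None)
    return list(seen)


enabled = {"LLM", "HTTP", "TOOL_CALLING", "MULTI_TURN"}
matched = select_matching(enabled)
print([b.id for b in matched])  # ['core_llm', 'tool_use']
print(categories_for(matched))  # ['prompt_injection', 'tool_abuse', 'privilege_escalation']
```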

Tier System

Capabilities also determine which security tiers apply to your agent. The tier system maps capabilities to attack categories at the agent, tool, and model levels.

Python
from khaos.capabilities import (
    get_categories_by_tier,
    get_tiered_attack_categories,
    get_tier_for_category,
)

# Get attack categories for a tier
agent_cats = get_categories_by_tier("agent")
tool_cats = get_categories_by_tier("tool")
model_cats = get_categories_by_tier("model")

# Get the full tiered mapping
all_tiered = get_tiered_attack_categories()
# {"agent": [...], "tool": [...], "model": [...]}

# Look up which tier owns a category
tier = get_tier_for_category("tool_abuse")
print(tier)  # "tool"
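
The relationship between the tier-to-categories mapping and the per-category lookup can be pictured as a dictionary and its reverse index. In the sketch below only the three tier names and the placement of tool_abuse in the tool tier come from the example above; the other placements and the names TIERED and tier_for are assumptions made for illustration.

```python
# Assumed tier contents; only "tool_abuse" -> "tool" is taken from the example above.
TIERED = {
    "agent": ["privilege_escalation"],
    "tool": ["tool_abuse"],
    "model": ["prompt_injection"],
}

# Reverse index: category -> tier, built once from the forward mapping.
CATEGORY_TO_TIER = {
    category: tier
    for tier, categories in TIERED.items()
    for category in categories
}


def tier_for(category: str) -> str:
    """Return the owning tier, raising KeyError for unknown categories."""
    return CATEGORY_TO_TIER[category]


print(tier_for("tool_abuse"))  # tool
```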

How Khaos Uses Capabilities

The capabilities system drives the entire evaluation pipeline:

  1. Discovery: khaos discover probes your agent and calls infer_capability_profile() to build a profile
  2. Bundle Selection: select_bundles(profile) picks relevant capability bundles
  3. Attack Targeting: security_attack_categories_for_bundles() maps the selected bundles to the attack categories relevant to your agent
  4. Pack Generation: Smart pack generators use the profile to include only relevant test inputs and skip inapplicable ones
  5. Input Filtering: Pack inputs with required_capabilities are skipped if the agent lacks those capabilities

Override inference: If auto-inference misses a capability, you can declare it explicitly in your @khaosagent decorator or agent configuration file.

Usage in @khaosagent

When defining agents with the @khaosagent decorator, you can explicitly declare capabilities to supplement or override inference.

Python
from khaos import khaosagent

@khaosagent(
    name="my-agent",
    capabilities=["TOOL_CALLING", "HTTP", "MULTI_TURN", "RAG"],
)
async def my_agent(message: str) -> str:
    # Agent implementation goes here; this placeholder just echoes the input
    response = f"Received: {message}"
    return response

Capabilities declared in @khaosagent are merged with inferred capabilities during discovery. Explicit declarations always take precedence.
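
The merge rule can be sketched as an overlay: start from the inferred flags, then switch on everything that was declared. This is an assumption-laden illustration, not the Khaos implementation. The function name merge_capabilities and the dict-of-booleans representation are hypothetical, and whether a declaration can also switch a capability off is not covered here.

```python
def merge_capabilities(inferred: dict[str, bool], declared: list[str]) -> dict[str, bool]:
    """Overlay explicitly declared capabilities on top of inferred flags."""
    merged = dict(inferred)  # copy so the caller's dict is left untouched
    for name in declared:
        merged[name] = True  # an explicit declaration wins over inference
    return merged


inferred = {"TOOL_CALLING": True, "HTTP": True, "RAG": False}
declared = ["TOOL_CALLING", "HTTP", "MULTI_TURN", "RAG"]

merged = merge_capabilities(inferred, declared)
print(sorted(name for name, enabled in merged.items() if enabled))
# ['HTTP', 'MULTI_TURN', 'RAG', 'TOOL_CALLING']
```

Here RAG was inferred as absent but ends up enabled because it was declared, and MULTI_TURN is added even though inference never saw it.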

Usage in Packs

Pack inputs can specify required_capabilities to ensure they only run against agents that support them.

YAML
inputs:
  - id: file-write-test
    text: "Create a file called test.txt with the content 'hello'"
    required_capabilities:
      - FILE_SYSTEM
      - CODE_EXECUTION
    goal:
      contains: "test.txt"

  - id: web-search-test
    text: "Search the web for the latest Python release"
    required_capabilities:
      - WEB_SEARCH
    goal:
      contains_any:
        - "Python"
        - "release"
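
The skip rule for pack inputs amounts to a subset test against the agent's capabilities. The sketch below represents the two inputs above as plain dicts and adds a third, hypothetical input without requirements. It assumes that every listed capability must be present and that inputs with no required_capabilities always run; runnable_inputs is an illustrative helper, not a khaos function.

```python
# Pack inputs as plain dicts, mirroring the YAML above.
pack_inputs = [
    {"id": "file-write-test", "required_capabilities": ["FILE_SYSTEM", "CODE_EXECUTION"]},
    {"id": "web-search-test", "required_capabilities": ["WEB_SEARCH"]},
    {"id": "greeting-test"},  # hypothetical input with no requirements
]


def runnable_inputs(inputs: list[dict], agent_caps: set[str]) -> list[str]:
    """Return ids of inputs whose required capabilities are all present."""
    kept = []
    for item in inputs:
        required = set(item.get("required_capabilities", []))
        if required <= agent_caps:  # an empty requirement set always passes
            kept.append(item["id"])
    return kept


print(runnable_inputs(pack_inputs, {"LLM", "WEB_SEARCH"}))
# ['web-search-test', 'greeting-test']
```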

Related Documentation