Capabilities System
Khaos uses a capabilities system to understand what your agent can do and automatically tailor evaluations to match. Capabilities drive pack generation, attack selection, and security tier targeting.
AgentCapability Enum
Khaos defines 13 canonical capabilities that describe the fundamental actions an agent can perform.
| Capability | Description |
|---|---|
| TOOL_CALLING | Agent can invoke tools/functions via structured calls |
| FILE_SYSTEM | Agent can read/write files on disk |
| CODE_EXECUTION | Agent can execute code (shell commands, scripts) |
| HTTP | Agent makes HTTP requests to external services |
| WEB_SEARCH | Agent can search the web |
| DATABASE | Agent has database access |
| RAG | Agent uses retrieval-augmented generation |
| EMAIL | Agent can send or read email |
| MCP | Agent uses Model Context Protocol servers |
| MULTI_TURN | Agent supports multi-turn conversations |
| PII_ACCESS | Agent handles personally identifiable information |
| SECRETS_ACCESS | Agent has access to secrets or credentials |
| LLM | Agent delegates to other LLMs (sub-agents) |
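For orientation, the names in the table could be modelled as a string-valued enum along the following lines. This is only a sketch: the class name CapabilitySketch and the choice of string values are assumptions made for illustration, not the definition Khaos ships.

```python
from enum import Enum


class CapabilitySketch(str, Enum):
    """Hypothetical stand-in for the capability enum; values are assumed."""

    TOOL_CALLING = "TOOL_CALLING"
    FILE_SYSTEM = "FILE_SYSTEM"
    CODE_EXECUTION = "CODE_EXECUTION"
    HTTP = "HTTP"
    WEB_SEARCH = "WEB_SEARCH"
    DATABASE = "DATABASE"
    RAG = "RAG"
    EMAIL = "EMAIL"
    MCP = "MCP"
    MULTI_TURN = "MULTI_TURN"
    PII_ACCESS = "PII_ACCESS"
    SECRETS_ACCESS = "SECRETS_ACCESS"
    LLM = "LLM"


# Look a member up by its string value, then list every name in table order.
http = CapabilitySketch("HTTP")
all_names = [member.name for member in CapabilitySketch]
```

A string-valued enum keeps the members comparable with the plain strings used in pack files and decorator arguments, which is why it is used in this sketch.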
CapabilityProfile
A CapabilityProfile is a boolean snapshot of an agent's capabilities. Khaos builds this profile automatically during discovery, or you can define it manually.
| Field | Type | Maps to Capability |
|---|---|---|
| llm | bool | LLM |
| http | bool | HTTP |
| web_fetch | bool | WEB_SEARCH |
| tool_calling | bool | TOOL_CALLING |
| mcp | bool | MCP |
| multi_turn | bool | MULTI_TURN |
| rag | bool | RAG |
| files | bool | FILE_SYSTEM |
| code_execution | bool | CODE_EXECUTION |
| db | bool | DATABASE |
| email | bool | EMAIL |
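The field-to-capability mapping in the table can be pictured as a small dataclass. The sketch below is a stand-in, not the Khaos class: the name ProfileSketch, the all-False defaults, and the email to EMAIL mapping are assumptions made for illustration.

```python
from dataclasses import dataclass, fields

# Field name -> canonical capability, copied from the table above.
# The "email" entry is assumed to map to EMAIL.
FIELD_TO_CAPABILITY = {
    "llm": "LLM",
    "http": "HTTP",
    "web_fetch": "WEB_SEARCH",
    "tool_calling": "TOOL_CALLING",
    "mcp": "MCP",
    "multi_turn": "MULTI_TURN",
    "rag": "RAG",
    "files": "FILE_SYSTEM",
    "code_execution": "CODE_EXECUTION",
    "db": "DATABASE",
    "email": "EMAIL",
}


@dataclass(frozen=True)
class ProfileSketch:
    """Hypothetical boolean snapshot of an agent's capabilities."""

    llm: bool = False
    http: bool = False
    web_fetch: bool = False
    tool_calling: bool = False
    mcp: bool = False
    multi_turn: bool = False
    rag: bool = False
    files: bool = False
    code_execution: bool = False
    db: bool = False
    email: bool = False

    def to_required_capabilities(self):
        """Return the capability names for every field set to True, in field order."""
        return [
            FIELD_TO_CAPABILITY[f.name]
            for f in fields(self)
            if getattr(self, f.name)
        ]


sketch = ProfileSketch(llm=True, http=True, tool_calling=True, multi_turn=True)
print(sketch.to_required_capabilities())
# ['LLM', 'HTTP', 'TOOL_CALLING', 'MULTI_TURN']
```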
```python
from khaos.capabilities import CapabilityProfile

# Build a profile manually
profile = CapabilityProfile(
    llm=True,
    http=True,
    tool_calling=True,
    mcp=False,
    multi_turn=True,
    rag=False,
    files=False,
    code_execution=False,
    db=False,
    email=False,
    web_fetch=False,
)

# Convert to a list of required capabilities
caps = profile.to_required_capabilities()
print(caps)  # ["LLM", "HTTP", "TOOL_CALLING", "MULTI_TURN"]
```

Capability Inference
Khaos can automatically infer capabilities from your agent's metadata, tool definitions, and observed behavior during discovery.
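To give a feel for what name normalization and tool-based inference might involve, here is a self-contained sketch. It does not reproduce Khaos's logic: the alias table, the substring hints, and the function names normalize and infer_from_tools are all hypothetical.

```python
import re

CANONICAL = {
    "TOOL_CALLING", "FILE_SYSTEM", "CODE_EXECUTION", "HTTP", "WEB_SEARCH",
    "DATABASE", "RAG", "EMAIL", "MCP", "MULTI_TURN", "PII_ACCESS",
    "SECRETS_ACCESS", "LLM",
}

# Hypothetical aliases; the real alias table may differ.
ALIASES = {"tools": "TOOL_CALLING", "files": "FILE_SYSTEM", "db": "DATABASE"}

# Hypothetical substring hints used to guess a capability from a tool name.
TOOL_HINTS = [("search", "WEB_SEARCH"), ("file", "FILE_SYSTEM"), ("sql", "DATABASE")]


def normalize(name):
    """Map a loosely formatted capability string to its canonical name."""
    key = re.sub(r"[\s\-]+", "_", name.strip()).upper()
    key = ALIASES.get(key.lower(), key)
    if key not in CANONICAL:
        raise ValueError(f"unknown capability: {name!r}")
    return key


def infer_from_tools(tool_names):
    """Guess capabilities from tool names, keeping first-seen order."""
    found = []
    for tool in tool_names:
        for hint, capability in TOOL_HINTS:
            if hint in tool.lower() and capability not in found:
                found.append(capability)
    return found


print(normalize("file-system"))  # FILE_SYSTEM
print(infer_from_tools(["web_search", "file_read", "sql_query"]))
# ['WEB_SEARCH', 'FILE_SYSTEM', 'DATABASE']
```

Substring matching is deliberately crude here; a real implementation would presumably combine several signals, which is why the library reports a confidence score alongside the profile.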
```python
from khaos.capabilities import (
    infer_capability_profile,
    infer_capabilities_from_tools,
    normalize_capability,
    validate_capability,
    get_capability_aliases,
)

# Infer from agent metadata and probe events.
# probe_events is assumed to hold the events collected during the discovery phase.
profile, confidence = infer_capability_profile(
    agent_capabilities=["tool_calling", "http"],
    agent_metadata={"framework": "langchain", "tools": ["web_search", "calculator"]},
    target_source="openai",
    probe_events=probe_events,  # from discovery phase
)

# Infer from tool names alone
caps = infer_capabilities_from_tools(["web_search", "file_read", "sql_query"])
print(caps)  # ["WEB_SEARCH", "FILE_SYSTEM", "DATABASE"]

# Normalize capability strings
norm = normalize_capability("file-system")
print(norm)  # "FILE_SYSTEM"

# Validate a capability string
is_valid = validate_capability("TOOL_CALLING")
print(is_valid)  # True

# Get the mapping from every alias to its canonical capability
aliases = get_capability_aliases()
# {"tool_calling": "TOOL_CALLING", "tools": "TOOL_CALLING", ...}
```

The infer_capability_profile() function returns both a profile and a confidence dictionary that maps each capability to a float (0.0-1.0) indicating how confident the inference is.

Capability Bundles
Bundles group related capabilities into meaningful test categories. Khaos selects bundles based on the agent's profile to determine which security attack categories to run.
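The selection step can be thought of as a subset test: a bundle applies when the agent has every capability the bundle needs. The sketch below illustrates that idea only. The bundle ids, descriptions, required capabilities, and category assignments are invented for the example, and BundleSketch is not the Khaos CapabilityBundle class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleSketch:
    """Hypothetical bundle: capabilities it needs and attack categories it enables."""

    id: str
    description: str
    requires: frozenset
    attack_categories: tuple


# Illustrative bundles only; the real bundle catalogue is not shown in this page.
BUNDLES = [
    BundleSketch("core", "Any LLM-backed agent", frozenset({"LLM"}), ("prompt_injection",)),
    BundleSketch("tools", "Agents that call tools", frozenset({"TOOL_CALLING"}),
                 ("tool_abuse", "privilege_escalation")),
    BundleSketch("files", "Agents with disk access", frozenset({"FILE_SYSTEM"}),
                 ("privilege_escalation",)),
]


def select(capabilities):
    """Keep bundles whose required capabilities are all present."""
    present = set(capabilities)
    return [bundle for bundle in BUNDLES if bundle.requires <= present]


def categories_for(bundles):
    """Collect attack categories across bundles, de-duplicated in first-seen order."""
    seen = []
    for bundle in bundles:
        for category in bundle.attack_categories:
            if category not in seen:
                seen.append(category)
    return seen


chosen = select(["LLM", "HTTP", "TOOL_CALLING", "MULTI_TURN"])
print([bundle.id for bundle in chosen])  # ['core', 'tools']
print(categories_for(chosen))
# ['prompt_injection', 'tool_abuse', 'privilege_escalation']
```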
```python
from khaos.capabilities import (
    CapabilityBundle,
    select_bundles,
    security_attack_categories_for_bundles,
)

# Select bundles that match a capability profile
bundles = select_bundles(profile)
for bundle in bundles:
    print(f"{bundle.id}: {bundle.description}")

# Map bundles to security attack categories
bundle_ids = [b.id for b in bundles]
attack_categories = security_attack_categories_for_bundles(bundle_ids)
print(attack_categories)
# ["prompt_injection", "tool_abuse", "privilege_escalation", ...]
```

Tier System
Capabilities also determine which security tiers apply to your agent. The tier system maps capabilities to attack categories at the agent, tool, and model levels.
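Conceptually, the tier mapping is a dictionary from tier name to attack categories, with a reverse lookup from category to tier. The sketch below shows that shape. Only the tool_abuse to tool pairing comes from this page; the other assignments and the helper name tier_for are placeholders.

```python
# Tier -> attack categories. Only "tool_abuse" under "tool" is taken from this
# page; the "agent" and "model" entries are assumed placeholders.
TIERED = {
    "agent": ["privilege_escalation"],
    "tool": ["tool_abuse"],
    "model": ["prompt_injection"],
}


def tier_for(category):
    """Return the tier that owns a category, or None if it is unknown."""
    for tier, categories in TIERED.items():
        if category in categories:
            return tier
    return None


print(tier_for("tool_abuse"))  # tool
print(tier_for("not_a_category"))  # None
```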
```python
from khaos.capabilities import (
    get_categories_by_tier,
    get_tiered_attack_categories,
    get_tier_for_category,
)

# Get attack categories for a tier
agent_cats = get_categories_by_tier("agent")
tool_cats = get_categories_by_tier("tool")
model_cats = get_categories_by_tier("model")

# Get the full tiered mapping
all_tiered = get_tiered_attack_categories()
# {"agent": [...], "tool": [...], "model": [...]}

# Look up which tier owns a category
tier = get_tier_for_category("tool_abuse")
print(tier)  # "tool"
```

How Khaos Uses Capabilities
The capabilities system drives the entire evaluation pipeline:
- Discovery: khaos discover probes your agent and calls infer_capability_profile() to build a profile
- Bundle Selection: select_bundles(profile) picks relevant capability bundles
- Attack Targeting: security_attack_categories_for_bundles() maps bundles to the specific attack categories your agent is vulnerable to
- Pack Generation: Smart pack generators use the profile to include only relevant test inputs and skip inapplicable ones
- Input Filtering: Pack inputs with required_capabilities are skipped if the agent lacks those capabilities
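The last step, input filtering, can be sketched in a few lines. This is an illustration under two assumptions: pack inputs are available as plain dictionaries, and an input runs only when the agent has all of its required capabilities. The helper name runnable_inputs is hypothetical.

```python
def runnable_inputs(inputs, agent_capabilities):
    """Keep inputs whose required capabilities are all present on the agent.

    Inputs without a required_capabilities key are treated as always runnable
    (an assumption of this sketch).
    """
    present = set(agent_capabilities)
    return [
        item for item in inputs
        if set(item.get("required_capabilities", [])) <= present
    ]


pack_inputs = [
    {"id": "file-write-test", "required_capabilities": ["FILE_SYSTEM", "CODE_EXECUTION"]},
    {"id": "web-search-test", "required_capabilities": ["WEB_SEARCH"]},
    {"id": "greeting-test"},
]

kept = runnable_inputs(pack_inputs, ["LLM", "WEB_SEARCH", "FILE_SYSTEM"])
print([item["id"] for item in kept])  # ['web-search-test', 'greeting-test']
```

Here file-write-test is skipped because the agent has FILE_SYSTEM but not CODE_EXECUTION.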
You can also declare capabilities explicitly, using the @khaosagent decorator or the agent configuration file.

Usage in @khaosagent
When defining agents with the @khaosagent decorator, you can explicitly declare capabilities to supplement or override inference.
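One plausible reading of "supplement or override" is a union in which declared capabilities are always kept and inferred ones are added only when the inference is confident enough. The sketch below illustrates that reading. The function name merge_capabilities and the 0.5 threshold are assumptions, not documented Khaos behavior.

```python
def merge_capabilities(declared, inferred_confidence, threshold=0.5):
    """Combine declared capabilities with confidently inferred ones.

    Declared capabilities are kept unconditionally, even when inference gives
    them a low score. The threshold value is an assumption of this sketch.
    """
    merged = list(dict.fromkeys(declared))  # de-duplicate, keep declared order
    for capability, score in inferred_confidence.items():
        if score >= threshold and capability not in merged:
            merged.append(capability)
    return merged


declared = ["TOOL_CALLING", "HTTP", "MULTI_TURN", "RAG"]
inferred = {"HTTP": 0.9, "RAG": 0.1, "WEB_SEARCH": 0.8, "EMAIL": 0.2}

print(merge_capabilities(declared, inferred))
# ['TOOL_CALLING', 'HTTP', 'MULTI_TURN', 'RAG', 'WEB_SEARCH']
```

RAG survives despite its low inferred score because it was declared, while EMAIL is dropped because it was neither declared nor confidently inferred.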
```python
from khaos import khaosagent


@khaosagent(
    name="my-agent",
    capabilities=["TOOL_CALLING", "HTTP", "MULTI_TURN", "RAG"],
)
async def my_agent(message: str) -> str:
    # Agent implementation goes here; this placeholder just echoes the input
    response = f"Received: {message}"
    return response
```

Capabilities declared in @khaosagent are merged with inferred capabilities during discovery. Explicit declarations always take precedence.
Usage in Packs
Pack inputs can specify required_capabilities to ensure they only run against agents that support them.
```yaml
inputs:
  - id: file-write-test
    text: "Create a file called test.txt with the content 'hello'"
    required_capabilities:
      - FILE_SYSTEM
      - CODE_EXECUTION
    goal:
      contains: "test.txt"
  - id: web-search-test
    text: "Search the web for the latest Python release"
    required_capabilities:
      - WEB_SEARCH
    goal:
      contains_any:
        - "Python"
        - "release"
```

Related Documentation
- Agent Configuration - Declaring capabilities on agents
- Attack Registry - How capabilities map to attacks
- Evaluation Packs - Using capabilities in pack inputs
- Security Testing - Tier-based security testing