Khaos captures every dimension of agent behavior—cost, latency, tool usage, resilience, security, and outputs—and shows you exactly what changed.
Agent Behavior Intelligence. Framework optional.
Works with your stack
Test at the I/O boundary — where reliability actually matters.
The Impact Report
The Problem
The Four Lenses
How does it behave?
How does it handle failure?
Can it be exploited?
Did output quality change?
Quick Start
# Install Khaos pip install khaos # Discover decorated agents khaos discover # Run a pack by agent name (sync optional) khaos run my-agent --eval quickstart --sync
Why Khaos
Active fault injection for resilience testing. No other tool does this.
See what changed between versions. Not just single-run traces.
Shareable artifacts for PRs and team reviews. Ship with confidence.
OpenAI, Anthropic, Gemini, Mistral, Cohere, LangGraph, CrewAI—test any agent with the same scenarios.