Khaos reveals security vulnerabilities, resilience failures, and behavioral drift in your AI agents—with automated comparison showing exactly what each change introduced.
Agent Behavior Intelligence. Framework optional.
Works with your stack
Test at the I/O boundary — where reliability actually matters.
The Impact Report

The Problem
The Four Lenses
How does it behave?
How does it handle failure?
Can it be exploited?
Did output quality change?
Security Testing
Browse every test case — baseline, resilience, and security — with the full conversation trace and verdict.

Quick Start
# Install Khaos pip install khaos-agent # Discover decorated agents khaos discover # Run a pack by agent name (sync optional) khaos run my-agent --eval quickstart --sync

Why Khaos
Active fault injection for resilience testing. No other tool does this.
See what changed between versions. Not just single-run traces.
Shareable artifacts for PRs and team reviews. Ship with confidence.
OpenAI, Anthropic, Gemini, Mistral, Cohere, LangGraph, CrewAI—test any agent with the same scenarios.
Know exactly what your agent change introduced—security gaps, resilience regressions, and behavioral drift—before you ship.
Get Started Free