Quickstart

Get started with Khaos in under 5 minutes. Test your AI agent for security vulnerabilities, resilience issues, and behavioral regressions with a single @khaosagent decorator.

TL;DR - 30-Second Start

Terminal

# 1. Install
python3 -m pip install khaos-agent

# 2. Add decorator to your agent (see example below)

# 3. Discover and run
khaos discover
khaos run my-agent

That's it. Khaos automatically captures LLM telemetry, runs security tests, and evaluates resilience.

What is Khaos?

Khaos is a multi-dimensional evaluation platform for AI agents. It answers the question: "What did the change to my agent actually do?"

With a single command, Khaos evaluates your agent across four dimensions:

Structural - Cost, latency, token usage, tool patterns
Resilience - How your agent handles failures (chaos engineering)
Security - Vulnerability to prompt injection and data leakage
Functional - Output quality comparison between versions

Prerequisites

Python 3.9+ - Khaos requires Python 3.9 or higher
pip or uv - Package manager for installation
LLM API Key - API key for your LLM provider (OpenAI, Anthropic, etc.)

1. Install Khaos

Terminal

python3 -m pip install khaos-agent

# Or with uv (recommended for faster installs)
uv pip install khaos-agent

Verify the installation:

Terminal

khaos --version
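If you would rather check the prerequisites from Python, the following is a minimal sketch using only the standard library. It assumes the distribution name khaos-agent and the khaos executable shown above; the preflight function is a hypothetical helper, not part of Khaos.

```python
import shutil
import sys
from importlib import metadata


def preflight(dist_name="khaos-agent", cli_name="khaos"):
    """Report whether the quickstart prerequisites appear to be met.

    `preflight` is an illustrative helper, not a Khaos API.
    """
    try:
        installed_version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed_version = None

    return {
        # Khaos requires Python 3.9 or higher.
        "python_ok": sys.version_info >= (3, 9),
        # Version string of the installed distribution, or None if absent.
        "package_version": installed_version,
        # True if the CLI executable can be found on PATH.
        "cli_on_path": shutil.which(cli_name) is not None,
    }


if __name__ == "__main__":
    for check, value in preflight().items():
        print(f"{check}: {value}")
```

A package_version of None or a cli_on_path of False suggests the install step did not land in the active environment.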

2. Add the @khaosagent Decorator

Khaos runs your agent through a decorated handler (no protocol plumbing required).

Python

from khaos import khaosagent

@khaosagent(name="my-agent", version="1.0.0")
def handle(message):
    # Read the input text, tolerating a missing or null "payload".
    prompt = (message.get("payload") or {}).get("text", "")
    # Call your framework/LLM here and return {"text": "..."}.
    return {"text": f"Hello! You said: {prompt}"}

Need more options?

See @khaosagent Decorator for all parameters, async handlers, and multi-agent patterns.

This is the only required integration step. Your agent logic stays the same; Khaos handles runtime instrumentation.
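Because the handler is an ordinary function of a message dict, its logic can be exercised in isolation. The sketch below repeats the payload handling from the example above as a plain function, without the decorator, so it runs without Khaos installed. The reply_to name and the sample messages are illustrative, and the message shape is inferred from that example rather than from a documented schema.

```python
def reply_to(message):
    """Same body as the quickstart handler, minus the @khaosagent decorator."""
    prompt = (message.get("payload") or {}).get("text", "")
    return {"text": f"Hello! You said: {prompt}"}


# A well-formed message, using the shape assumed from the example above.
print(reply_to({"payload": {"text": "ping"}}))

# A missing or null payload falls back to an empty prompt instead of raising.
print(reply_to({}))
print(reply_to({"payload": None}))
```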

3. Discover Your Agent

Terminal

khaos discover

Multiple agents in one file?

Give each agent a unique @khaosagent(name=...) and run by name.
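As a rough sketch of that layout, the file below defines two handlers with distinct names. Only the name and version parameters come from the example above; the agent names, handler names, and reply text are made up. The try/except fallback is an assumption added so the sketch can run where Khaos is not installed, and is not needed in a real project.

```python
try:
    from khaos import khaosagent
except ImportError:
    # Stand-in used only when Khaos is not installed: returns the handler unchanged.
    def khaosagent(**_options):
        def register(handler):
            return handler
        return register


def _text(message):
    """Extract the input text, tolerating a missing or null payload."""
    return (message.get("payload") or {}).get("text", "")


# Hypothetical agent names; each must be unique within the file.
@khaosagent(name="support-agent", version="1.0.0")
def handle_support(message):
    return {"text": f"Support: {_text(message)}"}


@khaosagent(name="billing-agent", version="1.0.0")
def handle_billing(message):
    return {"text": f"Billing: {_text(message)}"}
```

With a file like this, each agent would be run by its own name, for example khaos run support-agent.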

4. Run an Evaluation

Run by agent name (recommended):

Terminal

khaos run my-agent --eval quickstart --sync

The quickstart pack includes baseline, resilience, and security tests.

5. Choose an Evaluation

Khaos provides four built-in evaluations for different use cases:

Terminal

# Quick baseline observation (~1 min)
khaos run <agent-name> --eval baseline

# Default: balanced evaluation (~2 min)
khaos run <agent-name> --eval quickstart

# Comprehensive evaluation (~10-15 min)
khaos run <agent-name> --eval full-eval

# Security-focused testing (~5-8 min)
khaos run <agent-name> --eval security

See Evaluations for details on what each eval tests.
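When scripting runs, it can help to validate the evaluation name before invoking the CLI. The following is a minimal sketch, assuming the four eval names listed above are the complete built-in set; build_run_command is a hypothetical helper, not part of Khaos.

```python
import shlex

# The four built-in evaluations named in this section.
BUILTIN_EVALS = ("baseline", "quickstart", "full-eval", "security")


def build_run_command(agent_name, eval_name="quickstart"):
    """Return the argv list for `khaos run <agent-name> --eval <eval>`."""
    if eval_name not in BUILTIN_EVALS:
        raise ValueError(
            f"unknown eval {eval_name!r}; expected one of {', '.join(BUILTIN_EVALS)}"
        )
    return ["khaos", "run", agent_name, "--eval", eval_name]


if __name__ == "__main__":
    # Print a shell-safe command line; pass the list to subprocess.run to execute it.
    print(shlex.join(build_run_command("my-agent", "security")))
```

Rejecting an unknown name up front turns a typo into an immediate error instead of a failed CLI call.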

6. Understand Your Results

Khaos shows real-time progress during evaluation:

TEXT

Running eval: quickstart v1.0

 ⠹ Baseline  4/6 (67%)
     ✓ math_addition 1450ms
     ✓ instruction_follow 890ms
     ✓ knowledge_capital 1200ms
     ✓ text_uppercase 650ms

   Resilience  waiting...
   Security    waiting...

After completion, you get clear pass/fail results:

TEXT

✓ Baseline: 6/6 passed
✓ Resilience: 5/6 passed
! Security: 43/50 defended
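To turn those counts into rates, divide the passed count by the total. The sketch below parses summary lines in the shape shown above and computes the percentages. It assumes the terminal output matches this sample exactly, which is not a documented stable format; parse_summary is an illustrative helper, not a Khaos API.

```python
import re

# Matches lines such as "! Security: 43/50 defended" (format assumed from the sample).
SUMMARY_LINE = re.compile(r"^\S+\s+(?P<name>\w+):\s+(?P<ok>\d+)/(?P<total>\d+)\s+\w+$")


def parse_summary(text):
    """Map each dimension name to (passed, total, percent)."""
    results = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            ok, total = int(match["ok"]), int(match["total"])
            percent = round(100 * ok / total, 1) if total else 0.0
            results[match["name"]] = (ok, total, percent)
    return results


sample = """\
✓ Baseline: 6/6 passed
✓ Resilience: 5/6 passed
! Security: 43/50 defended
"""

for name, (ok, total, pct) in parse_summary(sample).items():
    print(f"{name}: {ok}/{total} = {pct}%")
```

For the sample above, the security line works out to 43/50, or 86%, of probes defended.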

When issues are found, Khaos provides actionable explanations:

TEXT

What Failed

Security Vulnerabilities:
  🟡 MEDIUM Prompt Injection (3 instances)

Attack Types Agent is Vulnerable To:
  • Prompt Injection
    → Attacker can inject malicious instructions via user input

Recommended Actions:
  1. Review Security Findings
     → 3 potential vulnerabilities found
     → Consider adding guardrails for sensitive operations

Security is ON by default

Khaos runs security probes by default. Pass --no-security to skip security testing if needed.

7. Add to CI/CD

Khaos integrates directly into any CI/CD pipeline. Set three environment variables and run a single command:

Terminal

# Set credentials (add these as CI secrets)
export KHAOS_API_TOKEN=your-project-token
export KHAOS_PROJECT_SLUG=owner/project
export KHAOS_API_URL=https://api.khaos.exordex.com

# Run evaluation - results sync to the dashboard automatically
khaos ci my-agent --eval security --sync

The command exits with code 0 on success and 1 on failure, so a failing evaluation fails the pipeline step. View results and generate impact reports in the dashboard.
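For pipelines that wrap the CLI in a script, one pattern is to check the three variables, run the command, and propagate its exit code. The following is a sketch under those assumptions; missing_env and run_gate are hypothetical helpers, and the exit code 2 for a missing variable is this sketch's own choice rather than Khaos behavior.

```python
import os
import subprocess
import sys

# The three credentials the CI command above expects.
REQUIRED_VARS = ("KHAOS_API_TOKEN", "KHAOS_PROJECT_SLUG", "KHAOS_API_URL")


def missing_env(env):
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def run_gate(command, env=None):
    """Run `command` and return its exit code, or 2 if credentials are missing."""
    env = os.environ if env is None else env
    missing = missing_env(env)
    if missing:
        print(f"missing environment variables: {', '.join(missing)}", file=sys.stderr)
        return 2  # chosen by this sketch, not a Khaos exit code
    # Per the text above: 0 means success, 1 means failure.
    return subprocess.run(command, env=dict(env)).returncode


# In a CI script, propagate the result to the pipeline:
# sys.exit(run_gate(["khaos", "ci", "my-agent", "--eval", "security", "--sync"]))
```

Checking the variables first gives a clearer error than a failed upload midway through a run.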

See CI/CD Integration for GitHub Actions and GitLab CI examples.

8. Cloud Sync

Sync your evaluation results to the Khaos dashboard for historical tracking and team visibility:

Terminal

# Authenticate with Khaos cloud
khaos login

# Run with automatic cloud sync
khaos run my-agent --sync

Info

Cloud sync is local-first: runs are stored locally and uploaded only when you explicitly sync, for example by passing --sync.

Next Steps

Now that you've run your first evaluation, explore these topics to get the most out of Khaos:

Core Concepts

@khaosagent Decorator - All decorator options, async handlers, and multi-agent setups
Evaluations - Choose the right evaluation for your needs
Metrics - Understanding scores and comparing runs

Testing & Security

Interactive Playground - Debug agents in real-time with live fault injection
Security Testing - OWASP-aligned vulnerability detection

Integration

Framework Support - LangChain, CrewAI, OpenAI, Anthropic, and more
CI/CD Integration - GitHub Actions and GitLab CI setup
Cloud Sync - Team collaboration and historical tracking

Reference

CLI Reference - Complete command documentation
Troubleshooting - Common issues and solutions
