Interactive Playground

The Khaos Playground is a ChatGPT-like interface for debugging your agent in real-time. Inject faults, run security attacks, share sessions with your team, and export sessions for CI/CD automation.

What is Playground?

Playground provides an interactive testing environment where you can:

  • Chat with your agent in real-time and see tool calls as they happen
  • Inject faults on-the-fly (LLM timeouts, HTTP errors, tool failures)
  • Run security attacks from our catalog of 266 attacks
  • Monitor telemetry (tokens, latency, cost) per message
  • Share sessions with your team via link or publicly
  • Export sessions as YAML packs for CI/CD automation

Think of it as a debugger for your agent's resilience and security.

Try the Demo

No setup required — try the playground instantly with our built-in demo agent. Visit the playground and click Try Demo to start chatting, injecting faults, and running security attacks immediately.

The demo agent supports all playground features including fault injection, security testing, and simulated tool calls. It's powered by multiple LLM providers (Gemini, Groq) and requires no authentication.

Quick Start

Terminal
# Install websockets dependency
pip install websockets

# Discover your agents
khaos discover

# Start interactive playground session
khaos playground start my-agent

# Or point directly to an agent script
khaos playground start ./my_agent.py

This opens your browser to the hosted playground at https://khaos.exordex.com/playground, connected to your locally running agent via a secure WebSocket relay.

Requirements
Agents must be discovered first with khaos discover (or pass a script path directly). The playground requires the websockets package and a Khaos Cloud account (auto-prompted on first use).

CLI Commands

khaos playground start

Start an interactive playground session with your agent.

Terminal
# Basic usage with discovered agent
khaos playground start <agent-name>

# Point to an agent script directly
khaos playground start ./my_agent.py

# Don't auto-open browser
khaos playground start <agent-name> --no-browser

# Override dashboard URL (advanced)
khaos playground start <agent-name> --dashboard https://custom.example.com

Options

FlagDefaultDescription
--dashboard, -dhttps://khaos.exordex.comDashboard base URL (resolved from cloud config)
--no-browserfalseDon't open browser automatically

khaos playground info

Show playground feature information and usage tips.

Terminal
khaos playground info

Authentication

Playground sessions require Khaos Cloud authentication. If you're not logged in, the CLI automatically triggers the device login flow:

Terminal
# Auto-login happens on first playground start
khaos playground start my-agent

# Or login manually beforehand
khaos login

The device flow opens your browser for authentication, then returns you to the CLI. Sessions are authorized via the cloud API and include usage tracking.

The Playground Interface

Agent Summary Header

Displays your agent's metadata at a glance: name, version, framework, capabilities, session metrics, and real-time connection status.

Chat Panel

The main chat area works like ChatGPT. Type prompts and see agent responses stream in real-time with full markdown rendering. You'll also see:

  • Tool calls as they happen (with arguments and call IDs)
  • Thinking/reasoning (Chain of Thought) when available
  • Token usage per message (input, output, cost)
  • Behavioral scoring for security attack results

Fault Injection Controls

Toggle faults on-the-fly to test how your agent handles failures. Faults are organized by category and severity, with irrelevant faults automatically hidden based on your agent's capabilities.

CategoryAvailable Faults
LLMRate limit, Response timeout, Model unavailable, Token quota exceeded, Context overflow
ToolTimeout, Error, Malformed response, Unavailable, Partial failure, Rate limited
HTTPLatency (5s delay), Error (500 response)
FilesystemRead failure, File not found
DataCorruption, Partial response, Schema violation
MCPServer unavailable, Tool failure

Each fault displays a severity indicator (critical, high, or medium) and its expected impact inline. When a fault fires, a real-time notification appears in the chat stream.

How to use faults
1. Start chatting normally, 2. Toggle a fault (e.g., "LLM Rate Limit"), 3. Send another prompt, 4. Observe how your agent handles it, 5. Toggle off to continue.

Security Testing Drawer

Access 266 security attacks organized by category, tier, and severity:

  • Categories: Prompt injection, jailbreak, data exfiltration, tool misuse, indirect injection, privilege escalation, resource abuse, hallucination probing, social engineering
  • Tiers: Agent-level, system-level, infrastructure-level
  • Severities: Low, medium, high, critical
  • Injection Vectors: 15 vectors including user input, tool output, file content, shell output, API response, web content, and more

Click an attack to run it. Optionally customize the payload before execution. View pass/fail results with detailed behavioral scoring showing why it passed or failed.

Custom Tests
Create your own security tests with the built-in Custom Test Creator. Supports custom payloads, multi-turn conversations, specific injection vectors with vector-specific configuration, custom success/failure criteria, and forbidden keyword detection.

Session History & Sharing

Browse and manage past playground sessions:

  • Session History - View all previous sessions with metrics (duration, messages, attacks, tool calls)
  • Session Replay - Full transcript viewer to replay past sessions
  • Session Sharing - Control visibility: Private (default), Link Shared, or Public
  • Session Comments - Collaborate with team members on session findings

Session Export

Export your interactive session as a YAML pack for CI/CD:

Terminal
# In the playground UI, click "Export YAML"
# Save the file as my-session.yaml

# Run in CI
khaos run my-agent --eval my-session.yaml

The export captures all prompts, responses, fault toggles, security attacks, and test results.

When to Use Playground vs CLI

Use CasePlaygroundCLI (khaos run)
Debugging agent behaviorBest-
Interactive fault testingBest-
Exploring security attacksBest-
CI/CD pipeline-Best
Automated testing-Best
Full evaluation suite-Best
Recommended Workflow
1. Use Playground to explore and debug, 2. Export interesting test cases to YAML, 3. Run exported packs in CI/CD with khaos run.

Session Limits

PlanSessions/Month
Free10
PaidUnlimited

Sessions reset on the 1st of each month. The CLI shows your current usage when starting a session. Upgrade for unlimited sessions.

Environment Variables

VariableDescription
KHAOS_DASHBOARD_URLOverride the dashboard URL (default: resolved from cloud config or https://khaos.exordex.com)
KHAOS_RAILWAY_WS_URLOverride the WebSocket relay URL (default: wss://api.khaos.exordex.com/ws)

Architecture

Your agent runs locally while the CLI streams events to the hosted dashboard via a secure WebSocket relay:

TEXT
┌─────────────────┐                    ┌──────────────────────────┐
│  Dashboard UI   │     WebSocket      │  WebSocket Relay         │
│  (Browser)      │ ◄────────────────► │  (api.khaos.exordex.com) │
└─────────────────┘                    └────────────┬─────────────┘
                                                    │
                                               WebSocket
                                                    │
                                           ┌────────▼─────────┐
                                           │  Khaos CLI        │
                                           │  (Your Machine)   │
                                           │        │          │
                                           │   stdin/stdout    │
                                           │        │          │
                                           │  Agent Process    │
                                           └──────────────────┘

The CLI authenticates with Khaos Cloud, receives a session token, connects to the WebSocket relay at wss://api.khaos.exordex.com/ws, and spawns your agent as a subprocess. The dashboard connects to the same relay using the session token to stream events bidirectionally.

Troubleshooting

"websockets package required"

Terminal
pip install websockets

"Agent not found"

Ensure your agent is discovered:

Terminal
khaos discover
khaos playground start my-agent

"Not logged in" errors

The CLI auto-triggers login on first use. If it fails, authenticate manually:

Terminal
khaos login

Session expired

The CLI automatically re-authenticates when a session expires. If you see repeated auth errors, try logging in again with khaos login.

WebSocket connection failed

The playground connects to wss://api.khaos.exordex.com/ws. Ensure your network allows outbound WebSocket connections. If you're behind a corporate proxy, you may need to whitelist api.khaos.exordex.com.

Agent subprocess crashes

Check the terminal output where you started the playground. Common issues include missing API keys, import errors, or syntax errors in agent code.

Security Attack Catalog

The playground includes 266 security attacks across these categories:

CategoryDescription
Prompt InjectionAttempts to override agent instructions
JailbreakAttempts to bypass safety guardrails
Data ExfiltrationAttempts to leak sensitive data (system prompt, PII)
Tool MisuseAttempts to misuse agent tools (unauthorized access, command injection)
Indirect InjectionPayloads in external data (RAG docs, tool outputs)
Privilege EscalationAttempts to gain elevated access
Resource AbuseAttempts to exhaust agent resources
Hallucination ProbingTests for confabulation and fabricated information
Social EngineeringManipulation via authority impersonation and urgency tactics

Some attacks are multi-turn, requiring multiple conversation steps. The playground handles these automatically, showing progress through each turn.

See Security Testing for more details on the attack catalog.