Dashboard
The Khaos Dashboard is your command center for exploring evaluation runs, comparing agent behavior across versions, and managing projects and team settings. Access it at app.khaos.dev.
Getting Started
To start using the dashboard, sync your first evaluation run:
# Run evaluation with cloud sync
khaos run <agent-name> --eval quickstart --sync
# Or sync manually after a run
khaos run <agent-name>
khaos sync
khaos sync will automatically prompt for login if you're not authenticated.
Dashboard Views
The dashboard provides four main views for exploring your agent evaluations:
Projects Index
Your home screen showing all projects in your namespace. Features include:
- Project cards - Quick overview of each project's recent activity
- Search & filter - Find projects by name or tags
- Recent runs - See latest evaluation results at a glance
- Project creation - Create new projects for different agents
Project Detail
Deep dive into a specific project to see:
- Run history - All evaluation runs for this project
- Score trends - Security and resilience scores over time
- Version comparison - Quick compare between any two runs
- Project settings - Configure alerts, baselines, and thresholds
Run Detail
Explore individual evaluation runs with:
- Score summary - Overall, security, and resilience scores
- LLM trace viewer - Full conversation traces per test case
- MCP telemetry - MCP server interactions and tool calls
- Resilience breakdown - Fault injection results and recovery metrics
- Security findings - Vulnerability details and remediation hints
- Cost analysis - Token usage and estimated costs per case
Comparison View
The heart of Khaos - comparing two runs side by side:
- Four-lens delta - Changes across structural, resilience, security, and functional dimensions
- Output diff - Line-by-line comparison of agent outputs
- Cost projection - Estimated impact at scale
- Regression detection - Automatic flagging of degraded metrics
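As an illustration of the four-lens delta and regression flagging described above, here is a minimal Python sketch. It is not how Khaos computes comparisons internally. The score dictionaries, the "higher is better" convention, and the `tolerance` parameter are all assumptions made for this example.

```python
from dataclasses import dataclass

# The four comparison dimensions named in the Comparison View.
LENSES = ("structural", "resilience", "security", "functional")


@dataclass(frozen=True)
class LensDelta:
    lens: str
    baseline: float
    candidate: float

    @property
    def delta(self) -> float:
        # Positive means the candidate run scored higher than the baseline.
        return self.candidate - self.baseline


def compare_runs(baseline, candidate, tolerance=0.0):
    """Return per-lens deltas and the lenses whose score dropped by more than `tolerance`.

    Assumes both arguments are dicts mapping each lens name to a numeric
    score where higher is better (a hypothetical shape, not a Khaos export format).
    """
    deltas = [LensDelta(lens, baseline[lens], candidate[lens]) for lens in LENSES]
    regressions = [d.lens for d in deltas if d.delta < -tolerance]
    return deltas, regressions


if __name__ == "__main__":
    old = {"structural": 90, "resilience": 80, "security": 85, "functional": 95}
    new = {"structural": 90, "resilience": 72, "security": 88, "functional": 94}
    deltas, regressions = compare_runs(old, new, tolerance=2.0)
    for d in deltas:
        print(f"{d.lens}: {d.delta:+}")
    print("regressed:", regressions)
```

A non-zero tolerance keeps small run-to-run noise from being reported as a regression; the right value depends on how stable your scores are.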
Project Identity
Khaos uses owner-scoped project identifiers everywhere:
Format: owner_slug/project_slug
Examples:
myteam/customer-support-agent
johndoe/code-assistant
acme-corp/internal-bot
LLM Trace Viewer
For pack evaluations, the LLM Trace tab provides detailed conversation inspection:
- Per-case traces - See exactly what happened in each test case
- Collapsed identical traces - Quickly spot differences between cases
- Phase filtering - Filter by baseline, resilience, or security phase
- Token breakdown - Prompt and completion tokens per message
- Timing data - TTFT (time to first token) and total duration
# Generate pack runs with LLM traces
khaos run <agent-name> --eval full-eval --sync
Use khaos run <agent-name> --eval full-eval --sync to get comprehensive LLM traces in the dashboard.
API Tokens
Generate project-scoped API tokens for CI/CD integration and programmatic access:
- Navigate to your project in the dashboard
- Click Settings → API Tokens
- Click Generate New Token
- Copy the token (it won't be shown again)
# Use token in CI/CD
export KHAOS_TOKEN=your-project-token
khaos ci <agent-name> --sync
Team Settings
On Team plans, manage team members and permissions:
- Invite members - Add team members by email
- Role management - Assign admin, developer, or viewer roles
- Project access - Control which projects each member can access
- Audit log - Track team activity and changes
| Role | View Runs | Sync Runs | Manage Project | Manage Team |
|---|---|---|---|---|
| Viewer | Yes | No | No | No |
| Developer | Yes | Yes | No | No |
| Admin | Yes | Yes | Yes | Yes |
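The role table above can be read as a simple permission matrix. The Python sketch below encodes it that way, for example to mirror the rules in an internal script. The action names (`view_runs`, `sync_runs`, and so on) and the `can` helper are hypothetical labels chosen for this example, not identifiers from Khaos.

```python
# Permission matrix transcribed from the role table.
# Action names are illustrative labels, not Khaos API identifiers.
PERMISSIONS = {
    "viewer": {"view_runs"},
    "developer": {"view_runs", "sync_runs"},
    "admin": {"view_runs", "sync_runs", "manage_project", "manage_team"},
}


def can(role: str, action: str) -> bool:
    """Return True if `role` is allowed to perform `action`.

    Unknown roles get no permissions.
    """
    return action in PERMISSIONS.get(role.lower(), set())


if __name__ == "__main__":
    print(can("Developer", "sync_runs"))
    print(can("Viewer", "sync_runs"))
```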
Alerts & Notifications
Configure alerts to stay informed about evaluation results:
- Threshold alerts - Get notified when scores drop below thresholds
- Regression alerts - Automatic notification on detected regressions
- Email notifications - Summary emails for CI/CD runs
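Conceptually, a threshold alert compares each score against a configured minimum. The sketch below shows that check in Python under stated assumptions: the score names, the dictionary shapes, and the `breached_thresholds` helper are hypothetical, and the dashboard's own alert logic may differ.

```python
def breached_thresholds(scores, thresholds):
    """Return the sorted names of scores that fall strictly below their threshold.

    `scores` and `thresholds` are assumed to be dicts keyed by score name
    (for example "security" or "resilience"). Thresholds with no matching
    score are ignored.
    """
    return sorted(
        name
        for name, minimum in thresholds.items()
        if name in scores and scores[name] < minimum
    )


if __name__ == "__main__":
    run_scores = {"security": 72.0, "resilience": 91.0}
    minimums = {"security": 80.0, "resilience": 85.0}
    print(breached_thresholds(run_scores, minimums))
```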
Keyboard Shortcuts
Navigate the dashboard efficiently with keyboard shortcuts:
| Shortcut | Action |
|---|---|
| g p | Go to Projects |
| g r | Go to Recent Runs |
| / | Focus search |
| c | Start comparison (with two runs selected) |
| ? | Show all shortcuts |
URL Structure
Dashboard URLs follow a predictable pattern for easy navigation and sharing:
# Projects list
https://app.khaos.dev/projects
# Project detail
https://app.khaos.dev/{owner}/{project}
# Run detail
https://app.khaos.dev/{owner}/{project}/runs/{run-id}
# Comparison view
https://app.khaos.dev/{owner}/{project}/compare/{run-id-1}/{run-id-2}
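Because the URL patterns are predictable, links can be built from an owner_slug/project_slug identifier, for example to post a run link in a CI comment. The Python sketch below follows the patterns listed above. The helper names and the slug character set (letters, digits, hyphens, underscores) are assumptions made for this example, since the allowed slug characters are not specified here.

```python
import re
from urllib.parse import quote

BASE_URL = "https://app.khaos.dev"

# Assumed slug shape: letters, digits, hyphens, underscores.
_SLUG = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_-]*$")


def split_project_id(project_id: str):
    """Split 'owner_slug/project_slug' into (owner, project), validating both parts."""
    owner, sep, project = project_id.partition("/")
    if not sep or not _SLUG.match(owner) or not _SLUG.match(project):
        raise ValueError(f"expected owner_slug/project_slug, got {project_id!r}")
    return owner, project


def project_url(project_id: str) -> str:
    owner, project = split_project_id(project_id)
    return f"{BASE_URL}/{owner}/{project}"


def run_url(project_id: str, run_id: str) -> str:
    # Percent-encode the run ID so it always stays a single path segment.
    return f"{project_url(project_id)}/runs/{quote(run_id, safe='')}"


def compare_url(project_id: str, run_id_1: str, run_id_2: str) -> str:
    first, second = quote(run_id_1, safe=""), quote(run_id_2, safe="")
    return f"{project_url(project_id)}/compare/{first}/{second}"


if __name__ == "__main__":
    # "run-123" and "run-124" are placeholder run IDs.
    print(project_url("myteam/customer-support-agent"))
    print(run_url("acme-corp/internal-bot", "run-123"))
    print(compare_url("johndoe/code-assistant", "run-123", "run-124"))
```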