Dashboard

The Khaos Dashboard is your command center for exploring evaluation runs, comparing agent behavior across versions, and managing projects and team settings. Access it at app.khaos.dev.

Getting Started

To start using the dashboard, sync your first evaluation run:

Terminal
# Run evaluation with cloud sync
khaos run <agent-name> --eval quickstart --sync

# Or sync manually after a run
khaos run <agent-name>
khaos sync

Auto-Login: Running khaos sync prompts you to log in if you are not already authenticated.

Dashboard Views

The dashboard provides four main views for exploring your agent evaluations:

Projects Index

The home screen lists all projects in your namespace. Features include:

  • Project cards - Quick overview of each project's recent activity
  • Search & filter - Find projects by name or tags
  • Recent runs - See latest evaluation results at a glance
  • Project creation - Create new projects for different agents

Project Detail

Dive into a specific project to see:

  • Run history - All evaluation runs for this project
  • Score trends - Security and resilience scores over time
  • Version comparison - Quick compare between any two runs
  • Project settings - Configure alerts, baselines, and thresholds

Run Detail

Explore individual evaluation runs with:

  • Score summary - Overall, security, and resilience scores
  • LLM trace viewer - Full conversation traces per test case
  • MCP telemetry - MCP server interactions and tool calls
  • Resilience breakdown - Fault injection results and recovery metrics
  • Security findings - Vulnerability details and remediation hints
  • Cost analysis - Token usage and estimated costs per case

Comparison View

The comparison view is the heart of Khaos. It places two runs side by side and provides:

  • Four-lens delta - Changes across structural, resilience, security, and functional dimensions
  • Output diff - Line-by-line comparison of agent outputs
  • Cost projection - Estimated impact at scale
  • Regression detection - Automatic flagging of degraded metrics

Project Identity

Khaos uses owner-scoped project identifiers everywhere:

TEXT
Format: owner_slug/project_slug

Examples:
  myteam/customer-support-agent
  johndoe/code-assistant
  acme-corp/internal-bot

Why owner scoping? It prevents naming collisions: two different teams can each have a project named "demo" without conflict.
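
The owner_slug/project_slug format can be split with plain shell parameter expansion. The sketch below is a hypothetical helper, not part of the khaos CLI. It assumes a valid identifier contains exactly one "/" with a non-empty part on each side, and it does not check which characters Khaos actually allows in a slug.

```shell
# Hypothetical helper: split "owner_slug/project_slug" into its two parts.
# Assumption: exactly one "/" and two non-empty parts; slug characters
# are not validated because the allowed set is not documented here.
split_project_id() {
  local id="$1"
  case "$id" in
    */*/*) return 1 ;;   # more than one slash
    */*) ;;              # exactly one slash, keep going
    *) return 1 ;;       # no slash at all
  esac
  local owner="${id%%/*}"     # text before the slash
  local project="${id#*/}"    # text after the slash
  [ -n "$owner" ] && [ -n "$project" ] || return 1
  printf '%s %s\n' "$owner" "$project"
}

# Usage: read the two parts into variables.
read -r owner project <<EOF
$(split_project_id "myteam/customer-support-agent")
EOF
echo "owner=$owner project=$project"
```

Because the owner is part of the identifier, myteam/demo and johndoe/demo split into different owners and never collide.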

LLM Trace Viewer

For pack evaluations, the LLM Trace tab provides detailed conversation inspection:

  • Per-case traces - See exactly what happened in each test case
  • Collapsed identical traces - Quickly spot differences between cases
  • Phase filtering - Filter by baseline, resilience, or security phase
  • Token breakdown - Prompt and completion tokens per message
  • Timing data - TTFT (time to first token) and total duration

Terminal
# Generate pack runs with LLM traces
khaos run <agent-name> --eval full-eval --sync

Workflow Tip: Running with --eval full-eval --sync, as shown above, gives you comprehensive LLM traces in the dashboard.

API Tokens

Generate project-scoped API tokens for CI/CD integration and programmatic access:

  1. Navigate to your project in the dashboard
  2. Click Settings > API Tokens
  3. Click Generate New Token
  4. Copy the token (it won't be shown again)

Terminal
# Use token in CI/CD
export KHAOS_TOKEN=your-project-token
khaos ci <agent-name> --sync

Token Security: API tokens have the same permissions as your user account for that project. Store them securely in your CI/CD secrets.
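
Since the token comes from CI/CD secrets, a pipeline may want to fail early with a clear message when the secret is missing. The following is a minimal sketch of such a pre-flight check. The function name require_khaos_token is hypothetical, and the only khaos command referenced is the khaos ci invocation shown above.

```shell
# Hypothetical pre-flight check for a CI script.
# Returns 1 and prints a message to stderr when KHAOS_TOKEN is empty or unset.
require_khaos_token() {
  if [ -z "${KHAOS_TOKEN:-}" ]; then
    echo "KHAOS_TOKEN is not set; add it to your CI/CD secrets" >&2
    return 1
  fi
}

# Usage in a CI step:
#   require_khaos_token && khaos ci <agent-name> --sync
```

The check never prints the token itself, which keeps it out of CI logs.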

Team Settings

On Team plans, you can manage team members and permissions:

  • Invite members - Add team members by email
  • Role management - Assign admin, developer, or viewer roles
  • Project access - Control which projects each member can access
  • Audit log - Track team activity and changes

Role       View Runs  Sync Runs  Manage Project  Manage Team
Viewer     Yes        No         No              No
Developer  Yes        Yes        No              No
Admin      Yes        Yes        Yes             Yes
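
The role table above can be mirrored in a small lookup, for example to decide in a script which role to request for a teammate. This is only a local sketch of the table. The action labels (view-runs, sync-runs, manage-project, manage-team) and the function name are hypothetical, and Khaos itself enforces permissions on the server side.

```shell
# Hypothetical mirror of the role table.
# Exit status: 0 = allowed, 1 = not allowed, 2 = unknown role or action.
role_can() {
  local role="$1" action="$2"
  case "$action" in
    view-runs|sync-runs|manage-project|manage-team) ;;
    *) return 2 ;;                       # unknown action label
  esac
  case "$role:$action" in
    viewer:view-runs) return 0 ;;
    developer:view-runs|developer:sync-runs) return 0 ;;
    admin:*) return 0 ;;                 # admins can do everything listed
    viewer:*|developer:*) return 1 ;;    # known role, action not granted
    *) return 2 ;;                       # unknown role
  esac
}

# Usage: check whether a developer may sync runs.
if role_can developer sync-runs; then
  echo "developer can sync runs"
fi
```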

Alerts & Notifications

Configure alerts to stay informed about evaluation results:

  • Threshold alerts - Get notified when scores drop below thresholds
  • Regression alerts - Automatic notification on detected regressions
  • Email notifications - Summary emails for CI/CD runs

Keyboard Shortcuts

Navigate the dashboard efficiently with keyboard shortcuts:

Shortcut  Action
g p       Go to Projects
g r       Go to Recent Runs
/         Focus search
c         Start comparison (with two runs selected)
?         Show all shortcuts

URL Structure

Dashboard URLs follow a predictable pattern for easy navigation and sharing:

TEXT
# Projects list
https://app.khaos.dev/projects

# Project detail
https://app.khaos.dev/{owner}/{project}

# Run detail
https://app.khaos.dev/{owner}/{project}/runs/{run-id}

# Comparison view
https://app.khaos.dev/{owner}/{project}/compare/{run-id-1}/{run-id-2}
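
Because the patterns are predictable, links can be assembled in a script, for example to post a comparison link in a CI comment. The helpers below are a sketch built only from the patterns listed above. The function names and the example run IDs (run-1, run-2) are hypothetical, and real run IDs may look different.

```shell
# Base URL taken from the patterns above.
KHAOS_BASE_URL="https://app.khaos.dev"

# project_url OWNER PROJECT
project_url() {
  printf '%s/%s/%s\n' "$KHAOS_BASE_URL" "$1" "$2"
}

# run_url OWNER PROJECT RUN_ID
run_url() {
  printf '%s/%s/%s/runs/%s\n' "$KHAOS_BASE_URL" "$1" "$2" "$3"
}

# compare_url OWNER PROJECT RUN_ID_1 RUN_ID_2
compare_url() {
  printf '%s/%s/%s/compare/%s/%s\n' "$KHAOS_BASE_URL" "$1" "$2" "$3" "$4"
}

# Usage with placeholder run IDs:
compare_url myteam customer-support-agent run-1 run-2
# prints https://app.khaos.dev/myteam/customer-support-agent/compare/run-1/run-2
```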