LLM Observability
Khaos captures LLM call telemetry during runs and writes it to llm-events-<run-id>.jsonl. You can inspect the file locally, export it, and sync it to the dashboard.
Automatic Capture
During khaos run and khaos ci, Khaos wires up the telemetry environment variables and records LLM events whenever your agent uses a supported provider or framework.
Terminal
# Run any evaluation
khaos run my-agent --eval quickstart
# Find the telemetry artifact path
khaos artifacts <run-id>
# Export all artifacts (trace, metrics, llm events references)
khaos export <run-id> --out run.json

Event Format
Each line in llm-events-*.jsonl is one event.
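Because each line is standalone JSON, an event can be decoded with nothing but the standard library. A minimal sketch using the field names documented in the sample that follows:

```python
import json

# One line from llm-events-<run-id>.jsonl (abbreviated to the core fields).
line = (
    '{"ts": "2026-02-26T00:00:00+00:00", "event": "llm.call",'
    ' "payload": {"model": "gpt-4o", "provider": "openai",'
    ' "tokens": {"prompt": 123, "completion": 45},'
    ' "cost_usd": 0.00123, "latency_ms": 420.5}}'
)

event = json.loads(line)
payload = event["payload"]
total_tokens = payload["tokens"]["prompt"] + payload["tokens"]["completion"]
print(event["event"], payload["model"], total_tokens)  # llm.call gpt-4o 168
```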
JSON
{
  "ts": "2026-02-26T00:00:00.000000+00:00",
  "event": "llm.call",
  "payload": {
    "model": "gpt-4o",
    "provider": "openai",
    "tokens": {
      "prompt": 123,
      "completion": 45
    },
    "cost_usd": 0.00123,
    "latency_ms": 420.5,
    "content": {
      "prompt": "...",
      "completion": "...",
      "mode": "mask"
    },
    "metadata": {}
  },
  "meta": {
    "pii_detected": false
  }
}

Manual Instrumentation
For custom integrations, use the public telemetry helpers exported from khaos.
Python
from khaos import emit_llm_call, observe_llm_call
emit_llm_call(
    model="gpt-4o",
    provider="openai",
    prompt="What is 2+2?",
    completion="4",
    tokens_in=8,
    tokens_out=2,
    latency_ms=120.0,
)
with observe_llm_call(model="gpt-4o", provider="openai") as obs:
    # call your custom client
    response_text = "example"
    obs.prompt = "hello"
    obs.completion = response_text
    obs.tokens_in = 10
    obs.tokens_out = 12

Telemetry Sink
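The helpers above write to the event file that the Khaos runtime points them at. A quick sanity check for whether a sink is configured in the current process; the KHAOS_LLM_EVENT_FILE name comes from this page, but the snippet itself is an illustrative pattern, not a Khaos API:

```python
import os

# Set by `khaos run` / `khaos ci`; when absent, the manual helpers are no-ops.
sink = os.environ.get("KHAOS_LLM_EVENT_FILE")
if sink:
    print(f"telemetry sink active: {sink}")
else:
    print("no sink configured; manually emitted events will be dropped")
```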
These helpers emit only when KHAOS_LLM_EVENT_FILE is set by the runtime.

Cost Estimation
Use the stable cost helpers to estimate spend from token counts.
Python
from khaos.costs import load_cost_table, estimate_cost_usd
rates = load_cost_table()
cost_usd, source = estimate_cost_usd(
    provider="openai",
    model="gpt-4o",
    prompt_tokens=1200,
    completion_tokens=300,
    table=rates,
)
print(cost_usd, source)

Query Local Telemetry
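Beyond the shell one-liners below, a short script can aggregate the event file per model. A sketch that assumes only the event schema documented above; summarize_llm_events is an illustrative helper, not part of khaos:

```python
import json
from collections import defaultdict
from pathlib import Path

def summarize_llm_events(path):
    """Per-model call count, token total, and cost from an llm-events JSONL file."""
    totals = defaultdict(lambda: {"calls": 0, "tokens": 0, "cost_usd": 0.0})
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        if event.get("event") != "llm.call":
            continue  # skip any non-call events
        payload = event["payload"]
        row = totals[payload["model"]]
        row["calls"] += 1
        tokens = payload.get("tokens", {})
        row["tokens"] += tokens.get("prompt", 0) + tokens.get("completion", 0)
        row["cost_usd"] += payload.get("cost_usd") or 0.0
    return dict(totals)
```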
Terminal
# Count calls
wc -l ~/.khaos/runs/llm-events-<run-id>.jsonl
# Total LLM cost from raw events
cat ~/.khaos/runs/llm-events-<run-id>.jsonl | python3 -c "
import json,sys
print(sum(json.loads(l)['payload'].get('cost_usd', 0.0) for l in sys.stdin))
"
# Inspect high-latency calls
cat ~/.khaos/runs/llm-events-<run-id>.jsonl | jq 'select(.payload.latency_ms > 3000)'

Dashboard Workflow
To view telemetry in Khaos Cloud, upload the run artifacts with sync.
Terminal
# Upload a specific run
khaos sync --run <run-id>
# Check upload/login status
khaos sync --status

Related
- Artifacts - run files and export workflow
- Metrics - score and cost interpretation
- Cloud Sync - auth and upload details