Deterministic Runs

Use --seed to make Khaos scheduling and sampling decisions reproducible across runs. A fixed seed is the fastest way to debug a failure and to compare candidate changes under the same test conditions.

What Is Deterministic

Deterministic: Khaos scheduling and fault/attack sampling decisions, which are tied to the seed
Not fully deterministic: LLM provider output, which may still vary between runs

Practical Interpretation

Same seed means "same evaluation setup". It does not guarantee identical model text output.
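
As a loose analogy for what a seed controls, the sketch below draws a sequence of decisions from a seeded random generator in awk. It is an illustration of seeded sampling in general, not Khaos's implementation; the function name sample_decisions and the 30% injection rate are made up for the example.

```shell
#!/bin/sh
# Illustration only: a seeded generator yields the same decision sequence
# every time it is run. This is NOT how Khaos is implemented.
sample_decisions() {
  # $1 = seed, $2 = number of decisions to draw
  awk -v seed="$1" -v n="$2" 'BEGIN {
    srand(seed)   # fix the generator state
    for (i = 1; i <= n; i++) {
      # Hypothetical rule: inject a fault on roughly 30% of steps
      print ((rand() < 0.3) ? "inject" : "skip")
    }
  }'
}

first=$(sample_decisions 12345 5)
second=$(sample_decisions 12345 5)

# Same seed, same sequence of decisions
[ "$first" = "$second" ] && echo "same seed: identical decisions"
```

The decisions repeat because they depend only on the seed. Anything outside that generator, such as text returned by an LLM provider, is not covered by it.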

Run With a Fixed Seed

Terminal

# Reproducible local run
khaos run my-agent --eval quickstart --seed 12345

# Reproducible security run
khaos run my-agent --eval security --seed 12345

If you omit --seed, Khaos generates one and records it in run artifacts.

Inspect Seed and Artifacts

Terminal

# Get artifact paths for a run
khaos artifacts <run-id>

# Inspect report data (includes seed when set/generated)
khaos export <run-id> --out run.json
jq '.metrics.seed' run.json
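
When a run that was started without --seed fails, the recorded seed can be read back and reused for a replay. A minimal sketch, assuming the exported report stores the seed at .metrics.seed as in the command above and that jq is installed; the helper name seed_from_report is hypothetical.

```shell
#!/bin/sh
# Print the seed stored in an exported report; fail if it is missing.
# Assumes the seed lives at .metrics.seed, as in the jq command above.
seed_from_report() {
  seed=$(jq -r '.metrics.seed // empty' "$1") || return 1
  [ -n "$seed" ] || { echo "no seed found in $1" >&2; return 1; }
  printf '%s\n' "$seed"
}

# Usage (requires khaos and an existing run ID):
#   khaos export <run-id> --out run.json
#   khaos run my-agent --eval quickstart --seed "$(seed_from_report run.json)"
```

Failing loudly on a missing seed avoids silently starting a replay with an empty --seed value.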

Compare Two Seeded Runs

Terminal

# Baseline
khaos run my-agent --eval quickstart --seed 12345 --name baseline-seed-12345

# Candidate
khaos run my-agent --eval quickstart --seed 12345 --name candidate-seed-12345

# Compare
khaos compare baseline-seed-12345 candidate-seed-12345

CI Pattern

YAML

# .github/workflows/khaos.yml
- name: Reproducible Khaos CI
  run: |
    khaos ci my-agent \
      --eval quickstart \
      --seed 12345 \
      --security-threshold 80 \
      --resilience-threshold 70

Use Named Runs

Add --name to khaos run so runs are easier to identify, compare, and revisit.

Troubleshooting Drift

Confirm both runs use the same --seed
Confirm both runs use the same eval selection and thresholds
Confirm both runs use the same model/provider and environment variables
Use khaos export to diff the metrics and trace payloads of the two runs
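
Part of this checklist can be scripted. A rough sketch, assuming both runs were exported with khaos export and that the seed sits at .metrics.seed; check_drift is a hypothetical helper, jq must be installed, and nothing else about the report layout is assumed.

```shell
#!/bin/sh
# Compare two exported reports: first the seeds, then the full payloads.
check_drift() {
  a=$1; b=$2
  seed_a=$(jq -r '.metrics.seed' "$a") || return 2
  seed_b=$(jq -r '.metrics.seed' "$b") || return 2
  if [ "$seed_a" != "$seed_b" ]; then
    echo "seed mismatch: $seed_a vs $seed_b"
    return 1
  fi
  # Sort keys (-S) so the diff reflects content, not key order.
  jq -S . "$a" > "$a.sorted" || return 2
  jq -S . "$b" > "$b.sorted" || return 2
  diff "$a.sorted" "$b.sorted"
}

# Usage (requires khaos and two existing runs):
#   khaos export <baseline-run-id> --out baseline.json
#   khaos export <candidate-run-id> --out candidate.json
#   check_drift baseline.json candidate.json
```

Differences in model text are expected even when the seeds match, so read the diff with the metrics and the scheduling or fault decisions in mind first.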

Next Steps

Artifacts - inspect run outputs and provenance
CLI Reference - full run, ci, and compare flags
CI/CD - production pipeline integration
