Deterministic Runs

Use --seed to make a Khaos run's scheduling reproducible. Rerunning with the same seed is the fastest way to debug failures and to compare candidate changes under the same test conditions.

What Is Deterministic

  • Deterministic: Khaos scheduling and fault/attack sampling decisions, which are tied to the seed
  • Not fully deterministic: LLM provider output, which may still vary between runs

Practical Interpretation
The same seed means the same evaluation setup. It does not guarantee identical model text output.
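
To make the idea of seed-tied sampling concrete, here is a minimal sketch in plain shell and awk. It is not Khaos code and says nothing about how Khaos samples internally; the function name sample_decisions, the five steps, and the 30% fault rate are all made up for illustration. It only shows that a seeded random number generator repeats the same decisions when given the same seed.

```shell
# Illustration only: NOT Khaos code. A seeded PRNG makes the same
# "inject a fault?" decisions every time it is given the same seed.
sample_decisions() {
  awk -v seed="$1" 'BEGIN {
    srand(seed)
    for (step = 1; step <= 5; step++) {
      # Hypothetical 30% fault-injection rate per step.
      decision = (rand() < 0.3) ? "fault" : "ok"
      printf "step %d: %s\n", step, decision
    }
  }'
}

first=$(sample_decisions 12345)
second=$(sample_decisions 12345)

# Both sequences come from the same seed, so they are identical.
[ "$first" = "$second" ] && echo "same seed, same decisions"
```

The agent's replies sit outside this loop, which is why the model's text can still differ between two runs that share a seed.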

Run With a Fixed Seed

Terminal
# Reproducible local run
khaos run my-agent --eval quickstart --seed 12345

# Reproducible security run
khaos run my-agent --eval security --seed 12345

If you omit --seed, Khaos generates one and records it in run artifacts.
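
If you would rather choose the seed yourself, so that it also shows up in your own logs, one option is to pick it in a wrapper script. The sketch below is an assumption-laden example: pick_seed and the SEED_OVERRIDE variable are hypothetical names, not part of Khaos, and it assumes Khaos accepts any non-negative integer that fits in 31 bits as a seed.

```shell
# Hypothetical helper: reuse SEED_OVERRIDE if set, otherwise draw a fresh seed.
# SEED_OVERRIDE is a made-up variable name, not a Khaos setting.
pick_seed() {
  if [ -n "${SEED_OVERRIDE:-}" ]; then
    printf '%s\n' "$SEED_OVERRIDE"
  else
    # Read 4 random bytes as an unsigned integer and keep the low 31 bits.
    raw=$(od -An -N4 -tu4 /dev/urandom | tr -dc '0-9')
    printf '%s\n' "$(( raw & 2147483647 ))"
  fi
}

seed=$(pick_seed)
echo "Using seed: $seed"
```

Pass the result to the run as --seed "$seed". To replay a failure later, set SEED_OVERRIDE to the logged value before calling the script.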

Inspect Seed and Artifacts

Terminal
# Get artifact paths for a run
khaos artifacts <run-id>

# Inspect report data (includes seed when set/generated)
khaos export <run-id> --out run.json
jq '.metrics.seed' run.json

Compare Two Seeded Runs

Terminal
# Baseline: run before applying the change
khaos run my-agent --eval quickstart --seed 12345 --name baseline-seed-12345

# Candidate: run after applying the change, with the same seed
khaos run my-agent --eval quickstart --seed 12345 --name candidate-seed-12345

# Compare
khaos compare baseline-seed-12345 candidate-seed-12345

CI Pattern

YAML
# .github/workflows/khaos.yml
- name: Reproducible Khaos CI
  run: |
    khaos ci my-agent \
      --eval quickstart \
      --seed 12345 \
      --security-threshold 80 \
      --resilience-threshold 70

Use Named Runs
Add --name on khaos run so run IDs are easier to compare and revisit.

Troubleshooting Drift

  • Confirm both runs use the same --seed
  • Confirm both runs use the same eval selection and thresholds
  • Confirm both runs use the same model/provider and env vars
  • Use khaos export to diff metrics/trace payloads
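
The first and last checks above can be scripted against two exported reports. This is a sketch under stated assumptions: same_seed and diff_exports are hypothetical helper names, each input file is assumed to come from khaos export <run-id> --out <file>, and the seed is assumed to live at .metrics.seed as in the jq example earlier on this page. It requires jq.

```shell
# Hypothetical helper: compare the recorded seeds of two exported runs.
# Assumes the seed is stored at .metrics.seed in each export.
same_seed() {
  seed_a=$(jq -r '.metrics.seed' "$1") || return 2
  seed_b=$(jq -r '.metrics.seed' "$2") || return 2
  if [ "$seed_a" = "$seed_b" ] && [ "$seed_a" != "null" ]; then
    echo "seeds match: $seed_a"
  else
    echo "seed mismatch: $seed_a vs $seed_b" >&2
    return 1
  fi
}

# Hypothetical helper: diff two exports with keys sorted, so that
# key-ordering differences are not reported as drift.
diff_exports() {
  a_sorted=$(mktemp)
  b_sorted=$(mktemp)
  jq -S . "$1" > "$a_sorted"
  jq -S . "$2" > "$b_sorted"
  diff -u "$a_sorted" "$b_sorted"
  status=$?
  rm -f "$a_sorted" "$b_sorted"
  return "$status"
}
```

A typical use would be same_seed baseline.json candidate.json && diff_exports baseline.json candidate.json, where both file names are placeholders for your own exports. Any remaining diff then reflects something other than the seed, such as model output or configuration.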

Next Steps