CI/CD Integration
Integrate Khaos into your CI/CD pipeline to catch security vulnerabilities and resilience issues before they reach production. Khaos provides threshold-based gating, JUnit XML output, an official GitHub Action, and a GitLab CI template.
Quick Start
Use the khaos ci command for a single-step CI integration:
# Run with default thresholds (security: 80, resilience: 70)
khaos ci <agent-name>
# Custom thresholds
khaos ci <agent-name> --security-threshold 85 --resilience-threshold 75
# Generate JUnit XML for CI test reporting
khaos ci <agent-name> --format junit --output-file results.xml
Exit Codes
Use exit codes to control pipeline flow:
| Code | Meaning | Action |
|---|---|---|
| 0 | All gates passed | Continue pipeline |
| 1 | Security threshold not met | Fail build |
| 2 | Resilience threshold not met | Fail build |
| 3 | Both thresholds failed | Fail build |
| 4 | Baseline tests failed | Fail build |
| 5 | Regression detected vs baseline | Fail build (if --fail-on-regression) |
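For CI systems that wrap the CLI in a script, the table above can be turned into a small lookup. This is a minimal sketch, not part of Khaos: the function names are hypothetical, and treating code 5 as a failure only when --fail-on-regression is in use is an interpretation of the table's Action column.

```python
# Meanings copied from the exit code table above.
EXIT_CODES = {
    0: "All gates passed",
    1: "Security threshold not met",
    2: "Resilience threshold not met",
    3: "Both thresholds failed",
    4: "Baseline tests failed",
    5: "Regression detected vs baseline",
}


def describe_exit_code(code: int) -> str:
    """Return a human-readable meaning for a `khaos ci` exit code."""
    return EXIT_CODES.get(code, f"Unknown exit code {code}")


def should_fail_build(code: int, fail_on_regression: bool = False) -> bool:
    """Decide whether the pipeline should stop for this exit code.

    Assumption: code 5 only fails the build when regression gating is on;
    unknown codes are treated as failures to stay on the safe side.
    """
    if code == 0:
        return False
    if code == 5:
        return fail_on_regression
    return True
```

A wrapper script could log `describe_exit_code(code)` before exiting so the reason for a failed build is visible in the job output.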
GitHub Actions
Use the official Khaos GitHub Action for turnkey integration:
# .github/workflows/agent-test.yml
name: Agent Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Test Agent
        uses: exordex/khaos-test@v1
        with:
          agent: ./my_agent.py
          eval: quickstart
          security-threshold: 80
          resilience-threshold: 70
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
Action Inputs
| Input | Required | Default | Description |
|---|---|---|---|
| agent | Yes | - | Path to agent script |
| eval | No | quickstart | Evaluation to run |
| security-threshold | No | 80 | Minimum security score (0-100) |
| resilience-threshold | No | 70 | Minimum resilience score (0-100) |
| baseline | No | - | Baseline name for comparison |
| save-baseline | No | - | Save run as named baseline |
| fail-on-regression | No | false | Fail on regression detection |
| seed | No | random | Random seed for reproducible runs |
Action Outputs
- name: Test Agent
  id: khaos
  uses: exordex/khaos-test@v1
  with:
    agent: ./my_agent.py
- name: Check Results
  run: |
    echo "Security: ${{ steps.khaos.outputs.security-score }}"
    echo "Resilience: ${{ steps.khaos.outputs.resilience-score }}"
    echo "Overall: ${{ steps.khaos.outputs.overall-score }}"
    echo "Passed: ${{ steps.khaos.outputs.passed }}"
GitLab CI
Use the Khaos CI template for GitLab:
# .gitlab-ci.yml
include:
  - remote: 'https://raw.githubusercontent.com/exordex/khaos/main/.gitlab/khaos-ci.yml'
khaos-test:
  extends: .khaos-test
  variables:
    KHAOS_AGENT: "./my_agent.py"
    KHAOS_EVAL: "quickstart"
    KHAOS_SECURITY_THRESHOLD: "80"
    KHAOS_RESILIENCE_THRESHOLD: "70"
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"
Template Variables
| Variable | Default | Description |
|---|---|---|
| KHAOS_AGENT | - | Path to agent script (required) |
| KHAOS_EVAL | quickstart | Evaluation to run |
| KHAOS_SECURITY_THRESHOLD | 80 | Minimum security score |
| KHAOS_RESILIENCE_THRESHOLD | 70 | Minimum resilience score |
| KHAOS_BASELINE | - | Baseline for comparison |
| KHAOS_SAVE_BASELINE | - | Save as named baseline |
| KHAOS_FAIL_ON_REGRESSION | false | Fail on regression |
| KHAOS_SEED | random | Random seed for reproducibility |
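The template variables in this table line up one-to-one with the GitHub Action inputs listed earlier: each input name is upper-cased, hyphens become underscores, and a KHAOS_ prefix is added. When moving a configuration between the two systems, that naming rule can be sketched as follows. The helper names are hypothetical, and rendering booleans as lower-case strings is an assumption based on the `false` default shown in the table.

```python
from typing import Dict


def to_gitlab_variable(action_input: str) -> str:
    """Map a GitHub Action input name to its GitLab template variable name."""
    return "KHAOS_" + action_input.upper().replace("-", "_")


def _render(value: object) -> str:
    # GitLab CI variables are strings; booleans are written as true/false here.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def to_gitlab_variables(inputs: Dict[str, object]) -> Dict[str, str]:
    """Convert a dict of action inputs into a GitLab `variables:` mapping."""
    return {to_gitlab_variable(name): _render(value) for name, value in inputs.items()}
```

For example, `to_gitlab_variable("fail-on-regression")` yields `KHAOS_FAIL_ON_REGRESSION`, matching the row in the table above.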
Manual Integration
For other CI systems, use the CLI directly:
#!/bin/bash
# ci-test.sh
pip install khaos
# Run evaluation with JUnit output and fixed seed
khaos ci <agent-name> \
  --eval quickstart \
  --security-threshold 80 \
  --resilience-threshold 70 \
  --seed 42 \
  --format junit \
  --output-file results.xml
# Exit code indicates pass/fail
exit $?
Use --format junit --output-file results.xml to generate the JUnit XML report.
Baseline Comparison
Compare against a stored baseline to detect regressions:
# On main branch: save baseline
khaos ci <agent-name> --save-baseline main
# On feature branches: compare against main
khaos ci <agent-name> --baseline main --fail-on-regression
This pattern ensures that changes don't degrade security or resilience compared to the main branch.
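In a CI system without branch-specific job rules, the choice between saving and comparing can be made in a small wrapper. The sketch below only assembles the argument list from the flags shown above; the function name and its parameters are hypothetical, and it does not invoke Khaos itself.

```python
from typing import List, Optional


def build_ci_command(
    agent: str,
    branch: str,
    main_branch: str = "main",
    baseline_name: str = "main",
    seed: Optional[int] = None,
) -> List[str]:
    """Assemble a `khaos ci` argument list for the current branch."""
    cmd = ["khaos", "ci", agent]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if branch == main_branch:
        # On the main branch: record this run as the named baseline.
        cmd += ["--save-baseline", baseline_name]
    else:
        # On any other branch: compare against the baseline and gate on regressions.
        cmd += ["--baseline", baseline_name, "--fail-on-regression"]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd)`, and the `returncode` of the result propagated as the job's exit status.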
Output Formats
Khaos CI supports multiple output formats:
| Format | Use Case |
|---|---|
| text | Human-readable console output (default) |
| json | Machine-readable for scripts and dashboards |
| junit | Test reporting in CI systems |
| markdown | PR comments and documentation |
| all | Generate all formats at once |
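If your CI system does not render JUnit reports natively, the junit output can be summarized with a few lines of script. The sketch below assumes the conventional JUnit layout (testcase elements with optional failure or error children); the exact schema Khaos emits is not specified here, and the sample document and test names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

# Hypothetical sample in the conventional JUnit layout, not real Khaos output.
SAMPLE = """
<testsuites>
  <testsuite name="example">
    <testcase name="case_a"/>
    <testcase name="case_b"><failure message="below threshold"/></testcase>
  </testsuite>
</testsuites>
"""


def summarize_junit(xml_text: str) -> Dict[str, Union[int, List[str]]]:
    """Count test cases and list the names of those with a failure or error."""
    root = ET.fromstring(xml_text.strip())
    # iter() walks the whole tree, so both <testsuites> and <testsuite> roots work.
    cases = list(root.iter("testcase"))
    failed = [
        case.get("name", "")
        for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    ]
    return {"total": len(cases), "failed": failed}
```

To summarize a real report, read the file written by --output-file and pass its contents to `summarize_junit`.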
# Generate all formats to a directory
khaos ci <agent-name> --format all --output-file reports/khaos
# Generates: reports/khaos.xml, reports/khaos.json, reports/khaos.md
Reproducibility
Use the --seed flag to ensure reproducible runs across CI environments. The seed is recorded in all artifacts for provenance tracking.
# GitHub Actions with fixed seed
- name: Test Agent
  uses: exordex/khaos-test@v1
  with:
    agent: ./my_agent.py
    seed: 42 # Ensures deterministic fault scheduling
# GitLab CI with fixed seed
khaos-test:
  variables:
    KHAOS_SEED: "42" # Reproducible across runners
Benefits of using seeds in CI:
- Deterministic results - Same seed produces same fault injection sequence
- Debuggable failures - Reproduce exact failure conditions locally
- Baseline validity - Config hash ensures you're comparing compatible runs
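The first benefit rests on a general property of seeded pseudo-random generators: the same seed yields the same sequence of choices. The sketch below illustrates that principle with Python's standard library only. It is not how Khaos schedules faults internally, and the fault names are invented for the example.

```python
import random
from typing import List

# Hypothetical fault names, used only to illustrate seeded selection.
FAULTS = ["timeout", "malformed_response", "rate_limit"]


def fault_schedule(seed: int, steps: int = 5) -> List[str]:
    """Draw a sequence of faults from a generator initialized with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.choice(FAULTS) for _ in range(steps)]


# Two runs with the same seed produce identical schedules.
assert fault_schedule(42) == fault_schedule(42)
```

Using a private `random.Random` instance rather than the module-level functions keeps the sequence unaffected by other code that draws random numbers in the same process.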
Best Practices
- Use fixed seeds in CI - Ensures reproducible, debuggable runs
- Run on every PR - Catch issues before merge
- Use quickstart for PRs - Fast feedback (~2 min)
- Use full-eval for main - Comprehensive evaluation before release
- Save baselines on main - Track regressions over time
- Set realistic thresholds - Start at 70-80, increase gradually
- Upload artifacts - Store results for debugging