CI/CD Integration
Integrate Khaos into your CI/CD pipeline to catch security vulnerabilities and resilience issues before they reach production. Khaos provides two complementary CI commands:
- `khaos ci` — Run evaluation packs with thresholds, reporting, and cloud sync
- `khaos test` — Run `@khaostest`-decorated Python tests with JUnit/JSON/Markdown output
Both work in any CI environment — no custom actions or templates required.
Quick Start
Three environment variables and one command are all you need for khaos ci:
# Set credentials (required for khaos ci with --sync)
export KHAOS_API_TOKEN=your-project-token
export KHAOS_PROJECT_SLUG=owner/project
export KHAOS_API_URL=https://api.khaos.exordex.com
# Install and run evaluation
python3 -m pip install khaos-agent
khaos ci my-agent --eval quickstart --sync
# Or run @khaostest tests (no credentials required)
khaos test --format junit -o results.xml
API tokens for --sync carry the ingest:write and runs:read scopes.
Environment Variables
| Variable | Required | Description |
|---|---|---|
| KHAOS_API_TOKEN | khaos ci --sync | Project-scoped API token for authentication |
| KHAOS_PROJECT_SLUG | khaos ci --sync | Project identifier (e.g. myteam/my-agent) |
| KHAOS_API_URL | khaos ci --sync | API endpoint (https://api.khaos.exordex.com) |
| KHAOS_STATE_DIR | No | Local artifact storage (default: ~/.khaos). Set to a workspace dir in CI for easy artifact collection. |
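The `KHAOS_STATE_DIR` tip above can be applied with two lines of shell; a minimal sketch, where the current directory stands in for your CI system's checkout path (e.g. `$GITHUB_WORKSPACE` or `$CI_PROJECT_DIR`):

```shell
# Keep Khaos artifacts inside the workspace so the CI system can collect them.
WORKSPACE="$(pwd)"                          # stand-in for your CI's checkout dir
export KHAOS_STATE_DIR="$WORKSPACE/.khaos"
mkdir -p "$KHAOS_STATE_DIR"
echo "$KHAOS_STATE_DIR"
```

Any artifact-upload step can then point at `.khaos` inside the workspace.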
khaos test runs locally against your @khaosagent handlers and does not require any API credentials. Only khaos ci --sync needs the token and project slug.
khaos ci — Evaluation Pipeline
The khaos ci command runs an evaluation pack against your agent, checks thresholds, and generates reports. It supports multiple output formats and integrates with GitHub Actions step summaries.
# Basic CI run
khaos ci my-agent --eval quickstart
# With thresholds and output
khaos ci my-agent \
--eval full-eval \
--security-threshold 85 \
--resilience-threshold 75 \
--format junit --output-file results.xml \
--json-file results.json
# GA mode: simplified exit codes (0=pass, 1=threshold fail, 2=infra error)
khaos ci my-agent --exit-code-mode ga --sync
# Validate credentials before running (dry run)
khaos ci my-agent --preflight-only
# Also run @khaostest tests in the same pipeline
khaos ci my-agent --eval quickstart --test
khaos ci my-agent --eval quickstart --test --test-path tests/integration/
# Baseline comparison and regression detection
khaos ci my-agent --baseline main --fail-on-regression
khaos ci my-agent --save-baseline main
Key Flags
| Flag | Description | Default |
|---|---|---|
| --eval, -e | Evaluation pack to run | quickstart |
| --security-threshold | Minimum security score (0-100) | 80 |
| --resilience-threshold | Minimum resilience score (0-100) | 70 |
| --format, -f | Output format: text, json, junit, markdown, all | text |
| --output-file, -o | Write primary output to file (format inferred from extension) | - |
| --json-file | Write JSON results to file (in addition to primary format) | - |
| --sync / --no-sync | Upload results to dashboard | Auto in GitHub Actions |
| --exit-code-mode | ga (0/1/2) or detailed (multi-code) | ga in GHA |
| --test | Also run @khaostest tests and merge results into output | false |
| --test-path | Paths to search for @khaostest tests (used with --test) | tests/ |
| --baseline, -b | Compare against a named baseline | - |
| --save-baseline | Save this run as a named baseline | - |
| --fail-on-regression | Exit non-zero if regression detected | false |
| --preflight-only | Validate credentials without running evaluation | false |
khaos test — Agent Tests in CI
Run your @khaostest-decorated Python tests with machine-readable output. No cloud credentials required — tests run locally against your agent handlers. See Agent Testing for how to write tests.
# JUnit XML for CI test reporters
khaos test --format junit -o khaos-test-results.xml
# JSON for scripting
khaos test --format json -o khaos-test-results.json
# Both at once
khaos test --format junit -o results.xml --json-file results.json
# All formats (writes .xml, .json, .md)
khaos test --format all -o khaos-tests
khaos test automatically writes a Markdown report to $GITHUB_STEP_SUMMARY and outputs total, passed, failed, and verdict to $GITHUB_OUTPUT.
Exit Codes
Both commands return meaningful exit codes for pipeline control:
GA Mode (--exit-code-mode ga, default in GitHub Actions)
| Code | Meaning | Action |
|---|---|---|
| 0 | All gates passed | Continue pipeline |
| 1 | Threshold or test failure | Fail build |
| 2 | Infrastructure / config error | Investigate setup |
Detailed Mode (--exit-code-mode detailed)
| Code | Meaning |
|---|---|
| 0 | All gates passed |
| 1 | Security threshold not met |
| 2 | Resilience threshold not met |
| 3 | Both security and resilience failed |
| 4 | Baseline tests failed |
| 5 | Regression detected vs baseline |
| 6 | @khaostest tests failed (when using --test) |
| 10 | Configuration error |
| 11 | Runtime error |
khaos test (standalone) uses simple exit codes: 0 = all passed, 1 = any failed.
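A pipeline script can branch on the GA-mode codes above; a minimal sketch, where the `handle_khaos_exit` helper is illustrative (not part of the CLI) and in a real pipeline you would pass it `$?` from a `khaos ci ... --exit-code-mode ga` run:

```shell
# Map GA-mode exit codes (0/1/2) to pipeline actions.
handle_khaos_exit() {
  case "$1" in
    0) echo "continue-pipeline" ;;   # all gates passed
    1) echo "fail-build" ;;          # threshold or test failure
    2) echo "investigate-setup" ;;   # infrastructure / config error
    *) echo "unexpected-code" ;;
  esac
}

handle_khaos_exit 1   # prints "fail-build"
```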
GitHub Actions
A complete workflow running both evaluation and @khaostest tests:
name: Khaos Evaluation
on:
push:
branches: [main]
pull_request:
jobs:
# Job 1: Run evaluation packs
evaluate:
runs-on: ubuntu-latest
env:
KHAOS_API_TOKEN: ${{ secrets.KHAOS_API_TOKEN }}
KHAOS_PROJECT_SLUG: ${{ secrets.KHAOS_PROJECT_SLUG }}
KHAOS_API_URL: ${{ secrets.KHAOS_API_URL }}
KHAOS_STATE_DIR: ${{ github.workspace }}/.khaos
KHAOS_PACK: ${{ github.event_name == 'pull_request' && 'quickstart' || 'full-eval' }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Run Khaos CI
id: khaos
run: |
pip install "khaos>=1.0.0,<2"
khaos ci path/to/agent.py \
--eval "$KHAOS_PACK" \
--sync \
--exit-code-mode ga \
--format junit --output-file khaos-results.xml \
--json-file khaos-results.json
- uses: actions/upload-artifact@v4
if: always()
with:
name: khaos-results
path: |
khaos-results.json
khaos-results.xml
# Job 2: Run @khaostest tests (no credentials needed)
khaos-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- run: pip install "khaos>=1.0.0,<2"
- name: Run @khaostest tests
run: |
khaos test \
--format junit \
--output-file khaos-test-results.xml \
--json-file khaos-test-results.json
- uses: mikepenz/action-junit-report@v4
if: always()
with:
report_paths: 'khaos-test-results.xml'
check_name: 'Khaos @khaostest Results'
Add KHAOS_API_TOKEN and KHAOS_PROJECT_SLUG to your repository secrets under Settings > Secrets and variables > Actions.
You can also wrap this in a reusable composite action, e.g. .github/actions/khaos-test/action.yml with inputs for agent path, pack, thresholds, baseline comparison, and run-khaostests to include @khaostest results.
GitLab CI
Add these jobs to your .gitlab-ci.yml:
stages: [khaos]
# Evaluation job
khaos:ci:
stage: khaos
image: python:3.11
variables:
KHAOS_API_TOKEN: $KHAOS_API_TOKEN
KHAOS_PROJECT_SLUG: $CI_PROJECT_PATH
KHAOS_API_URL: https://api.khaos.exordex.com
KHAOS_CI: "1"
KHAOS_PACK: "quickstart"
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
variables: { KHAOS_PACK: "quickstart" }
- if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
variables: { KHAOS_PACK: "full-eval" }
script:
- pip install "khaos>=1.0.0,<2"
- khaos ci path/to/agent.py --eval "$KHAOS_PACK" --sync --exit-code-mode ga
--format junit --output-file khaos-results.xml
artifacts:
when: always
expire_in: 30 days
paths: [khaos-results.xml]
# @khaostest job
khaos:test:
stage: khaos
image: python:3.11
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
- if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
script:
- pip install "khaos>=1.0.0,<2"
- khaos test --format junit --output-file khaos-test-results.xml
artifacts:
when: always
reports:
junit: khaos-test-results.xml
Add KHAOS_API_TOKEN as a CI/CD variable under Settings > CI/CD > Variables (masked and protected).
CircleCI
Add these jobs to your .circleci/config.yml:
version: 2.1
jobs:
khaos-ci:
docker:
- image: cimg/python:3.11
environment:
KHAOS_CI: "1"
steps:
- checkout
- run:
name: Run Khaos evaluation
command: |
pip install "khaos>=1.0.0,<2"
khaos ci path/to/agent.py \
--eval quickstart --sync --exit-code-mode ga \
--format junit --output-file khaos-results.xml \
--json-file khaos-results.json
- store_artifacts:
path: khaos-results.xml
- store_artifacts:
path: khaos-results.json
khaos-test:
docker:
- image: cimg/python:3.11
steps:
- checkout
- run:
name: Run @khaostest tests
command: |
pip install "khaos>=1.0.0,<2"
khaos test --format junit --output-file khaos-test-results.xml
- store_test_results:
path: khaos-test-results.xml
workflows:
khaos:
jobs:
- khaos-ci
- khaos-test
Other CI Systems
For any CI system (Jenkins, Buildkite, Azure Pipelines, etc.), install the CLI and use the appropriate output format:
#!/bin/bash
pip install "khaos>=1.0.0,<2"
# Run evaluation with JUnit output
khaos ci path/to/agent.py \
--eval quickstart \
--format junit --output-file results.xml \
--json-file results.json
# Run @khaostest tests separately
khaos test --format junit --output-file test-results.xml
# Exit code indicates pass/fail
exit $?
Preflight Validation
Use --preflight-only to validate credentials and connectivity before running a full evaluation. This is useful as a separate CI step to fail fast on configuration issues.
# Validate setup without running an evaluation
khaos ci my-agent --preflight-only --sync
# In CI: add as a separate step before the real run
# Step 1: Preflight
khaos ci my-agent --preflight-only --sync
# Step 2: Real evaluation
khaos ci my-agent --eval full-eval --sync
Choosing an Evaluation
Select the right evaluation for your pipeline stage:
| Evaluation | Use Case | Duration |
|---|---|---|
| quickstart | Fast smoke test for every PR | ~2 min |
| security-standard | Security-focused evaluation | ~5 min |
| full-eval | Comprehensive evaluation before release | ~10 min |
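The PR-fast / release-thorough split from the table can be encoded in a small branch check; a sketch, where the hard-coded `BRANCH` value stands in for your CI system's branch variable (e.g. `$GITHUB_REF_NAME` or `$CI_COMMIT_BRANCH`):

```shell
# Pick the evaluation pack based on the branch being built.
BRANCH="feature/example"    # stand-in; read from your CI's branch variable
if [ "$BRANCH" = "main" ]; then
  PACK="full-eval"          # comprehensive run before release
else
  PACK="quickstart"         # fast smoke test for PRs
fi
echo "$PACK"                # prints "quickstart"
```

The pack is then passed along as `khaos ci my-agent --eval "$PACK"`.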
Recommended: quickstart + khaos test on every PR for fast feedback, and full-eval on merges to main for comprehensive coverage.
Viewing Results
CI runs with --sync are automatically synced to the dashboard. After a run completes:
- Open your project in the Dashboard
- Navigate to Evaluations to see the run
- Click Compare to generate a 4-lens impact report against any previous run
- Share the comparison URL in your PR for team review
Deployment Gating
Use the gate API to block deployments when scores fall below your threshold. The endpoint returns HTTP 200 when all scores pass, or HTTP 422 with details when any score fails the gate.
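The 200/422 contract can be exercised from plain shell before wiring it into a workflow; a sketch with a canned status code standing in for a live `curl -w '%{http_code}'` call (the `check_gate` helper is illustrative, not part of the API):

```shell
# In a real pipeline the status would come from the gate endpoint, e.g.:
#   STATUS=$(curl -s -o gate.json -w '%{http_code}' \
#     "${DASHBOARD_URL}/api/runs/${RUN_ID}/gate?threshold=70" \
#     -H "x-webhook-secret: ${KHAOS_API_TOKEN}")
STATUS="422"    # canned value for illustration

check_gate() {
  # 200 = all scores passed the gate; 422 = at least one score failed
  if [ "$1" = "200" ]; then
    echo "deploy-allowed"
  else
    echo "deploy-blocked"
  fi
}

check_gate "$STATUS"   # prints "deploy-blocked"
```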
# Add to your GitHub Actions workflow
- name: Check Khaos Gate
env:
DASHBOARD_URL: ${{ vars.KHAOS_DASHBOARD_URL }}
RUN_ID: ${{ steps.khaos.outputs.run_id }}
KHAOS_API_TOKEN: ${{ secrets.KHAOS_API_TOKEN }}
run: |
GATE=$(curl -sf "${DASHBOARD_URL}/api/runs/${RUN_ID}/gate?threshold=70" \
-H "x-webhook-secret: ${KHAOS_API_TOKEN}")
echo "$GATE" | jq -e '.passed'
Use ?threshold=80 for stricter gating. The default threshold is 70 (the warning level). Scores checked: overall_score, security_score, and resilience_score.
PR Comments
Post a Khaos summary directly to your pull request. The summary endpoint returns a markdown report with scores, comparisons, and links to the full dashboard.
# Add to your GitHub Actions workflow
- name: Post Khaos Report
if: github.event_name == 'pull_request'
env:
DASHBOARD_URL: ${{ vars.KHAOS_DASHBOARD_URL }}
RUN_ID: ${{ steps.khaos.outputs.run_id }}
KHAOS_API_TOKEN: ${{ secrets.KHAOS_API_TOKEN }}
run: |
SUMMARY=$(curl -sf "${DASHBOARD_URL}/api/runs/${RUN_ID}/summary" \
-H "x-webhook-secret: ${KHAOS_API_TOKEN}")
gh pr comment ${{ github.event.pull_request.number }} --body "$SUMMARY"
Append ?format=json to get the summary wrapped in a JSON object ({ markdown: "..." }) for programmatic use.
Best Practices
- Run on every PR — Catch issues before merge
- Use quickstart + khaos test for PRs — Fast feedback (~2 min)
- Use full-eval for main — Comprehensive before release
- Use the --test flag — Include @khaostest results alongside evaluations
- Use JUnit output — Native integration with GitHub, GitLab, CircleCI, and Jenkins test reporters
- Save baselines on main — Use --save-baseline main after successful merges, --baseline main --fail-on-regression on PRs
- Compare runs in the dashboard — Generate impact reports to understand what changed
- Share comparison URLs in PRs — Give reviewers a direct link to the diff
- Validate setup with --preflight-only — Fail fast on credential issues
- Rotate tokens periodically — Generate new tokens and revoke old ones
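The baseline practice above reduces to a branch switch; a sketch that echoes the command it would run (BRANCH and the agent name are illustrative stand-ins):

```shell
# On main: record the run as the new baseline. On PRs: compare against it.
BRANCH="pr-branch"    # stand-in; read from your CI's branch variable
if [ "$BRANCH" = "main" ]; then
  CMD="khaos ci my-agent --eval full-eval --sync --save-baseline main"
else
  CMD="khaos ci my-agent --eval quickstart --baseline main --fail-on-regression"
fi
echo "$CMD"
```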