Testing API Surface (Current)
This page documents the testing API that exists in the current stable SDK. It is intentionally strict: only symbols that are exported today are shown.
Available Exports
Python
from khaos.testing import (
khaostest,
KhaosTestConfig,
AgentTestClient,
AgentResponse,
)@khaostest Decorator
@khaostest marks test functions for discovery by khaos test. You can run in declarative mode (no agent parameter) or imperative mode (with an agent parameter).
Python
from khaos.testing import khaostest
@khaostest(agent="my-agent", tags=["smoke"])
def test_declarative():
# Declarative mode: Khaos drives execution
pass
@khaostest(agent="my-agent")
def test_imperative(agent):
# Imperative mode: you call the agent directly
result = agent("What is 2+2?")
assert result.success
assert "4" in result.textKhaosTestConfig
Decorator config supports agent name, optional faults/attacks/inputs, timeouts, expectations, and tags.
Python
from khaos.testing import khaostest
@khaostest(
agent="my-agent",
faults=["timeout"],
attacks=["pi-ignore-instructions"],
inputs=["hello"],
timeout_ms=30000,
expect_blocked=None,
min_score=None,
tags=["security", "ci"],
)
def test_with_config(agent):
result = agent("hello")
assert result.successAgentResponse
Imperative agent calls return an AgentResponse dataclass.
| Field | Type | Description |
|---|---|---|
success | bool | Whether the invocation completed without agent-level error status |
text | str | Normalized output text extracted from the agent response |
raw | dict[str, Any] | Raw response payload returned by the agent |
latency_ms | float | End-to-end call latency in milliseconds |
tokens_in | int | Input token count when present in metadata |
tokens_out | int | Output token count when present in metadata |
error | str | None | Error string if invocation fails |
AgentTestClient
AgentTestClient is callable and records call history for test-time inspection.
Python
from khaos.testing import khaostest
@khaostest(agent="my-agent")
def test_history(agent):
r1 = agent("hello")
r2 = agent("what is 2+2?")
assert len(agent.call_history) == 2
assert agent.call_history[0].text == r1.text
assert agent.call_history[1].text == r2.textRunning Tests
Terminal
khaos test
khaos test tests/test_security.py
khaos test --format junit -o results.xmlSee Testing for full runner options and CI examples.
What Is Not Shipped Yet
No Enriched Helper Layer in Stable 1.0
Tri-state classifiers, screenplay assertions, paired-trial helpers, MCP scan helpers, redaction helpers, and pytest fixtures/plugins are not currently exported by
khaos.testing in this repository's stable SDK.Related Docs
- Testing Framework — primary
@khaostestusage - CLI Reference —
khaos testcommand options - Agent Decorator — configure discoverable agents