Policies

A policy says what the proxy does for one class. Every active class has exactly one active policy version at any time. Edits create new versions; the active version is “the most recent one with published_at set.”

For the full field-by-field reference, see reference/policy-schema.md.

What’s in a policy

version: 1
description: "Engineering — default safety policy"

fail_mode: closed              # default when a detector errors with no on_failure
global_timeout_ms: 5000        # per-detector timeout if a stage doesn't override
series_mode: exhaustive        # how the cascade decides termination

stages:
  - name: cheap-inline
    direction: both             # request | response | both
    detectors: [regex_pii, keyword_blocklist]
    timeout_ms: 200

  - name: hosted-scan
    direction: both
    detectors: [presidio]
    timeout_ms: 2000

detectors:
  regex_pii:
    enabled: true
    weight: 1.0
    thresholds: { flag: 0.5, block: 0.85 }
    on_failure:
      - { cause: timeout, action: continue }
      - { cause: error, action: block }

  presidio:
    enabled: true
    weight: 1.0
    thresholds: { flag: 0.5, block: 0.85 }
    parameters:
      endpoint: { secret_ref: PRESIDIO_URL }
      entities: [EMAIL_ADDRESS, US_SSN, PHONE_NUMBER]

budgets:
  cost_usd_per_day: 100.00
  cost_usd_per_month: null
  on_exceeded: { action: block }

The two key pieces

Stages

The list of stages is the cascade. Stages run serially in declared order. Detectors within a stage run in parallel.

direction decides whether a stage runs on the request, the response, or both:

direction	Runs when
`request`	Before forwarding to the LLM (pre-flight)
`response`	While streaming the LLM response back
`both`	On both the request and the response

timeout_ms caps each detector in the stage. If unset, the policy’s global_timeout_ms applies.
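
The fallback rule can be sketched in Python as follows. This is a minimal illustration, not the proxy's code: the helper name `effective_timeout_ms` is hypothetical, and only the field names `timeout_ms` and `global_timeout_ms` come from the policy body.

```python
from typing import Any, Mapping


def effective_timeout_ms(policy: Mapping[str, Any], stage: Mapping[str, Any]) -> int:
    """Return the per-detector timeout for a stage (hypothetical helper)."""
    # A stage-level timeout_ms wins; otherwise use the policy-wide default.
    stage_timeout = stage.get("timeout_ms")
    if stage_timeout is not None:
        return stage_timeout
    return policy["global_timeout_ms"]


policy = {"global_timeout_ms": 5000}
print(effective_timeout_ms(policy, {"name": "cheap-inline", "timeout_ms": 200}))  # 200
print(effective_timeout_ms(policy, {"name": "no-override"}))  # 5000
```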

A stage’s combined effect is computed by the lattice Block > Approve > Modify > Flag > Allow: the most severe effect reported by any detector in the stage wins. The first stage that combines to Block halts the cascade; the request is refused with 403 (pre-flight) or with an SSE event: error (mid-stream).
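
One way to picture the lattice and the halting rule is the Python sketch below. It is an illustration under stated assumptions, not the proxy's implementation: the names `Effect`, `combine`, and `run_cascade` are hypothetical, detector effects are passed in as precomputed lists instead of being produced by detectors running in parallel, an empty stage is assumed to combine to Allow, and the overall result is assumed to be the most severe stage effect seen so far.

```python
from enum import IntEnum
from typing import Iterable, List, Optional, Tuple


class Effect(IntEnum):
    # Ordered so that max() implements Block > Approve > Modify > Flag > Allow.
    ALLOW = 0
    FLAG = 1
    MODIFY = 2
    APPROVE = 3
    BLOCK = 4


def combine(effects: Iterable[Effect]) -> Effect:
    """Combined effect of one stage: the most severe detector effect."""
    # Assumption: a stage with no detector results combines to Allow.
    return max(effects, default=Effect.ALLOW)


def run_cascade(stages: List[Tuple[str, List[Effect]]]) -> Tuple[Effect, Optional[str]]:
    """Walk stages in declared order; stop at the first one that combines to Block.

    Returns the overall effect and the name of the halting stage, if any.
    """
    overall = Effect.ALLOW
    for name, detector_effects in stages:
        stage_effect = combine(detector_effects)
        overall = max(overall, stage_effect)
        if stage_effect == Effect.BLOCK:
            return overall, name  # later stages are never evaluated
    return overall, None


stages = [
    ("cheap-inline", [Effect.ALLOW, Effect.FLAG]),
    ("hosted-scan", [Effect.FLAG, Effect.BLOCK]),
    ("never-runs", [Effect.ALLOW]),
]
effect, halted_at = run_cascade(stages)
print(effect.name, halted_at)  # BLOCK hosted-scan
```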

Detectors

The detectors map is keyed by the detector’s name (which is what stages[].detectors[] references). For each detector you say:

enabled — flip off without removing from stages
weight — for the score combiner
thresholds — how flag / block are derived from a score (used by some detectors)
category_overrides — per-category threshold tweaks (e.g. block US_SSN at 0.5 instead of 0.85)
parameters — adapter-specific config (frozen at deployment for v1; per-policy variation is a future feature)
on_failure — see below
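
For detectors that use them, the interaction between thresholds and category_overrides might look like the Python sketch below. Several details here are assumptions and do not come from the schema: the helper name `effect_from_score`, the shape of `category_overrides` (a map from category to partial thresholds), and the choice to treat a score equal to a threshold as meeting it.

```python
from typing import Mapping, Optional


def effect_from_score(
    score: float,
    thresholds: Mapping[str, float],
    category: Optional[str] = None,
    category_overrides: Optional[Mapping[str, Mapping[str, float]]] = None,
) -> str:
    """Map a detector score to 'allow', 'flag', or 'block' (hypothetical helper)."""
    # Start from the detector-wide thresholds, then apply any per-category tweak.
    effective = dict(thresholds)
    if category is not None and category_overrides:
        effective.update(category_overrides.get(category, {}))
    # Assumption: a score at or above a threshold triggers it.
    if score >= effective["block"]:
        return "block"
    if score >= effective["flag"]:
        return "flag"
    return "allow"


thresholds = {"flag": 0.5, "block": 0.85}
overrides = {"US_SSN": {"block": 0.5}}  # block US_SSN at 0.5 instead of 0.85
print(effect_from_score(0.6, thresholds, "EMAIL_ADDRESS", overrides))  # flag
print(effect_from_score(0.6, thresholds, "US_SSN", overrides))         # block
```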

`on_failure` per detector

By default, a failed detector falls back to the policy’s global fail_mode:

fail_mode: open → failure becomes Allow (request proceeds)
fail_mode: closed → failure becomes Block (request refused)

But specific (detector, cause) pairs can override:

detectors:
  presidio:
    on_failure:
      - { cause: timeout, action: continue }   # don't fail on slow Presidio
      - { cause: error, action: block }         # but errors are critical

cause is one of timeout or error. action is one of:

continue — treat as Allow (proceed with the cascade)
flag — record a Flag step, proceed
block — refuse the request
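
The resolution order described above (a matching override first, then the global fail_mode) can be sketched in Python. The helper name `resolve_failure` and the string results are hypothetical, and the sketch assumes the first rule whose cause matches wins.

```python
from typing import Any, Mapping


def resolve_failure(policy: Mapping[str, Any], detector: str, cause: str) -> str:
    """Return 'allow', 'flag', or 'block' for a failed detector (hypothetical helper)."""
    config = policy.get("detectors", {}).get(detector, {})
    # A matching (detector, cause) override takes precedence.
    for rule in config.get("on_failure", []):
        if rule["cause"] == cause:
            # 'continue' is treated as Allow; 'flag' and 'block' map to themselves.
            return "allow" if rule["action"] == "continue" else rule["action"]
    # No override: fall back to the policy-wide fail_mode.
    return "allow" if policy["fail_mode"] == "open" else "block"


policy = {
    "fail_mode": "closed",
    "detectors": {
        "presidio": {"on_failure": [{"cause": "timeout", "action": "continue"}]},
    },
}
print(resolve_failure(policy, "presidio", "timeout"))  # allow (override)
print(resolve_failure(policy, "presidio", "error"))    # block (fail_mode: closed)
```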

Budgets

If budgets is present, the proxy checks per-class spend before forwarding upstream:

cost_usd_per_day — rolling 24h cap
cost_usd_per_month — rolling 30d cap

When a cap is hit:

action: block — refuse with PolicyBlocked, write a detector=budget step row with effect Block
action: flag — write a Flag step, proceed
action: throttle — currently degrades to Block (rate-limit machinery is a roadmap item)
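
A rough Python sketch of the pre-forwarding check follows. The function name `check_budget` and its spend arguments are hypothetical; it also assumes that a cap counts as hit once spend reaches it and that a null cap is skipped. Writing the step row is left out.

```python
from typing import Any, Mapping, Optional


def check_budget(
    budgets: Optional[Mapping[str, Any]],
    spent_last_24h: float,
    spent_last_30d: float,
) -> str:
    """Return 'allow', 'flag', or 'block' before forwarding upstream."""
    if not budgets:
        return "allow"  # no budgets section: nothing to check
    caps = [
        (budgets.get("cost_usd_per_day"), spent_last_24h),
        (budgets.get("cost_usd_per_month"), spent_last_30d),
    ]
    # Assumption: a cap is hit once spend reaches it; null caps are skipped.
    exceeded = any(cap is not None and spent >= cap for cap, spent in caps)
    if not exceeded:
        return "allow"
    action = budgets["on_exceeded"]["action"]
    # throttle currently degrades to Block.
    return "block" if action in ("block", "throttle") else "flag"


budgets = {
    "cost_usd_per_day": 100.00,
    "cost_usd_per_month": None,
    "on_exceeded": {"action": "block"},
}
print(check_budget(budgets, spent_last_24h=42.0, spent_last_30d=900.0))   # allow
print(check_budget(budgets, spent_last_24h=100.0, spent_last_30d=900.0))  # block
```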

Versions

Every edit creates a new draft (published_at = NULL). Publishing stamps published_at. The active version is the one with the latest published_at. Rollback creates a new published version with an older body, so no operation is destructive.
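
The version rules can be modelled in a few lines of Python. This is a sketch only: an in-memory list stands in for the policy table, and the helpers `active_version` and `rollback` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def active_version(versions: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """The active version is the published one with the latest published_at."""
    published = [v for v in versions if v["published_at"] is not None]
    return max(published, key=lambda v: v["published_at"], default=None)


def rollback(versions: List[Dict[str, Any]], target_version: int, now: datetime) -> Dict[str, Any]:
    """Append a new published version that copies an older body; nothing is deleted."""
    target = next(v for v in versions if v["version"] == target_version)
    new = {
        "version": max(v["version"] for v in versions) + 1,  # monotonic per class
        "body": dict(target["body"]),
        "published_at": now,
    }
    versions.append(new)
    return new


versions = [
    {"version": 1, "body": {"fail_mode": "closed"}, "published_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"version": 2, "body": {"fail_mode": "open"}, "published_at": datetime(2024, 1, 2, tzinfo=timezone.utc)},
    {"version": 3, "body": {"fail_mode": "open"}, "published_at": None},  # draft
]
print(active_version(versions)["version"])  # 2
rollback(versions, target_version=1, now=datetime(2024, 1, 3, tzinfo=timezone.utc))
print(active_version(versions)["version"], len(versions))  # 4 4
```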

See Guides → Publish and rollback for the operator flow.

What stays the same across versions

Each version has its own body (the YAML above, stored as JSONB). Some fields belong to the policy row, not the body:

id (UUID of this version)
class_id (which class this is the policy for)
version (monotonic integer per class)
published_at (null for drafts)

Edit the body. Don’t edit the metadata.

What’s not in a policy yet

Two things are designed but deferred:

Per-tenant inheritance — “workspace can only tighten, not loosen” multi-tenant overlay. Out of v1.
Per-detector parameters at runtime — the proxy currently builds detector instances at deployment with their parameters frozen. The parameters field on a DetectorConfig is reserved for the future factory-registry pattern.