Architecture overview

Quayside is small. Three things to understand:

  1. The three-layer model — class, instance, session
  2. The proxy lifecycle — how a single call flows through
  3. The service boundaries — the modules and how they compose

Three-layer model

| Layer | What it is | Cardinality | Created when |
|---|---|---|---|
| Class | The durable, registered “kind of agent”: eng/code-reviewer. Pins purpose, scope, owner, lifecycle. The unit of governance: policy attaches here. | Tens to low hundreds per org | Human-driven via the registry |
| Instance | A specific (class, principal) pair. The runtime identity. Same pair always resolves to the same instance: restart, crash, different machine, same instance. | One per user per class | Auto-claimed on first proxy call |
| Session | One conversation / runtime invocation. Tied to an audit.runs row. | Many per instance | Per call |

Class is what humans care about. Instance is what the proxy tracks. Session is what audit reports against.
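
The relationship between the layers can be sketched in a few lines of Python. This is an illustration only: the dataclass names, their fields, and the uuid5-based derivation are assumptions made for the example, not Quayside's actual schema. What it tries to show is that an instance is a pure function of the (class, principal) pair, while a session is minted fresh on every call.

```python
import uuid
from dataclasses import dataclass, field

# Hypothetical namespace for this example; not a value Quayside defines.
_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "quayside.example")


@dataclass(frozen=True)
class AgentClass:
    slug: str    # e.g. "eng/code-reviewer"
    owner: str   # policy would attach at this level


@dataclass(frozen=True)
class Instance:
    class_slug: str
    principal_id: str

    @property
    def instance_id(self) -> uuid.UUID:
        # Deterministic: the same pair yields the same ID on any machine,
        # before or after a restart.
        return uuid.uuid5(_NAMESPACE, f"{self.class_slug}|{self.principal_id}")


@dataclass(frozen=True)
class Session:
    instance_id: uuid.UUID
    # Random per call: many sessions hang off one instance.
    session_id: uuid.UUID = field(default_factory=uuid.uuid4)


def resolve_instance(class_slug: str, principal_id: str) -> Instance:
    """Stand-in for auto-claim: no lookup table is needed in this sketch."""
    return Instance(class_slug, principal_id)
```

Resolving ("eng/code-reviewer", "alice") twice gives equal instance IDs, while two Session objects built from that instance get different session IDs. A real registry would presumably persist the claim rather than derive it, so treat the derivation as a teaching device.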

A single call

Client (Claude SDK, curl, your script)
  │  POST /v1/messages
  │  x-api-key: qsk_<JWT>
  ▼
Quayside proxy
  │   1. Validate JWT → principal_id + class_slug
  │   2. Resolve class → instance (auto-claim if first time)
  │   3. Load active policy for the class
  │   4. Check budgets (per-class cost caps)
  │   5. Run request-side detector stages
  │      (parallel within stage, serial across stages)
  │   6. If Block → 403; else open audit run
  │   7. Forward to LLM provider (Anthropic, OpenAI, …)
  │   8. Stream response back, running response-side stages
  │      in a sliding window over accumulated text
  │   9. Parse token usage from the SSE stream
  │  10. Close audit run with final effect
  ▼
Audit + token-usage rows in Postgres
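
Steps 5 and 6 describe a cascade that runs detectors in parallel within a stage and serially across stages. A minimal asyncio sketch of that shape follows. The Verdict type, the make_detector helper, and the choice to skip later stages once any detector blocks are assumptions for illustration; the real adapter interface may differ.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Sequence


@dataclass(frozen=True)
class Verdict:
    detector: str
    block: bool


# Hypothetical detector signature: text in, verdict out.
Detector = Callable[[str], Awaitable[Verdict]]


async def run_cascade(stages: Sequence[Sequence[Detector]], text: str) -> list[Verdict]:
    verdicts: list[Verdict] = []
    for stage in stages:  # serial across stages
        # Parallel within a stage: all detectors of this stage run concurrently.
        results = await asyncio.gather(*(detect(text) for detect in stage))
        verdicts.extend(results)
        if any(v.block for v in results):
            break  # assumption: a block short-circuits the remaining stages
    return verdicts


def make_detector(name: str, needle: str) -> Detector:
    """Toy detector that blocks when `needle` occurs in the text."""
    async def detect(text: str) -> Verdict:
        await asyncio.sleep(0)  # yield control, as a real I/O-bound detector would
        return Verdict(name, needle in text)
    return detect


def preflight_status(verdicts: Sequence[Verdict]) -> int:
    """Step 6: a Block becomes a 403 before anything is forwarded upstream."""
    return 403 if any(v.block for v in verdicts) else 200
```

Ordering cheap detectors in early stages and expensive ones later means a request blocked by a regex never pays for a slower check, which is one plausible reason to stage the cascade this way.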

Pre-flight failures (404, 403, 422) happen before any chunk reaches the client. A mid-stream block emits an SSE “event: error” frame and then terminates the response.
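
Response-side detection over a sliding window, ending in an error frame, might look roughly like the generator below. The window size, the regex-only check, and the JSON payload of the error frame are all assumptions; only the "event: error" frame type comes from the description above.

```python
import re
from typing import Iterable, Iterator

# Hypothetical payload; the text above specifies only the event type.
ERROR_FRAME = 'event: error\ndata: {"type": "blocked"}\n\n'


def guard_stream(chunks: Iterable[str], pattern: re.Pattern, window: int = 64) -> Iterator[str]:
    """Forward chunks until `pattern` appears in the trailing window of text."""
    tail = ""
    for chunk in chunks:
        # Keep only the last `window` characters of accumulated text, so a
        # match that straddles two chunks is still visible to the detector.
        tail = (tail + chunk)[-window:]
        if pattern.search(tail):
            yield ERROR_FRAME  # mid-stream block: emit the frame...
            return             # ...and terminate the response
        yield chunk
```

Feeding it the chunks "the pass" and "word is" with the pattern "password" forwards the first chunk, then yields the error frame instead of the second. Chunks forwarded before the match completed have already reached the client, which is the inherent cost of blocking mid-stream rather than buffering the whole response.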

Services

Quayside ships as a single Python service with module boundaries that are designed to split later:

| Module | What it owns |
|---|---|
| auth | JWT validation, dev-token minting, principal extraction |
| registry | Class CRUD, instance auto-claim, lifecycle transitions, shadow log |
| policy | Policy schema (PolicyBody), versions, draft/publish/rollback |
| detectors | DetectorAdapter Protocol + shipped adapters: RegexDetector, PresidioDetector, NullDetector |
| proxy | Pipeline orchestration: identity → policy → cascade → upstream → response detection |
| audit | Run + step rows; query endpoints for the dashboard |
| tokens | Per-call usage capture; spend roll-ups; budget queries |

Every module exposes an XService Protocol, and modules talk to each other through these Protocols rather than concrete classes. When a module grows enough to need its own service, only the composition root in app.py changes; call sites stay identical.
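
To make the Protocol idea concrete, here is a small sketch using typing.Protocol. The names TokensService, InProcessTokens, RemoteTokens, Proxy, and build_proxy are hypothetical, as is the spend_usd method; the example only shows why a caller typed against a Protocol does not change when the implementation behind it is swapped.

```python
from typing import Callable, Mapping, Protocol


class TokensService(Protocol):
    """Hypothetical Protocol for the tokens module."""
    def spend_usd(self, class_slug: str) -> float: ...


class InProcessTokens:
    """Today: the module lives in the same Python process."""
    def __init__(self, spend: Mapping[str, float]) -> None:
        self._spend = dict(spend)

    def spend_usd(self, class_slug: str) -> float:
        return self._spend.get(class_slug, 0.0)


class RemoteTokens:
    """Later: a client for a split-out service (the fetch callable is a stand-in)."""
    def __init__(self, fetch: Callable[[str], float]) -> None:
        self._fetch = fetch

    def spend_usd(self, class_slug: str) -> float:
        return self._fetch(class_slug)


class Proxy:
    # Depends on the Protocol only, never on a concrete class.
    def __init__(self, tokens: TokensService, cap_usd: float) -> None:
        self._tokens = tokens
        self._cap_usd = cap_usd

    def within_budget(self, class_slug: str) -> bool:
        return self._tokens.spend_usd(class_slug) < self._cap_usd


def build_proxy(tokens: TokensService) -> Proxy:
    """Composition root: the only place that picks an implementation."""
    return Proxy(tokens, cap_usd=10.0)
```

Passing InProcessTokens or RemoteTokens to build_proxy is the whole migration in this sketch; Proxy.within_budget is untouched either way because Protocols are matched structurally, with no inheritance required.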

What you build on top

The default deployment ships enough for a real workload — the RegexDetector covers most “block these strings” needs, the PresidioDetector covers PII via Microsoft’s open-source library, and the registry/policy/audit surface is complete.

You build on top by:

  • Writing custom detectors — anything that needs your own rules, your own services, or your own LLM-judges (guide)
  • Authoring class-specific policies — most of the per-tenant config goes into policies, not code (reference)
  • Integrating with your IdP — the default dev_mode mint endpoint is for development; production wires a real OIDC provider via the auth module’s settings
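
As a rough idea of what a custom detector could look like, the sketch below defines a stand-in Protocol and one implementation. The DetectorAdapter shape shown here (a name attribute and a detect method returning Finding objects) is an assumption, not the signature Quayside ships, and the SEC-prefixed ticket format is invented for the example; consult the detector guide for the real interface.

```python
import re
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Finding:
    label: str
    start: int  # offset of the match in the inspected text
    end: int


class DetectorAdapter(Protocol):
    """Hypothetical stand-in for the real Protocol of the same name."""
    name: str

    def detect(self, text: str) -> list[Finding]: ...


class TicketIdDetector:
    """Flags internal ticket IDs such as SEC-12345 (an invented format)."""
    name = "ticket-id"
    _pattern = re.compile(r"\bSEC-\d{4,}\b")

    def detect(self, text: str) -> list[Finding]:
        return [
            Finding(self.name, m.start(), m.end())
            for m in self._pattern.finditer(text)
        ]
```

Because the Protocol is structural, TicketIdDetector needs no base class; anything with a matching name and detect method would satisfy it. A detector backed by an internal service or an LLM judge would keep the same outline and replace the regex with the call it needs.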

Where this fits

Quayside is a proxy-pattern control plane: it sits in the call path. That’s a different shape from an observability tool (which inspects after the fact) or an SDK (which sits inside agent code). The trade-off: Quayside must be in the call path to enforce anything, but in exchange a single control plane attaches audit, budget, and policy to every call your agents make.