Architecture overview

Quayside is small. Three things to understand:

The three-layer model — class, instance, session
The proxy lifecycle — how a single call flows through
The service boundaries — the modules and how they compose

Three-layer model

Layer	What it is	Cardinality	Created when
Class	The durable, registered “kind of agent”: `eng/code-reviewer`. Pins purpose, scope, owner, lifecycle. The unit of governance — policy attaches here.	tens to low hundreds per org	Human-driven via the registry
Instance	A specific `(class, principal)` pair. The runtime identity. Same pair always resolves to the same instance — restart, crash, different machine, same instance.	one per user per class	Auto-claimed on first proxy call
Session	One conversation / runtime invocation. Tied to an `audit.runs` row.	many per instance	Per call

Class is what humans care about. Instance is what the proxy tracks. Session is what audit reports against.
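
The relationship between the three layers can be sketched with a few dataclasses. This is an illustrative model only, not Quayside's schema: the field names, the `QUAYSIDE_NS` namespace, and the use of `uuid5` to derive a stable instance id are all assumptions. The source only guarantees that the same `(class, principal)` pair always resolves to the same instance; a real implementation might enforce that with a unique database constraint instead.

```python
import uuid
from dataclasses import dataclass, field

# Hypothetical fixed namespace; any constant UUID works as long as it never changes.
QUAYSIDE_NS = uuid.UUID("00000000-0000-0000-0000-000000000001")


@dataclass(frozen=True)
class AgentClass:
    """The durable, registered kind of agent. Policy attaches here."""
    slug: str   # e.g. "eng/code-reviewer"
    owner: str


@dataclass(frozen=True)
class Instance:
    """A (class, principal) pair: the runtime identity."""
    class_slug: str
    principal_id: str

    @property
    def instance_id(self) -> uuid.UUID:
        # Deterministic (assumed scheme): the same pair yields the same id
        # across restarts, crashes, and machines.
        return uuid.uuid5(QUAYSIDE_NS, f"{self.class_slug}\x00{self.principal_id}")


@dataclass
class Session:
    """One conversation; run_id stands in for the audit.runs row key."""
    instance: Instance
    run_id: uuid.UUID = field(default_factory=uuid.uuid4)
```

Two sessions opened by the same principal against the same class share one `instance_id` but get distinct `run_id` values, which mirrors the "one per user per class" and "many per instance" cardinalities in the table.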

A single call

   Client (Claude SDK, curl, your script)
       │  POST /v1/messages
       │  x-api-key: qsk_<JWT>
       ▼
   Quayside proxy
       │  1. Validate JWT → principal_id + class_slug
       │  2. Resolve class → instance (auto-claim if first time)
       │  3. Load active policy for the class
       │  4. Check budgets (per-class cost caps)
       │  5. Run request-side detector stages
       │     (parallel within stage, serial across stages)
       │  6. If Block → 403; else open audit run
       │  7. Forward to LLM provider (Anthropic, OpenAI, …)
       │  8. Stream response back, running response-side stages
       │     in a sliding window over accumulated text
       │  9. Parse token usage from the SSE stream
       │ 10. Close audit run with final effect
       ▼
   Audit + Token-usage rows in Postgres
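
Step 5 runs detectors in parallel within a stage and serially across stages. A minimal sketch of that scheduling with `asyncio` follows. The `Verdict` type, the `"allow"`/`"block"` effect strings, and the `keyword_detector` helper are hypothetical names used for illustration; Quayside's own detector interface is not shown here.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Sequence


@dataclass(frozen=True)
class Verdict:
    effect: str         # "allow" or "block" (hypothetical effect names)
    detector: str = ""  # which detector produced the verdict


Detector = Callable[[str], Awaitable[Verdict]]


async def run_cascade(stages: Sequence[Sequence[Detector]], text: str) -> Verdict:
    """Run every detector in a stage concurrently; stop at the first blocking stage."""
    for stage in stages:  # serial across stages
        verdicts = await asyncio.gather(*(d(text) for d in stage))  # parallel within a stage
        for verdict in verdicts:
            if verdict.effect == "block":
                return verdict  # later stages never run
    return Verdict("allow")


def keyword_detector(name: str, needle: str) -> Detector:
    """Toy detector that blocks when `needle` appears in the text."""
    async def detect(text: str) -> Verdict:
        await asyncio.sleep(0)  # yield to the event loop, as a real I/O-bound detector would
        return Verdict("block", name) if needle in text else Verdict("allow", name)
    return detect
```

Ordering stages from cheap to expensive means a fast regex stage can reject a request before a slower stage is ever invoked.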

Pre-flight failures (404, 403, 422) happen before any chunk reaches the client. Mid-stream blocks emit an `event: error` SSE frame and terminate the response.
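
The response-side behaviour in step 8 can be sketched as a generator that keeps a bounded tail of the accumulated text, so a match that straddles two chunks is still caught. This is a simplified illustration under stated assumptions: the `guard_stream` and `sse_frame` names, the `delta` event name, the payload shapes, and the 64-character window are all hypothetical, and a real proxy would relay the provider's own SSE frames rather than re-encode them.

```python
import json
import re
from typing import Iterable, Iterator


def sse_frame(event: str, data: dict) -> str:
    """Encode one server-sent event frame (event line, data line, blank line)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def guard_stream(chunks: Iterable[str], pattern: re.Pattern, window: int = 64) -> Iterator[str]:
    """Forward chunks until `pattern` matches the sliding window, then emit an error frame."""
    tail = ""
    for chunk in chunks:
        # Keep only the last `window` characters of everything seen so far.
        tail = (tail + chunk)[-window:]
        if pattern.search(tail):
            yield sse_frame("error", {"type": "policy_block"})
            return  # terminate the response; the matching chunk is not forwarded
        yield sse_frame("delta", {"text": chunk})
```

Chunks forwarded before the match completes have already reached the client, which is why a mid-stream block can only cut the response short rather than return a clean pre-flight status code.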

Services

Quayside ships as a single Python service with module boundaries that are designed to split later:

Module	What it owns
`auth`	JWT validation, dev-token minting, principal extraction
`registry`	Class CRUD, instance auto-claim, lifecycle transitions, shadow log
`policy`	Policy schema (`PolicyBody`), versions, draft/publish/rollback
`detectors`	`DetectorAdapter` Protocol + shipped adapters: `RegexDetector`, `PresidioDetector`, `NullDetector`
`proxy`	Pipeline orchestration: identity → policy → cascade → upstream → response detection
`audit`	Run + step rows; query endpoints for the dashboard
`tokens`	Per-call usage capture; spend roll-ups; budget queries

Every module exposes an `XService` Protocol. Modules talk to each other through these Protocols, not concrete classes, so when a module grows to need its own service, only the composition root in `app.py` changes. Call sites stay identical.
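
A small sketch of why Protocol-based boundaries keep call sites stable. Everything here is hypothetical: the `PolicyService` method name, the two implementations, and `build_app` are illustrative stand-ins, not Quayside's actual interfaces.

```python
from typing import Callable, Protocol


class PolicyService(Protocol):
    """Hypothetical module boundary: what the proxy needs from the policy module."""
    def active_policy(self, class_slug: str) -> dict: ...


class InProcessPolicyService:
    """Today: the policy module lives in the same process."""
    def __init__(self, policies: dict[str, dict]) -> None:
        self._policies = policies

    def active_policy(self, class_slug: str) -> dict:
        return self._policies[class_slug]


class RemotePolicyService:
    """Later: the module is split out; `fetch` stands in for an HTTP client call."""
    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch

    def active_policy(self, class_slug: str) -> dict:
        return self._fetch(class_slug)


class Proxy:
    """Call site: depends only on the Protocol, never on a concrete class."""
    def __init__(self, policy: PolicyService) -> None:
        self._policy = policy

    def handle(self, class_slug: str) -> int:
        return self._policy.active_policy(class_slug)["version"]


def build_app(policy: PolicyService) -> Proxy:
    """Composition root: the only place that picks an implementation."""
    return Proxy(policy)
```

Swapping `InProcessPolicyService` for `RemotePolicyService` touches only the argument passed to `build_app`; `Proxy.handle` is unchanged.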

What you build on top

The default deployment ships enough for a real workload: the `RegexDetector` covers most “block these strings” needs, the `PresidioDetector` covers PII via Microsoft’s open-source Presidio library, and the registry/policy/audit surface is complete.

You build on top by:

Writing custom detectors: anything that needs your own rules, your own services, or your own LLM-judges (guide)
Authoring class-specific policies: most of the per-tenant config goes into policies, not code (reference)
Integrating with your IdP: the default `dev_mode` mint endpoint is for development; production wires a real OIDC provider via the `auth` module’s settings
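
As a rough idea of what a custom detector might look like, the sketch below defines a rule-based detector against a stand-in interface. The real `DetectorAdapter` Protocol's method names and return type are not shown on this page, so `DetectorLike`, `Finding`, and `TicketIdDetector` are all hypothetical; consult the detector guide for the actual signature.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol, runtime_checkable


@dataclass(frozen=True)
class Finding:
    effect: str       # "allow" or "block" (hypothetical effect names)
    reason: str = ""


@runtime_checkable
class DetectorLike(Protocol):
    """Hypothetical stand-in; NOT the real DetectorAdapter signature."""
    name: str

    def detect(self, text: str) -> Finding: ...


class TicketIdDetector:
    """Blocks text that mentions internal ticket ids such as SEC-1234."""
    name = "ticket-id"

    def __init__(self, prefixes: Iterable[str]) -> None:
        self._prefixes = tuple(prefixes)

    def detect(self, text: str) -> Finding:
        for word in text.split():
            # str.startswith accepts a tuple of candidate prefixes.
            if word.startswith(self._prefixes):
                return Finding("block", f"internal ticket id: {word}")
        return Finding("allow")
```

Because the interface is structural, `TicketIdDetector` does not need to inherit from anything; matching the Protocol's attributes is enough.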

Where this fits

Quayside is a proxy-pattern control plane: it sits in the call path. That’s a different shape from an observability tool (which inspects after the fact) or an SDK (which sits inside agent code). The trade-off: Quayside has to be in the call path to enforce anything, but in exchange you get a single control plane that attaches audit, budget, and policy to every call your agents make.