Architecture overview
Quayside is small. Three things to understand:
- The three-layer model — class, instance, session
- The proxy lifecycle — how a single call flows through
- The service boundaries — the modules and how they compose
Three-layer model
| Layer | What it is | Cardinality | Created when |
|---|---|---|---|
| Class | The durable, registered “kind of agent”: eng/code-reviewer. Pins purpose, scope, owner, lifecycle. The unit of governance — policy attaches here. | tens to low hundreds per org | Human-driven via the registry |
| Instance | A specific (class, principal) pair. The runtime identity. Same pair always resolves to the same instance — restart, crash, different machine, same instance. | one per user per class | Auto-claimed on first proxy call |
| Session | One conversation / runtime invocation. Tied to an audit.runs row. | many per instance | Per call |
Class is what humans care about. Instance is what the proxy tracks. Session is what audit reports against.
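The three layers can be sketched as plain data types. This is an illustrative model only: the field names, the `claim_instance` and `open_session` helpers, and the choice to derive the instance ID from a UUIDv5 hash of the (class, principal) pair are all assumptions, not Quayside's actual schema. A real registry could equally enforce "same pair, same instance" with a unique constraint in the database.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentClass:
    slug: str   # e.g. "eng/code-reviewer"
    owner: str  # who is accountable for this kind of agent


@dataclass(frozen=True)
class Instance:
    id: uuid.UUID
    class_slug: str
    principal_id: str


@dataclass(frozen=True)
class Session:
    id: uuid.UUID
    instance_id: uuid.UUID


# Hypothetical namespace, used only to make the derived IDs stable.
_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "quayside.example")


def claim_instance(class_slug: str, principal_id: str) -> Instance:
    """Resolve a (class, principal) pair to its instance.

    The ID is a pure function of the pair, so a restart, a crash, or a
    different machine all resolve to the same instance.
    """
    key = f"{class_slug}\x00{principal_id}"
    return Instance(uuid.uuid5(_NAMESPACE, key), class_slug, principal_id)


def open_session(instance: Instance) -> Session:
    """Every call gets a fresh session under the same instance."""
    return Session(uuid.uuid4(), instance.id)
```

Under these assumptions, claiming twice for the same user and class returns equal instances, while each call to `open_session` yields a new session ID that points back at the same instance.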
A single call
    Client (Claude SDK, curl, your script)
      │  POST /v1/messages
      │  x-api-key: qsk_<JWT>
      ▼
    Quayside proxy
      │  1. Validate JWT → principal_id + class_slug
      │  2. Resolve class → instance (auto-claim if first time)
      │  3. Load active policy for the class
      │  4. Check budgets (per-class cost caps)
      │  5. Run request-side detector stages
      │     (parallel within stage, serial across stages)
      │  6. If Block → 403; else open audit run
      │  7. Forward to LLM provider (Anthropic, OpenAI, …)
      │  8. Stream response back, running response-side stages
      │     in a sliding window over accumulated text
      │  9. Parse token usage from the SSE stream
      │ 10. Close audit run with final effect
      ▼
    Audit + Token-usage rows in Postgres

Pre-flight failures (404, 403, 422) happen before any chunk reaches the client. Mid-stream blocks emit an event: error SSE frame and terminate the response.
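Steps 5 and 8 are the least obvious parts of the flow, so here is a minimal sketch of both. It assumes a detector is just an async function that returns `True` to block; `run_stages`, `scan_stream`, `keyword`, the 64-character window, and the `ERROR_FRAME` payload are all hypothetical names and values chosen for illustration, not the proxy's real interfaces.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Iterable, Sequence

# Assumption: a detector takes text and returns True when it wants to block.
Detector = Callable[[str], Awaitable[bool]]

# Placeholder frame; the real payload format is not specified here.
ERROR_FRAME = "event: error\ndata: {}\n\n"


async def run_stages(stages: Sequence[Sequence[Detector]], text: str) -> bool:
    """Serial across stages, parallel within a stage."""
    for stage in stages:
        verdicts = await asyncio.gather(*(detect(text) for detect in stage))
        if any(verdicts):
            return True  # a block short-circuits: later stages never run
    return False


async def scan_stream(
    chunks: Iterable[str],
    stages: Sequence[Sequence[Detector]],
    window: int = 64,
) -> AsyncIterator[str]:
    """Forward chunks while checking a sliding window of recent text."""
    tail = ""
    for chunk in chunks:
        # Keep only the last `window` characters of the accumulated text,
        # so a match that spans a chunk boundary is still visible.
        tail = (tail + chunk)[-window:]
        if await run_stages(stages, tail):
            yield ERROR_FRAME
            return  # terminate the response mid-stream
        yield chunk


def keyword(word: str) -> Detector:
    """Toy detector that blocks when `word` appears in the text."""
    async def detect(text: str) -> bool:
        return word in text
    return detect
```

With `stages = [[keyword("secret"), keyword("password")], [keyword("token")]]`, the two first-stage detectors run concurrently and the second stage runs only if neither blocks. Note the limit of the window approach in this sketch: a pattern longer than `window` characters can slip through, and chunks forwarded before the match completes have already reached the client.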
Services
Quayside ships as a single Python service with module boundaries that are designed to split later:
| Module | What it owns |
|---|---|
| auth | JWT validation, dev-token minting, principal extraction |
| registry | Class CRUD, instance auto-claim, lifecycle transitions, shadow log |
| policy | Policy schema (PolicyBody), versions, draft/publish/rollback |
| detectors | DetectorAdapter Protocol + shipped adapters: RegexDetector, PresidioDetector, NullDetector |
| proxy | Pipeline orchestration: identity → policy → cascade → upstream → response detection |
| audit | Run + step rows; query endpoints for the dashboard |
| tokens | Per-call usage capture; spend roll-ups; budget queries |
Every module exposes an XService Protocol, and modules talk to each other through these Protocols rather than through concrete classes. When a module grows enough to need its own service, only the composition root in app.py changes; call sites stay identical.
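The Protocol-plus-composition-root idea can be sketched as follows. `PolicyService`, `InProcessPolicyService`, `RemotePolicyService`, `Proxy`, `build_app`, and the `"effect"` key are illustrative names only; the actual method signatures in Quayside may differ.

```python
from typing import Callable, Protocol


class PolicyService(Protocol):
    """What callers depend on: a shape, not a concrete class."""

    def active_policy(self, class_slug: str) -> dict: ...


class InProcessPolicyService:
    """Today: the policy module lives in the same process."""

    def __init__(self, policies: dict[str, dict]) -> None:
        self._policies = policies

    def active_policy(self, class_slug: str) -> dict:
        return self._policies[class_slug]


class RemotePolicyService:
    """Later: a split-out service. `fetch` stands in for a network call."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch

    def active_policy(self, class_slug: str) -> dict:
        return self._fetch(class_slug)


class Proxy:
    """A call site: it only knows about the Protocol."""

    def __init__(self, policy: PolicyService) -> None:
        self._policy = policy

    def handle(self, class_slug: str) -> str:
        return self._policy.active_policy(class_slug)["effect"]


def build_app(policy: PolicyService) -> Proxy:
    """Composition root: the one place that picks concrete classes."""
    return Proxy(policy)
```

Swapping `InProcessPolicyService` for `RemotePolicyService` changes only the argument passed to `build_app`; `Proxy.handle` is untouched, which is the property the paragraph above describes.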
What you build on top
The default deployment ships enough for a real workload — the RegexDetector covers most “block these strings” needs, the PresidioDetector covers PII via Microsoft’s open-source library, and the registry/policy/audit surface is complete.
You build on top by:
- Writing custom detectors — anything that needs your own rules, your own services, or your own LLM-judges (guide)
- Authoring class-specific policies — most of the per-tenant config goes into policies, not code (reference)
- Integrating with your IdP: the default dev_mode mint endpoint is for development; production wires a real OIDC provider via the auth module’s settings
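To give a feel for the first option, here is a rough sketch of a custom detector. The real DetectorAdapter Protocol is defined in the detectors module and its signature is not reproduced here, so the `DetectorAdapter` shape below, the `Verdict` type, and the `TicketIdDetector` rule are assumptions made up for this example; check the guide for the actual interface.

```python
import re
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Verdict:
    effect: str       # "allow" or "block" (hypothetical values)
    reason: str = ""


class DetectorAdapter(Protocol):
    """Assumed shape only; the shipped Protocol may look different."""

    name: str

    def detect(self, text: str) -> Verdict: ...


class TicketIdDetector:
    """Blocks text that mentions an internal security ticket ID."""

    name = "ticket-id"
    _pattern = re.compile(r"\bSEC-\d{4,}\b")

    def detect(self, text: str) -> Verdict:
        match = self._pattern.search(text)
        if match:
            return Verdict("block", f"matched {match.group(0)}")
        return Verdict("allow")
```

Because the adapter is a Protocol, `TicketIdDetector` does not need to inherit from anything; it only has to match the expected attributes and methods.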
Where this fits
Quayside is a proxy-pattern control plane: it sits in the call path. That is a different shape from an observability tool (which inspects after the fact) or an SDK (which sits inside agent code). The trade-off: Quayside has to be in the call path to enforce anything, but in return a single control plane attaches audit, budget, and policy to every call your agents make.