ADR-0010: Spotter agent — tool surface, confirm-on-write, per-org cost cap
Status: Accepted (Phase 1 shipped per FIT-161)
Date: ~2026-04 (FIT-161 milestone)
Context owner: Owner
Context
Coaches and studio owners do a lot of repetitive admin: creating workouts, booking clients, adjusting plans, checking compliance status, drafting messages. An LLM agent that can drive the API on their behalf is high-leverage — IF the safety, cost, and trust model is right.
We picked Claude as the model (Anthropic SDK), and committed to building the agent in-app rather than handing it off to a third-party automation platform.
Decision
The Spotter agent lives in apps/api/src/ai/agent/. Key architectural choices:
Tools as a 1:1 mirror of services
Each tool calls the same service code path as the equivalent HTTP route. Tools are grouped into routers under apps/api/src/ai/agent/tools/ (e.g., workouts/, members/, bookings/, forms/, messages/, class_sessions/). The Phase 1 (FIT-161) brief scoped 56 tools across 8 routers; Phase 2 (FIT-162) extends the surface.
Schemas in libs/shared/src/lib/agent-schemas/. The agent sees the same Zod schemas the HTTP layer uses, plus a category field per tool (read / write / destructive) that drives UX behavior.
As of 2026-05-28 the surface is 63 tools across 11 routers (the original FIT-161 brief cited 56/8 — Phase 1 expanded during shipping). features/spotter-agent/README.md carries the live count.
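A minimal sketch of the mirror idea, under stated assumptions: the tool's handler is the same service function the HTTP route would call, and the tool carries a category field. Every name here (AgentTool, createWorkoutService, parseCreateWorkout, invokeTool) is hypothetical, and the hand-written parser is only a stand-in for the shared Zod schema so the example runs without dependencies.

```typescript
// Only the read / write / destructive categories come from the ADR;
// all other names are illustrative.
type ToolCategory = "read" | "write" | "destructive";

interface CreateWorkoutInput {
  orgId: string;
  title: string;
}

interface Workout extends CreateWorkoutInput {
  id: string;
}

// Stand-in for the shared service. The HTTP route handler would call this too.
const workoutStore: Workout[] = [];
function createWorkoutService(input: CreateWorkoutInput): Workout {
  const workout: Workout = { id: `w_${workoutStore.length + 1}`, ...input };
  workoutStore.push(workout);
  return workout;
}

// Stand-in for the shared Zod schema: validates untrusted input from the model.
function parseCreateWorkout(raw: unknown): CreateWorkoutInput {
  const candidate = raw as Partial<CreateWorkoutInput> | null;
  if (!candidate || typeof candidate.orgId !== "string" || typeof candidate.title !== "string") {
    throw new Error("invalid workouts.create input");
  }
  return { orgId: candidate.orgId, title: candidate.title };
}

interface AgentTool<I, O> {
  router: string;
  action: string;
  category: ToolCategory; // drives the UX behavior (cards, confirmation)
  parse: (raw: unknown) => I;
  run: (input: I) => O;
}

const createWorkoutTool: AgentTool<CreateWorkoutInput, Workout> = {
  router: "workouts",
  action: "create",
  category: "write",
  parse: parseCreateWorkout,
  run: createWorkoutService, // same code path as the HTTP route
};

// Validate first, then delegate to the service; the tool adds no logic of its own.
function invokeTool<I, O>(tool: AgentTool<I, O>, raw: unknown): O {
  return tool.run(tool.parse(raw));
}
```

Because the tool holds a reference to the service rather than a copy of its logic, a fix in the service reaches both the HTTP route and the agent at once.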
Undo-on-write, confirm-on-destructive
- Read tools execute immediately.
- Write tools execute immediately but emit a “what I did” card with a 15s undo button (where the tool has an inverse).
- Destructive tools (delete, cancel, suspend) show a confirmation card BEFORE execution. The user clicks confirm; only then does the tool actually run.
A bulk-confirmation card for multi-destructive turns is on the Phase 2 backlog (FIT-162).
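The three rules above can be sketched as a single dispatch on the tool category. This is an illustration only: ExecutionPlan, planExecution, and canUndo are hypothetical names, and the only values taken from the ADR are the three categories and the 15-second undo window.

```typescript
type ToolCategory = "read" | "write" | "destructive";

type ExecutionPlan =
  | { kind: "execute" }                                         // read: run immediately
  | { kind: "execute_with_card"; undoWindowMs: number | null }  // write: run, then show a card
  | { kind: "await_confirmation" };                             // destructive: ask first

const UNDO_WINDOW_MS = 15000; // 15s undo button, per the ADR

function planExecution(category: ToolCategory, hasInverse: boolean): ExecutionPlan {
  switch (category) {
    case "read":
      return { kind: "execute" };
    case "write":
      // The undo button is only offered where the tool has an inverse.
      return { kind: "execute_with_card", undoWindowMs: hasInverse ? UNDO_WINDOW_MS : null };
    case "destructive":
      // Nothing runs until the user clicks confirm.
      return { kind: "await_confirmation" };
  }
}

// True while the undo button for an executed write should still be active.
function canUndo(executedAtMs: number, nowMs: number, plan: ExecutionPlan): boolean {
  return (
    plan.kind === "execute_with_card" &&
    plan.undoWindowMs !== null &&
    nowMs - executedAtMs <= plan.undoWindowMs
  );
}
```

Keeping the decision in one pure function means the UI and the executor cannot disagree about whether a given tool call needs confirmation.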
Per-org daily $ cost cap
Each org has a daily cap on token spend, tracked in dollars. Hitting it returns a structured response (not an error) with an upsell, and PostHog tracks cap-hit events. The cap exists because LLM tokens cost real money and an automated workflow could otherwise drain the budget in seconds.
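One way to picture the cap check is as a pre-turn gate that returns a result value instead of throwing. The sketch below is in-memory and hypothetical throughout: DailyCostCap, CapDecision, the "upgrade_plan" upsell code, and the UTC day boundary are all assumptions, not the shipped implementation.

```typescript
interface OrgSpend {
  dayKey: string;   // e.g. "2026-05-28"
  spentUsd: number;
}

// A structured result, not an exception: the caller can render the upsell.
type CapDecision =
  | { allowed: true; remainingUsd: number }
  | { allowed: false; reason: "daily_cap_hit"; capUsd: number; upsell: string };

class DailyCostCap {
  private spendByOrg = new Map<string, OrgSpend>();
  private capUsdByOrg: Map<string, number>;
  private defaultCapUsd: number;

  constructor(capUsdByOrg: Map<string, number>, defaultCapUsd: number) {
    this.capUsdByOrg = capUsdByOrg;
    this.defaultCapUsd = defaultCapUsd;
  }

  // Returns today's spend record, resetting it when the day rolls over.
  private currentSpend(orgId: string, now: Date): OrgSpend {
    const dayKey = now.toISOString().slice(0, 10); // assumption: days are cut in UTC
    const existing = this.spendByOrg.get(orgId);
    if (existing && existing.dayKey === dayKey) return existing;
    const fresh: OrgSpend = { dayKey, spentUsd: 0 };
    this.spendByOrg.set(orgId, fresh);
    return fresh;
  }

  // Called before a turn starts.
  check(orgId: string, now: Date): CapDecision {
    const capUsd = this.capUsdByOrg.get(orgId) ?? this.defaultCapUsd;
    const spend = this.currentSpend(orgId, now);
    if (spend.spentUsd >= capUsd) {
      return { allowed: false, reason: "daily_cap_hit", capUsd, upsell: "upgrade_plan" };
    }
    return { allowed: true, remainingUsd: capUsd - spend.spentUsd };
  }

  // Called after a turn completes, with that turn's cost.
  record(orgId: string, costUsd: number, now: Date): void {
    this.currentSpend(orgId, now).spentUsd += costUsd;
  }
}
```

Note that a gate checked only before each turn lets the last turn overshoot the cap by at most one turn's cost; whether that is acceptable is a product decision this sketch does not settle.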
Prompt caching architecture
Anthropic prompt caching is configured with:
- 1-hour TTL on stable blocks (system prompt cookbook, tool definitions). These rarely change.
- A 4th breakpoint on the message tail for conversation context within a session.
PostHog tracks per-turn cost, cache-hit ratio, and storage growth so we can tune.
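As a rough sketch of how four breakpoints might be laid out, the function below builds the cacheable parts of a Messages API request as plain objects. The cache_control shape ({ type: "ephemeral", ttl: "1h" }) follows Anthropic's prompt caching documentation as best understood here and should be checked against the SDK version in use. Where the three stable breakpoints sit (last tool definition, base system prompt, cookbook) is an assumption, as are the names buildCachedRequest, ToolDef, and ChatMessage.

```typescript
type CacheControl = { type: "ephemeral"; ttl?: "5m" | "1h" };

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

interface ToolDef {
  name: string;
  description: string;
  input_schema: { type: "object"; properties: Record<string, unknown> };
  cache_control?: CacheControl;
}

interface ChatMessage {
  role: "user" | "assistant";
  content: TextBlock[];
}

interface CachedRequestParts {
  tools: ToolDef[];
  system: TextBlock[];
  messages: ChatMessage[];
}

function buildCachedRequest(
  basePrompt: string,
  cookbook: string,
  tools: ToolDef[],
  messages: ChatMessage[],
): CachedRequestParts {
  const oneHour: CacheControl = { type: "ephemeral", ttl: "1h" };
  const tailBreakpoint: CacheControl = { type: "ephemeral" }; // default (short) TTL

  // Breakpoint 1: marking the last tool caches the whole tool list before it.
  const cachedTools = tools.map((tool, i): ToolDef =>
    i === tools.length - 1 ? { ...tool, cache_control: oneHour } : tool,
  );

  // Breakpoints 2 and 3: the stable system blocks.
  const system: TextBlock[] = [
    { type: "text", text: basePrompt, cache_control: oneHour },
    { type: "text", text: cookbook, cache_control: oneHour },
  ];

  // Breakpoint 4: the final block of the final message (the conversation tail).
  const cachedMessages = messages.map((message, i): ChatMessage => {
    if (i !== messages.length - 1) return message;
    const content = message.content.map((block, j): TextBlock =>
      j === message.content.length - 1 ? { ...block, cache_control: tailBreakpoint } : block,
    );
    return { ...message, content };
  });

  return { tools: cachedTools, system, messages: cachedMessages };
}
```

The returned parts would be spread into the request alongside the model and token limit. The longer-lived breakpoints come first in the request (tools, then system, then messages), which matches the ordering the caching documentation expects when TTLs are mixed.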
Audit + inverse plumbing
Every write tool records an agent_tool_call row. Tools with snapshot-restore inverses (Phase 2 work, FIT-162) snapshot prior state before write, making the undo durable.
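The snapshot-then-write order can be illustrated with an in-memory stand-in for the agent_tool_call table. Only the table name comes from the ADR; the row columns, the "plans" router, and the functions updatePlanWeeks and undoToolCall are hypothetical, and a real implementation would do the snapshot and the write in one database transaction.

```typescript
interface PlanRecord {
  id: string;
  weeks: number;
}

// Hypothetical shape of an agent_tool_call row.
interface AgentToolCallRow {
  id: number;
  router: string;
  action: string;
  targetId: string;
  snapshot: string | null; // JSON of the prior state, null when no inverse exists
  undone: boolean;
}

const plans = new Map<string, PlanRecord>();
const toolCallLog: AgentToolCallRow[] = [];

// Write tool: snapshot the prior state, record the audit row, then write.
function updatePlanWeeks(planId: string, weeks: number): AgentToolCallRow {
  const prior = plans.get(planId);
  if (!prior) throw new Error(`unknown plan ${planId}`);
  const row: AgentToolCallRow = {
    id: toolCallLog.length + 1,
    router: "plans",
    action: "update",
    targetId: planId,
    snapshot: JSON.stringify(prior),
    undone: false,
  };
  toolCallLog.push(row);
  plans.set(planId, { ...prior, weeks });
  return row;
}

// Inverse: restore the snapshot. Returns false if there is nothing to undo.
function undoToolCall(rowId: number): boolean {
  const row = toolCallLog.find((r) => r.id === rowId);
  if (!row || row.undone || row.snapshot === null) return false;
  plans.set(row.targetId, JSON.parse(row.snapshot) as PlanRecord);
  row.undone = true;
  return true;
}
```

Because the snapshot lives in the audit row rather than in process memory, the undo survives a restart, which is what makes it durable.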
Observability
PostHog events per turn:
- agent.turn.completed: model, tokens, cost, cache hit, duration.
- agent.tool.executed: router, action, outcome.
- agent.daily_cap.hit: org, cap value.
Plus storage growth metrics for the agent_* tables.
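A typed union over the three events keeps payloads from drifting between call sites. In the sketch below the event names come from this ADR, while the property names, the TurnUsage shape, and the captureAgentEvent sink are assumptions; the sink stands in for whatever PostHog client wrapper the API actually uses. The cache-hit ratio is computed here as cached input tokens over all input tokens, which is one reasonable definition rather than a confirmed one.

```typescript
interface TurnUsage {
  inputTokens: number;      // uncached input tokens
  cacheReadTokens: number;  // input tokens served from the prompt cache
  cacheWriteTokens: number; // input tokens written to the cache this turn
  outputTokens: number;
}

// Share of input tokens that were served from cache (0 when there was no input).
function cacheHitRatio(usage: TurnUsage): number {
  const totalInput = usage.inputTokens + usage.cacheReadTokens + usage.cacheWriteTokens;
  return totalInput === 0 ? 0 : usage.cacheReadTokens / totalInput;
}

type AgentEvent =
  | {
      event: "agent.turn.completed";
      properties: TurnUsage & { model: string; costUsd: number; cacheHitRatio: number; durationMs: number };
    }
  | { event: "agent.tool.executed"; properties: { router: string; action: string; outcome: "ok" | "error" } }
  | { event: "agent.daily_cap.hit"; properties: { orgId: string; capUsd: number } };

// In-memory sink; a real one would forward to the analytics client.
const capturedEvents: Array<{ distinctId: string; payload: AgentEvent }> = [];
function captureAgentEvent(distinctId: string, payload: AgentEvent): void {
  capturedEvents.push({ distinctId, payload });
}

function turnCompletedEvent(model: string, usage: TurnUsage, costUsd: number, durationMs: number): AgentEvent {
  return {
    event: "agent.turn.completed",
    properties: { model, ...usage, costUsd, cacheHitRatio: cacheHitRatio(usage), durationMs },
  };
}
```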
Consequences
Positive
- The agent uses the same auth, the same Zod schemas, the same services as the HTTP layer. Bug in the API → bug in the agent (and vice versa) — they fail together. No drift.
- Confirmation flow keeps trust high for destructive ops without making read flows annoying.
- Cost cap is visible to the operator and predictable per org.
- New tools are cheap to add (one schema + one service call + an enum entry + i18n labels).
Negative
- Tool surface grows with the API surface. Phase 2 adds bookings-for-others, member writes, workouts.get, member history, etc.
- The agent can do things the user could do “by hand” — including things they shouldn’t, if permissions on the underlying service are misconfigured. Tools inherit the user’s permissions, but we still need careful destructive-action UX.
- Snapshot-restore for undo is harder than it looks for cross-table writes (e.g., a workout edit that touches sections + movements). Phase 1 doesn’t have it on every write tool.
Discipline
For every new tool:
- Tool registered in apps/api/src/ai/agent/tools/leaves/ with proper audit + confirm + inverse (where applicable).
- Schema in libs/shared/src/lib/agent-schemas/, re-exported from the index.
- Action enum in tool-categories.ts.
- EN/HE/RU labels under agent.tools.* in the dictionaries.
- System-prompt cookbook updated if the tool has non-obvious usage.
- agent.tool.executed event fires with router + action + outcome.
- One real prompt exercises the tool end-to-end (Spotter has its own playground harness).
Open questions / future
- Member-facing agent. Today only coaches/owners can use Spotter. A member-facing chat (book classes, log workouts, check the schedule) is v3.
- Cross-staff visibility. Owners can’t see what coaches asked the agent. Privacy vs ownership question to resolve.
- WhatsApp / mobile invocation. Once the WhatsApp ISV integration (FIT-140) ships, the agent becomes invocable from chat. Different latency budget, same tools.
- Payments writes. Refunds and chargebacks are explicitly out of Phase 1/2 scope. Treat as a separate ticket with elevated review.
Related
- features/spotter-agent/README.md
- ADR-0005: API-first
- ADR-0008: Platform tiers
- Linear: FIT-161 (Phase 1 shipped), FIT-162 (Phase 2 backlog)