Spotter Agent
FitKit’s embedded AI assistant — a conversational, tool-calling Claude agent integrated into the dashboard. Just like a spotter in the gym, it doesn’t lift the weight for the user; it watches, supports, and steps in when needed.
What
A streaming, tool-using agent that reads and writes FitKit data via 63 typed tools across 11 routers. It runs inside the API (apps/api/src/ai/agent/), surfaces in the dashboard via a slide-in sheet (apps/web/src/components/agent/), and powers prompts like:
- “build a workout: 3 rounds — 200m run, 30 double-unders”
- “who hasn’t paid this month?”
- “schedule open gym tomorrow 13:00–14:00”
- “what’s our MRR?”
- “delete the workout I just made” (with one-click undo)
Why
- The dashboard has 50+ pages. Most owner asks span 3-4 of them. Spotter collapses multi-page workflows into one prompt.
- Coaches’ real-world ask is rarely “open the schedule, scroll, click new, fill in 11 fields” — it’s “put open gym on tomorrow at 1pm”. Spotter is that interface.
- Programming workouts is the single most time-consuming task in a gym day. Structured-workout creation via Spotter takes under 10 seconds versus 2-3 minutes of hand-clicking in the builder.
- Owners and coaches share a mental model of the gym; one prompt-driven agent is far cheaper to learn than 50 page-specific affordances.
Who
- Owner / admin / coach — full access in v1 (FIT-161 phase 1). Members are 403’d at the controller.
- Spotter agent itself — has its own audit log entries with metadata.agent: true so the admin app can pivot agent actions next to human actions.
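The access rule and the audit tagging above can be sketched roughly as follows. This is a minimal illustration, not the actual controller or audit code: the names Role, AuditEntry, agentAccessStatus, auditAgentAction, and onlyAgentActions are all hypothetical, and only the 403 for members and the metadata.agent: true flag come from this page.

```typescript
// Hypothetical types and helpers; only the 403-for-members rule and the
// metadata.agent flag are taken from the feature description.
type Role = "owner" | "admin" | "coach" | "member";

interface AuditEntry {
  actorUserId: string;
  action: string;
  metadata: { agent: boolean };
}

// Staff roles get full access in v1; members are rejected.
const STAFF_ROLES: ReadonlySet<Role> = new Set<Role>(["owner", "admin", "coach"]);

// Returns the HTTP status a controller-level gate would answer with.
function agentAccessStatus(role: Role): 200 | 403 {
  return STAFF_ROLES.has(role) ? 200 : 403;
}

// Builds an audit entry tagged as agent-originated.
function auditAgentAction(actorUserId: string, action: string): AuditEntry {
  return { actorUserId, action, metadata: { agent: true } };
}

// Lets a consumer (for example the admin app) separate agent actions from human ones.
function onlyAgentActions(entries: AuditEntry[]): AuditEntry[] {
  return entries.filter((entry) => entry.metadata.agent);
}
```

Keeping the flag in metadata rather than in a separate table means agent and human actions can share one audit stream and be filtered at read time.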
Persona impact
| Persona | Impact |
|---|---|
| Owner | Run-the-business prompts: revenue, debt, at-risk members, refunds (via dashboard), staff onboarding. |
| Coach | Build-the-day prompts: workouts, assignments, class sessions, daily programming. Single biggest time-saver in the product. |
| Admin | Mirror of owner with no payment ceiling. |
| Member | Out of scope in v1; planned for v2. |
Capabilities
- 63 typed tools across 11 routers (read, workouts, programs, assignments, bookings, class_sessions, class_types, analytics, tasks, program_templates, forms). Each registered via the @AgentTool decorator and auto-discovered at module init.
- Anthropic prompt caching with three stable cache_control breakpoints (1h TTL) + a 4th breakpoint on the message tail so steady-state cache-hit ratio stabilizes around 0.85-0.95.
- Disambiguation flow (read.ask_user_to_pick) — when ambient context surfaces multiple candidates, the agent suspends the run and asks the user to click one in a card.
- Confirmation flow — destructive tools (confirm: 'destructive') and always-confirm tools (confirm: 'always') suspend the run and emit a confirmation_pending SSE event. The user clicks Approve/Reject; the run resumes.
- Undo — non-destructive write tools declare an inverse. A 15-second toast-driven undo invokes the inverse without re-prompting the model.
- Per-org daily $ cap — tier-driven AI budget. Lite $1, Pro $5, Elite $25 (USD micros). Pre-check gates every turn; over-cap → immediate agent_budget_exceeded SSE error with a localized upgrade prompt.
- Per-conversation cost tracking in ai_conversations.cost_usd_micros; per-org per-user per-day in ai_usage_daily.
- Conversation persistence — every turn is a row in ai_messages with the Anthropic content blocks stored verbatim for lossless replay.
- Replay-integrity guard — catches pending tool_use blocks that never got a tool_result and auto-repairs them so a stuck conversation never blocks the user.
- Compaction — opportunistic Haiku-based summarization once a conversation exceeds 24 messages past the latest anchor.
- Title generation — first message → Haiku-generated 4-6 word title (3s budget, non-blocking).
- Ambient RAG — hybrid lexical+semantic search across programs/workouts/exercises/members; results injected as <ambient_context> so the model can reference ids without an extra search round-trip.
- Page context — the web sends pathname + selection so prompts are page-aware (“delete this workout” works because the workout id is in context).
- Observability — Pino + Sentry + PostHog event fan-out via AgentObservabilityService. PII-scrubbed event props.
- Daily storage snapshot — agent_storage_snapshot cron tracks table sizes for cost forecasting.
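To make the budget pre-check, confirmation flow, and undo window from the list above more concrete, here is a minimal sketch of how the three gates might fit together. It is an illustration under stated assumptions, not the agent's real code: ToolSpec, TIER_CAP_MICROS, preCheckBudget, dispatchTool, canUndo, the "none" confirm value, the tool_executed event, and both sample tool names are hypothetical. The tier amounts, the 'destructive' and 'always' policies, the 15-second window, and the confirmation_pending and agent_budget_exceeded event names come from this page.

```typescript
// Illustrative only: names not mentioned in the feature list are assumptions.
type Tier = "lite" | "pro" | "elite";
type ConfirmPolicy = "none" | "destructive" | "always"; // "none" is assumed

// Daily caps from the list above in USD micros (1 USD = 1_000_000 micros).
const TIER_CAP_MICROS: Record<Tier, number> = {
  lite: 1_000_000,
  pro: 5_000_000,
  elite: 25_000_000,
};

interface ToolSpec {
  name: string;
  confirm: ConfirmPolicy;
  run: () => void;
  inverse?: () => void; // declared only by undoable (non-destructive) write tools
}

type AgentEvent =
  | { type: "agent_budget_exceeded" }
  | { type: "confirmation_pending"; tool: string }
  | { type: "tool_executed"; tool: string; undoable: boolean }; // hypothetical event

// Runs before each turn. Assumption: reaching the cap exactly counts as over-cap.
function preCheckBudget(tier: Tier, spentTodayMicros: number): AgentEvent | null {
  return spentTodayMicros >= TIER_CAP_MICROS[tier]
    ? { type: "agent_budget_exceeded" }
    : null;
}

// Suspends on confirm-gated tools until the user approves; otherwise executes.
function dispatchTool(tool: ToolSpec, approved: boolean): AgentEvent {
  if (tool.confirm !== "none" && !approved) {
    return { type: "confirmation_pending", tool: tool.name };
  }
  tool.run();
  return { type: "tool_executed", tool: tool.name, undoable: tool.inverse !== undefined };
}

// The undo toast is valid for 15 seconds after execution.
const UNDO_WINDOW_MS = 15_000;
function canUndo(executedAtMs: number, nowMs: number): boolean {
  return nowMs - executedAtMs <= UNDO_WINDOW_MS;
}

// In-memory stand-ins for real data, used only to exercise the sketch.
const sessions = new Map<string, string>();
const workouts = new Map<string, string>([["w1", "3 rounds"]]);

const scheduleOpenGym: ToolSpec = {
  name: "class_sessions.create", // hypothetical tool name
  confirm: "none",
  run: () => { sessions.set("s1", "open gym 13:00-14:00"); },
  inverse: () => { sessions.delete("s1"); },
};

const deleteWorkout: ToolSpec = {
  name: "workouts.delete", // hypothetical tool name
  confirm: "destructive",
  run: () => { workouts.delete("w1"); },
};
```

In this shape, undo calls the stored inverse directly, so no model round-trip is needed, which matches the "without re-prompting the model" behavior described above.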
Related features
- analytics/insights — read-only domain endpoints the agent’s analytics router wraps.
- webhooks — Clerk webhooks create users; agent uses those user rows for the runtime.user context.
- event-tracking — PostHog facade reused by agent observability.
- admin-app — surfaces agent traces, costs, and conversations in the platform observability dashboard.
- payments/platform-billing — tier source-of-truth that drives the daily $ cap.
- exercises (search service) — exercise resolve + ambient RAG.
- embeddings (apps/api/src/ai/embeddings/) — vector embeddings for workouts, programs, exercises, member profiles. Powers RAG.
Status
- FIT-161 Phase 1 — shipped. 63 tools, prompt caching, undo, confirmation, disambiguation, replay guard, cost cap, observability, conversation persistence, compaction, ambient RAG. Live in production.
- FIT-162 Phase 2 — backlog. Planned: member-role read tools (member self-service), WhatsApp bot wrapper, multilingual TTS reply, agent → email composer, agent → CSV export for ad-hoc queries.
Gaps
- Member-role tools are 403’d at the controller in v1 (“Agent chat is staff-only in this release.”). Member surface needs a curated subset (no writes, scoped reads).
- No long-running tools — every tool runs inline. A “rebuild embeddings for 10k workouts” prompt would block the turn beyond the SSE timeout. Long-running work should land as a queued task with progress polling.
- No model routing — DEFAULT_MODEL is claude-sonnet-4-5 for every main-loop turn. Title and compaction use claude-haiku-4-5, but the main loop never demotes. A planner+executor split (Haiku for read-only, Sonnet for writes) would cut cost ~3x.
- No structured “what happened” recap at end-of-turn beyond the done event. Multi-tool turns rely on the assistant’s text summary.
- Agent has no opinion about programming quality — it builds what you ask; doesn’t suggest improvements. Future: a coach-style critic loop.
- No background agent — the agent only acts when prompted. No “ping the owner at 9am with their day’s triage” yet.
- Conversation privacy is per-user — staff in the same org cannot see each other’s conversations. Acceptable for v1; cross-staff sharing is FIT-162.
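As one possible shape for the model-routing gap above, the sketch below picks a model from the kinds of tools a turn is expected to call. Nothing like this exists in the agent today. ToolKind and pickModel are hypothetical, and how a planner would predict the tool kinds is left out entirely. Only the two model identifiers and the DEFAULT_MODEL name come from this page.

```typescript
// Hypothetical routing sketch; not part of the shipped agent.
type ToolKind = "read" | "write";

const DEFAULT_MODEL = "claude-sonnet-4-5";
const CHEAP_MODEL = "claude-haiku-4-5";

// Demotes to the cheaper model only when every planned tool is read-only.
// Assumption: with no planned tools (or any write) we stay on the default,
// so an uncertain plan never lands a write on the smaller model.
function pickModel(plannedTools: ToolKind[]): string {
  const readOnly =
    plannedTools.length > 0 && plannedTools.every((kind) => kind === "read");
  return readOnly ? CHEAP_MODEL : DEFAULT_MODEL;
}
```

For example, a “what’s our MRR?” turn planned as a single read would route to the cheaper model, while “schedule open gym tomorrow” would stay on the default because it includes a write.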