Living documentation — last reviewed 2026-05-28

Spotter Agent

FitKit’s embedded AI assistant — a conversational, tool-calling Claude agent integrated into the dashboard. Just like a spotter in the gym, it doesn’t lift the weight for the user; it watches, supports, and steps in when needed.

What

A streaming, tool-using agent that reads and writes FitKit data via 63 typed tools across 11 routers. It runs inside the API (apps/api/src/ai/agent/), surfaces in the dashboard via a slide-in sheet (apps/web/src/components/agent/), and powers prompts like:

  • “build a workout: 3 rounds — 200m run, 30 double-unders”
  • “who hasn’t paid this month?”
  • “schedule open gym tomorrow 13:00–14:00”
  • “what’s our MRR?”
  • “delete the workout I just made” (with one-click undo)

Why

  • The dashboard has 50+ pages. Most owner asks span 3-4 of them. Spotter collapses multi-page workflows into one prompt.
  • Coaches’ real-world ask is rarely “open the schedule, scroll, click new, fill in 11 fields” — it’s “put open gym on tomorrow at 1pm”. Spotter is that interface.
  • Programming workouts is the single most time-consuming task in a gym day. Structured-workout creation via Spotter takes <10s versus 2-3 minutes hand-clicking the builder.
  • Owners and coaches share a mental model of the gym; one prompt-driven agent is far cheaper to learn than 50 page-specific affordances.

Who

  • Owner / admin / coach — full access in v1 (FIT-161 phase 1). Members are 403’d at the controller.
  • Spotter agent itself — has its own audit log entries with metadata.agent: true so the admin app can pivot agent actions next to human actions.
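
As a rough illustration of how agent actions could be pivoted next to human ones, the sketch below splits audit entries on the metadata.agent flag. The AuditEntry shape and the splitByActor helper are hypothetical; only metadata.agent: true comes from the description above.

```typescript
// Hypothetical audit entry shape; only `metadata.agent` is taken from the text above.
interface AuditEntry {
  id: string;
  action: string;
  actorUserId: string;
  metadata: { agent?: boolean };
}

// Split entries so agent-initiated and human-initiated actions can be shown side by side.
function splitByActor(entries: AuditEntry[]): { agent: AuditEntry[]; human: AuditEntry[] } {
  const agent: AuditEntry[] = [];
  const human: AuditEntry[] = [];
  for (const entry of entries) {
    // Entries without the flag are treated as human actions (an assumption).
    (entry.metadata.agent === true ? agent : human).push(entry);
  }
  return { agent, human };
}
```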

Persona impact

Persona | Impact
Owner | Run-the-business prompts: revenue, debt, at-risk members, refunds (via dashboard), staff onboarding.
Coach | Build-the-day prompts: workouts, assignments, class sessions, daily programming. Single biggest time-saver in the product.
Admin | Mirror of owner with no payment ceiling.
Member | Out of scope in v1; planned for v2.

Capabilities

  • 63 typed tools across 11 routers (read, workouts, programs, assignments, bookings, class_sessions, class_types, analytics, tasks, program_templates, forms). Each registered via the @AgentTool decorator and auto-discovered at module init.
  • Anthropic prompt caching with three stable cache_control breakpoints (1h TTL) + a 4th breakpoint on the message tail so steady-state cache-hit ratio stabilizes around 0.85-0.95.
  • Disambiguation flow (read.ask_user_to_pick) — when ambient context surfaces multiple candidates, the agent suspends the run and asks the user to click one in a card.
  • Confirmation flow — destructive tools (confirm: 'destructive') and always-confirm tools (confirm: 'always') suspend the run and emit a confirmation_pending SSE event. The user clicks Approve/Reject; the run resumes.
  • Undo — non-destructive write tools declare an inverse. A 15-second toast-driven undo invokes the inverse without re-prompting the model.
  • Per-org daily $ cap — tier-driven AI budget: Lite $1, Pro $5, Elite $25 per day, tracked in USD micros. A pre-check gates every turn; an over-cap org gets an immediate agent_budget_exceeded SSE error with a localized upgrade prompt.
  • Per-conversation cost tracking in ai_conversations.cost_usd_micros; per-org per-user per-day in ai_usage_daily.
  • Conversation persistence — every turn is a row in ai_messages with the Anthropic content blocks stored verbatim for lossless replay.
  • Replay-integrity guard — catches pending tool_use blocks that never got a tool_result and auto-repairs them so a stuck conversation never blocks the user.
  • Compaction — opportunistic Haiku-based summarization once a conversation exceeds 24 messages past the latest anchor.
  • Title generation — first message → Haiku-generated 4-6 word title (3s budget, non-blocking).
  • Ambient RAG — hybrid lexical+semantic search across programs/workouts/exercises/members; results injected as <ambient_context> so the model can reference ids without an extra search round-trip.
  • Page context — the web sends pathname + selection so prompts are page-aware (“delete this workout” works because the workout id is in context).
  • Observability — Pino + Sentry + PostHog event fan-out via AgentObservabilityService. PII-scrubbed event props.
  • Daily storage snapshot — the agent_storage_snapshot cron tracks table sizes for cost forecasting.

Related modules

  • analytics / insights — read-only domain endpoints the agent’s analytics router wraps.
  • webhooks — Clerk webhooks create users; agent uses those user rows for the runtime.user context.
  • event-tracking — PostHog facade reused by agent observability.
  • admin-app — surfaces agent traces, costs, and conversations in the platform observability dashboard.
  • payments / platform-billing — tier source-of-truth that drives the daily $ cap.
  • exercises (search service) — exercise resolve + ambient RAG.
  • embeddings (apps/api/src/ai/embeddings/) — vector embeddings for workouts, programs, exercises, member profiles. Powers RAG.
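
The per-org daily cap from the capabilities list can be pictured as a small pre-check that runs before each turn. This is a minimal sketch under assumptions: the Tier type, the checkDailyBudget helper, and the choice to block once spend reaches the cap (rather than only after exceeding it) are illustrative; the dollar amounts and the agent_budget_exceeded code come from the text above.

```typescript
type Tier = "lite" | "pro" | "elite";

// Daily caps from the tier list, expressed in USD micros (1 USD = 1_000_000 micros).
const DAILY_CAP_USD_MICROS: Record<Tier, number> = {
  lite: 1_000_000,
  pro: 5_000_000,
  elite: 25_000_000,
};

type BudgetCheck =
  | { allowed: true; remainingUsdMicros: number }
  | { allowed: false; code: "agent_budget_exceeded" };

// Hypothetical pre-check: block the turn once today's spend has reached the cap.
function checkDailyBudget(tier: Tier, spentTodayUsdMicros: number): BudgetCheck {
  const cap = DAILY_CAP_USD_MICROS[tier];
  if (spentTodayUsdMicros >= cap) {
    return { allowed: false, code: "agent_budget_exceeded" };
  }
  return { allowed: true, remainingUsdMicros: cap - spentTodayUsdMicros };
}
```

Keeping amounts as integer micros avoids floating-point drift when many small per-turn costs are summed.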
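
The replay-integrity guard can be sketched as a pass over the stored content blocks that finds tool_use blocks with no matching tool_result. How FitKit actually repairs them is not stated; the version below assumes the repair is a synthetic error tool_result, and the names StoredMessage, stubResult, and repairPendingToolUses are hypothetical. The block shapes are a minimal subset of the Anthropic Messages API.

```typescript
// Minimal subset of Anthropic Messages API content blocks needed for this sketch.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };

interface StoredMessage {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Build an error stub for a tool call that never returned (wording is an assumption).
function stubResult(toolUseId: string): ContentBlock {
  return {
    type: "tool_result",
    tool_use_id: toolUseId,
    content: "Tool call was interrupted before it returned a result.",
    is_error: true,
  };
}

// Return a repaired copy in which every assistant tool_use is answered by a
// tool_result in the next user message. Input messages are not mutated.
function repairPendingToolUses(messages: StoredMessage[]): StoredMessage[] {
  const repaired: StoredMessage[] = [];
  let pending: string[] = []; // tool_use ids from the previous assistant message

  for (const msg of messages) {
    if (pending.length > 0) {
      if (msg.role === "user") {
        const answered = new Set<string>();
        for (const block of msg.content) {
          if (block.type === "tool_result") answered.add(block.tool_use_id);
        }
        const stubs = pending.filter((id) => !answered.has(id)).map((id) => stubResult(id));
        // Stubs go first so tool_result blocks precede any other content.
        repaired.push(stubs.length > 0 ? { role: "user", content: [...stubs, ...msg.content] } : msg);
        pending = [];
        continue;
      }
      // No user reply followed the tool_use: synthesize one holding only stubs.
      repaired.push({ role: "user", content: pending.map((id) => stubResult(id)) });
      pending = [];
    }
    repaired.push(msg);
    if (msg.role === "assistant") {
      for (const block of msg.content) {
        if (block.type === "tool_use") pending.push(block.id);
      }
    }
  }
  if (pending.length > 0) {
    repaired.push({ role: "user", content: pending.map((id) => stubResult(id)) });
  }
  return repaired;
}
```

Running the repair on load, before the history is sent back to the model, means a conversation left in a half-finished state can continue on the next prompt.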
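
One way the four prompt-caching breakpoints could be laid out in a Messages API request body is sketched below. Which three prefixes count as stable (tool definitions, system prompt, org context) is an assumption, as are the helper name buildCachedRequest, the local types, and the max_tokens value. The cache_control field with a 1h ttl follows Anthropic's prompt caching documentation, and the model name is the DEFAULT_MODEL mentioned under Gaps.

```typescript
type CacheControl = { type: "ephemeral"; ttl?: "5m" | "1h" };

interface TextBlock { type: "text"; text: string; cache_control?: CacheControl }
interface ToolDef { name: string; description: string; input_schema: object; cache_control?: CacheControl }
interface Message { role: "user" | "assistant"; content: TextBlock[] }

const STABLE: CacheControl = { type: "ephemeral", ttl: "1h" }; // long-lived prefixes
const TAIL: CacheControl = { type: "ephemeral" };              // default short TTL

// Hypothetical request builder: three stable breakpoints plus one on the message tail.
function buildCachedRequest(tools: ToolDef[], systemPrompt: string, orgContext: string, history: Message[]) {
  // Breakpoint 1: the last tool definition caches the whole tool list.
  const cachedTools = tools.map((tool, i) =>
    i === tools.length - 1 ? { ...tool, cache_control: STABLE } : tool,
  );
  // Breakpoints 2 and 3: system prompt and org-level context.
  const system: TextBlock[] = [
    { type: "text", text: systemPrompt, cache_control: STABLE },
    { type: "text", text: orgContext, cache_control: STABLE },
  ];
  // Breakpoint 4: the last block of the last message, so each turn reuses the prior prefix.
  const messages = history.map((message, i) => {
    if (i !== history.length - 1) return message;
    const content = message.content.map((block, j) =>
      j === message.content.length - 1 ? { ...block, cache_control: TAIL } : block,
    );
    return { ...message, content };
  });
  return { model: "claude-sonnet-4-5", max_tokens: 1024, tools: cachedTools, system, messages };
}
```

Placing the longer-TTL breakpoints on the earliest parts of the prompt keeps the stable prefix cached even when the conversation tail changes every turn.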

Status

  • FIT-161 Phase 1 — shipped. 63 tools, prompt caching, undo, confirmation, disambiguation, replay guard, cost cap, observability, conversation persistence, compaction, ambient RAG. Live in production.
  • FIT-162 Phase 2 — backlog. Planned: member-role read tools (member self-service), WhatsApp bot wrapper, multilingual TTS reply, agent → email composer, agent → CSV export for ad-hoc queries.

Gaps

  • Member-role requests are 403’d at the controller in v1; agent chat is staff-only in this release. A member surface needs a curated tool subset (no writes, scoped reads).
  • No long-running tools — every tool runs inline. A “rebuild embeddings for 10k workouts” prompt would block the turn beyond the SSE timeout. Long-running work should land as a queued task with progress polling.
  • No model routing — DEFAULT_MODEL is claude-sonnet-4-5 for everything. Title and compaction use claude-haiku-4-5, but the main loop never demotes. A planner+executor split (Haiku for read-only, Sonnet for writes) would cut cost ~3x.
  • No structured “what happened” recap at end-of-turn beyond the done event. Multi-tool turns rely on the assistant’s text summary.
  • Agent has no opinion about programming quality — it builds what you ask; doesn’t suggest improvements. Future: a coach-style critic loop.
  • No background agent — the agent only acts when prompted. No “ping the owner at 9am with their day’s triage” yet.
  • Conversation privacy is per-user — staff in the same org cannot see each other’s conversations. Acceptable for v1; cross-staff sharing is FIT-162.
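
The model-routing gap above could be closed with a routing step along these lines. This is only a sketch of the idea, not an existing FitKit function: the PlannedToolCall shape, the readOnly flag, and pickModel are hypothetical, and it assumes the planned tool calls are known before the executor model is chosen.

```typescript
const HAIKU_MODEL = "claude-haiku-4-5";
const SONNET_MODEL = "claude-sonnet-4-5";

// Hypothetical description of a tool call the planner intends to make.
interface PlannedToolCall {
  name: string;
  readOnly: boolean;
}

// Route to the cheaper model only when every planned call is read-only.
// An empty plan (plain conversation, no tools) also goes to the cheaper model.
function pickModel(planned: PlannedToolCall[]): string {
  return planned.every((call) => call.readOnly) ? HAIKU_MODEL : SONNET_MODEL;
}
```

Any turn that includes at least one write stays on the stronger model, so the routing never weakens the path that can change data.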