feat(llm): cost telemetry + per-user daily token budget [STORY-012] #17

Merged
khoks merged 1 commit into main from story/012-cost-telemetry on Apr 26, 2026

@khoks commented on Apr 26, 2026

Summary

  • Versioned cost calculator: MODEL_PRICING table for Opus/Sonnet/Haiku, stamped with PRICING_VERSION = "2026-04-26". costFor() returns cost_usd=0 + known_model=false for unknown models so analytics can flag operator-stale tables without breaking the runtime path.
  • Per-user daily token budget: UsageStore interface + InMemoryUsageStore keyed by (user_id, UTC date); DailyTokenBudget with assertWithinBudget() (throws TokenBudgetExceededError at limit) + decideModel() (downgrade ladder Opus → Sonnet → Haiku, kicks in at threshold = 0.8).
  • BudgetGatedLLMProvider decorator: wraps any LLMProvider with a pre-call assert + decideModel (may downgrade) and a post-call record. Embed passes through. limit=0 means unlimited (self-hosted default); explicit req.model always wins.
  • Telemetry schema extension: LLMTelemetryEventSchema gains cost_usd, pricing_version, optional session_id / cached_tokens / tool_used. AnthropicProvider now stamps cost via costFor(); toolCall populates tool_used from the first invocation. CompleteRequestSchema gains optional session_id.
  • DB persistence split: agent_calls Drizzle migration + DB-backed sink + API 429 mapping moved to STORY-060 to keep STORY-012 at its S estimate. Interfaces (UsageStore, LLMTelemetrySink) are stable; STORY-060 just adds Drizzle impls behind them.
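
For readers skimming the first bullet, the unknown-model fallback could look roughly like the sketch below. This is not the code in pricing.ts: the model keys and USD prices are placeholders invented for illustration, and the per-million-token price shape and the costFor() signature are assumptions.

```typescript
// Sketch only. Model keys and USD prices are PLACEHOLDERS, not real rates
// and not the values in pricing.ts; field names are assumptions as well.
interface ModelPrice {
  inputPerMTok: number; // USD per 1,000,000 input tokens (assumed unit)
  outputPerMTok: number; // USD per 1,000,000 output tokens (assumed unit)
}

const PRICING_VERSION = "2026-04-26"; // version string quoted in the PR

const MODEL_PRICING: { [model: string]: ModelPrice } = {
  "placeholder-opus": { inputPerMTok: 10, outputPerMTok: 50 },
  "placeholder-sonnet": { inputPerMTok: 2, outputPerMTok: 10 },
  "placeholder-haiku": { inputPerMTok: 1, outputPerMTok: 5 },
};

interface CostResult {
  cost_usd: number;
  known_model: boolean;
  pricing_version: string;
}

function costFor(model: string, inputTokens: number, outputTokens: number): CostResult {
  // Own-property check so inherited keys like "toString" are not treated as models.
  const price = Object.prototype.hasOwnProperty.call(MODEL_PRICING, model)
    ? MODEL_PRICING[model]
    : undefined;
  if (price === undefined) {
    // Unknown model: record zero cost and flag it instead of throwing.
    return { cost_usd: 0, known_model: false, pricing_version: PRICING_VERSION };
  }
  const cost_usd =
    (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1000000;
  return { cost_usd: cost_usd, known_model: true, pricing_version: PRICING_VERSION };
}
```

Returning a flag rather than throwing keeps the request path alive when the table is stale, and analytics can later filter on known_model=false to find the gaps.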

Acceptance criteria

  • All telemetry fields present on LLMTelemetryEvent (provider, model, role, user_id, session_id, task, input/output_tokens, cached_tokens, cost_usd, pricing_version, tool_used, latency_ms, ok, decided_at, prompt_version) — agent_calls table is in STORY-060.
  • Daily budget enforced server-side (BudgetGatedLLMProvider) — applied at the provider layer so any caller goes through it.
  • Graceful model downgrade kicks in at threshold (default 80%) via the MODEL_TIERS ladder.
  • At 100%, TokenBudgetExceededError carries a human-friendly message; HTTP 429 mapping in STORY-060.
  • Cost calculation uses a versioned price table — PRICING_VERSION = "2026-04-26".
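
The threshold and hard-limit criteria above can be illustrated with a small sketch. It is not the DailyTokenBudget implementation: the tier keys, the config shape, the error message, and in particular the choice to step down exactly one rung once the threshold is crossed are assumptions made for the example.

```typescript
// Sketch only: tier keys, config shape, and the one-rung downgrade are assumptions.
const MODEL_TIERS = ["opus", "sonnet", "haiku"] as const; // premium -> mid -> cheap
type Tier = (typeof MODEL_TIERS)[number];

interface BudgetConfig {
  limit: number; // daily token limit; 0 means unlimited
  threshold: number; // fraction of the limit at which downgrades start (0.8 by default)
}

class TokenBudgetExceededError extends Error {
  readonly used: number;
  readonly limit: number;
  constructor(used: number, limit: number) {
    // Hypothetical wording; the PR only says the message is human-friendly.
    super("Daily token budget reached (" + used + " of " + limit + " tokens).");
    this.name = "TokenBudgetExceededError";
    this.used = used;
    this.limit = limit;
  }
}

// Throws once usage has reached the limit; a limit of 0 never throws.
function assertWithinBudget(used: number, cfg: BudgetConfig): void {
  if (cfg.limit > 0 && used >= cfg.limit) {
    throw new TokenBudgetExceededError(used, cfg.limit);
  }
}

// Returns the requested tier, or the next cheaper one past the threshold.
function decideModel(requested: Tier, used: number, cfg: BudgetConfig, explicit: boolean): Tier {
  if (explicit || cfg.limit === 0) return requested; // explicit model wins; 0 = unlimited
  if (used / cfg.limit < cfg.threshold) return requested;
  const next = Math.min(MODEL_TIERS.indexOf(requested) + 1, MODEL_TIERS.length - 1);
  return MODEL_TIERS[next] ?? requested;
}
```

With limit = 100 and threshold = 0.8, a user at 79 tokens keeps the requested tier, a user at 80 is moved one tier down, and a user at 100 gets the error.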

Test plan

  • pnpm --filter @learnpro/llm test — 72 passed / 1 skipped (integration test, needs ANTHROPIC_API_KEY).
  • 38 new tests across pricing.test.ts (6), budget.test.ts (20), budget-gated-provider.test.ts (12).
  • pnpm typecheck — green across all 12 packages.
  • pnpm lint — green across all 8 lintable packages.
  • pnpm test — green across the monorepo.
  • Manual smoke (deferred to STORY-060): with LEARNPRO_DAILY_TOKEN_LIMIT=100 + a real Anthropic key, hit the playground twice and observe the friendly 429 on the second call.

🤖 Generated with Claude Code

Adds three new building blocks under @learnpro/llm:

- pricing.ts — versioned MODEL_PRICING table (Opus/Sonnet/Haiku) +
  costFor() calculator. Append-only convention: bump PRICING_VERSION +
  add a new constant when prices change. Unknown models record cost=0
  + known_model=false so analytics can flag operator-stale tables
  without breaking the runtime path.

- budget.ts — UsageStore interface + InMemoryUsageStore (keyed by
  user_id + UTC date). DailyTokenBudget with assertWithinBudget() +
  decideModel() (downgrade ladder: premium=Opus → mid=Sonnet →
  cheap=Haiku, kicks in at the threshold, default 80%) + record().
  limit=0 means unlimited (self-hosted default); explicit req.model
  always wins.

- budget-gated-provider.ts — decorator that wraps any LLMProvider:
  pre-call assertWithinBudget (throws TokenBudgetExceededError),
  pre-call decideModel (may downgrade), post-call record. Embed
  passes through (no per-user attribution for embeddings yet).
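
The pre-call / post-call ordering described for the decorator can be sketched as follows. All types are simplified stand-ins rather than the real @learnpro/llm interfaces: the Budget interface, the total_tokens field, the synchronous budget methods, and the ToyBudget and echoProvider helpers are hypothetical and exist only to make the example self-contained.

```typescript
// Sketch only: simplified stand-ins, not the real @learnpro/llm types.
interface CompleteRequest { user_id: string; prompt: string; model?: string }
interface CompleteResponse { text: string; model: string; total_tokens: number }
interface LLMProvider { complete(req: CompleteRequest): Promise<CompleteResponse> }

interface Budget {
  assertWithinBudget(userId: string): void; // throws when the limit is reached
  decideModel(userId: string, explicit?: string): string; // may pick a cheaper model
  record(userId: string, tokens: number): void;
}

class BudgetGatedProvider implements LLMProvider {
  private readonly inner: LLMProvider;
  private readonly budget: Budget;
  constructor(inner: LLMProvider, budget: Budget) {
    this.inner = inner;
    this.budget = budget;
  }

  async complete(req: CompleteRequest): Promise<CompleteResponse> {
    this.budget.assertWithinBudget(req.user_id); // pre-call: hard stop at the limit
    const model = this.budget.decideModel(req.user_id, req.model); // pre-call: maybe downgrade
    const res = await this.inner.complete({ ...req, model: model });
    this.budget.record(req.user_id, res.total_tokens); // post-call: account for usage
    return res;
  }
}

// Hypothetical in-memory budget with two tiers and a fixed 80% threshold.
class ToyBudget implements Budget {
  private readonly used = new Map<string, number>();
  private readonly limit: number;
  constructor(limit: number) { this.limit = limit; }
  usedBy(userId: string): number { return this.used.get(userId) ?? 0; }
  assertWithinBudget(userId: string): void {
    if (this.limit > 0 && this.usedBy(userId) >= this.limit) {
      throw new Error("TokenBudgetExceeded");
    }
  }
  decideModel(userId: string, explicit?: string): string {
    if (explicit !== undefined) return explicit; // explicit req.model always wins
    const nearLimit = this.limit > 0 && this.usedBy(userId) >= 0.8 * this.limit;
    return nearLimit ? "cheap" : "premium";
  }
  record(userId: string, tokens: number): void {
    this.used.set(userId, this.usedBy(userId) + tokens);
  }
}

// Hypothetical provider that echoes the chosen model and always uses 45 tokens.
const echoProvider: LLMProvider = {
  async complete(req: CompleteRequest): Promise<CompleteResponse> {
    return { text: "ok", model: req.model ?? "premium", total_tokens: 45 };
  },
};
```

Because usage is recorded only after the inner call returns, the call that crosses the limit still succeeds and the next one is rejected, which matches the "hit the playground twice" smoke test in the test plan.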

LLMTelemetryEventSchema gains cost_usd, pricing_version, optional
session_id / cached_tokens / tool_used. AnthropicProvider stamps
cost via costFor() in recordTelemetry; toolCall populates tool_used
from the first invocation. CompleteRequestSchema gains optional
session_id.
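
As a rough picture of the extended event, the sketch below uses a plain TypeScript interface in place of the actual schema. The field names come from the acceptance criteria; the field types, the ToolInvocation shape, and the stampToolUsed() helper are assumptions, not the AnthropicProvider code.

```typescript
// Sketch only: field names follow the PR; the types are assumptions.
interface LLMTelemetryEvent {
  provider: string;
  model: string;
  role: string;
  user_id: string;
  task: string;
  input_tokens: number;
  output_tokens: number;
  cost_usd: number;
  pricing_version: string;
  latency_ms: number;
  ok: boolean;
  decided_at: string; // assumed to be an ISO-8601 timestamp
  prompt_version: string;
  session_id?: string; // optional per the PR
  cached_tokens?: number; // optional per the PR
  tool_used?: string; // optional per the PR
}

// Hypothetical shape of one tool invocation returned by a tool call.
interface ToolInvocation { name: string }

// Name of the first invocation, or undefined when no tool was invoked.
function firstToolUsed(invocations: ToolInvocation[]): string | undefined {
  const first = invocations[0];
  return first ? first.name : undefined;
}

// Copies the event and sets tool_used only when a tool was actually invoked,
// so the optional field stays absent otherwise.
function stampToolUsed(
  base: Omit<LLMTelemetryEvent, "tool_used">,
  invocations: ToolInvocation[],
): LLMTelemetryEvent {
  const tool = firstToolUsed(invocations);
  return tool === undefined ? { ...base } : { ...base, tool_used: tool };
}
```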

DB-backed UsageStore + agent_calls Drizzle migration + API 429
mapping are split into STORY-060 to keep STORY-012 within its S
estimate (interface + decorator, no migration). Interfaces are
stable; STORY-060 just adds Drizzle impls behind them.
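
To show why a database-backed store can slot in later without touching callers, here is a minimal sketch of a store keyed by user and UTC date. The method names (getUsed, add), the explicit Date parameter, and the key format are assumptions; the real UsageStore interface may differ.

```typescript
// Sketch only: method names and signatures are assumptions, not the real interface.
interface UsageStore {
  getUsed(userId: string, at: Date): Promise<number>;
  add(userId: string, tokens: number, at: Date): Promise<void>;
}

// "YYYY-MM-DD" in UTC; toISOString() always reports UTC.
function utcDateKey(at: Date): string {
  return at.toISOString().slice(0, 10);
}

class InMemoryUsageStore implements UsageStore {
  private readonly totals = new Map<string, number>();

  // The fixed-width date comes first, so the key stays unambiguous for any user id.
  private bucket(userId: string, at: Date): string {
    return utcDateKey(at) + "|" + userId;
  }

  async getUsed(userId: string, at: Date): Promise<number> {
    return this.totals.get(this.bucket(userId, at)) ?? 0;
  }

  async add(userId: string, tokens: number, at: Date): Promise<void> {
    const key = this.bucket(userId, at);
    this.totals.set(key, (this.totals.get(key) ?? 0) + tokens);
  }
}
```

Keeping the methods asynchronous even for the in-memory version means a later Drizzle-backed implementation can satisfy the same interface, and usage recorded just before and just after 00:00 UTC lands in separate daily buckets.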

38 new tests across 3 files; 72 tests passing in @learnpro/llm.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@khoks merged commit 3dfafc6 into main on Apr 26, 2026
1 check passed
@khoks deleted the story/012-cost-telemetry branch on April 26, 2026 22:19