Consume AI model catalog from middlecache by mvvmm · Pull Request #30505 · cloudflare/cloudflare-docs

mvvmm · 2026-05-01T01:51:39Z

Fetches AI model data from the workers_ai_model_catalog middlecache pipeline instead of committed JSON files.

What changed

Data sources

Replaces catalog-models + workers-ai-models content collections (committed JSON) with ai-catalog + workers-ai-catalog backed by middlecacheLoader
Deletes ~154 committed model files; all model data now fetched from R2 at build time via all-models-detail.json
Build fails loudly if middlecache is unavailable rather than silently dropping pages

Model detail pages

Parameters section rendered at build time from parameters.json (pre-processed SchemaRowData per model, extracted from models.tar.gz by bin/fetch-models.ts)
Raw schema files (sync-input.json etc.) served from R2 via worker proxy at request time — no longer static build outputs
Dev middleware proxies /ai/models/** schema requests to middlecache

Catalog index pages

ModelCatalog converted from a React island to Astro — cards are static HTML baked at build time, filter/sort/URL state managed by a vanilla JS script
Filter dropdowns (searchable multi-select combobox, sort) remain as minimal React islands via @base-ui/react
ModelBadges converted from React to Astro

Infra

workflow_dispatch added to publish-production.yml so middlecache graduations can trigger a docs rebuild
bin/fetch-models.ts pre-extracts models.tar.gz as a predev/prebuild hook; --force re-fetches and clears stale cached catalog files

Replace locally-committed model JSON files with live data fetched from the workers_ai_model_catalog middlecache pipeline.

Changes:
- content.config.ts: replace workers-ai-models + catalog-models collections (local dataLoader) with ai-catalog + workers-ai-catalog collections backed by middlecacheLoader
- model-resolver.ts: rewrite getResolvedModels/getLegacyModels to read from the new collections; add fetchModelDetail() to fetch per-model detail.json from middlecache at build time
- [...name].astro (both): call fetchModelDetail() concurrently in getStaticPaths to get full model data for detail pages
- [...schema].json.ts (both): deleted; schema files are now served directly from R2 via the worker proxy at request time
- worker/index.ts: add runtime proxy route for schema JSON files matching /ai/models/**/(sync|streaming|batch|schema)-(input|output).json and the workers-ai equivalent; R2 key mirrors URL path exactly
- src/schemas/ai-model-catalog.ts: new Zod schemas for the middlecache catalog file format (AiModelCard, AiModelDetail)
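For readers who want to picture the proxy route, here is a minimal sketch of the path matching described in the worker/index.ts bullet. The regex is transcribed from the commit message; the function name, the return shape, and the choice to drop the leading slash when forming the R2 key are assumptions, not the PR's code. The workers-ai equivalent route is left out.

```typescript
// Pattern transcribed from the commit message:
//   /ai/models/**/(sync|streaming|batch|schema)-(input|output).json
const SCHEMA_ROUTE =
  /^\/ai\/models\/(.+)\/((?:sync|streaming|batch|schema)-(?:input|output)\.json)$/;

interface SchemaRouteMatch {
  slug: string; // model path segment(s) between /ai/models/ and the file name
  file: string; // e.g. "sync-input.json"
  r2Key: string; // assumption: the URL path with its leading slash removed
}

// Hypothetical helper: returns null for any path that is not a schema file.
function matchSchemaRoute(pathname: string): SchemaRouteMatch | null {
  const m = SCHEMA_ROUTE.exec(pathname);
  if (!m) return null;
  return { slug: m[1], file: m[2], r2Key: pathname.slice(1) };
}
```

A worker handler could call this first and fall through to static assets when it returns null.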

card.slug and detail.slug in the middlecache output have @ stripped (they are R2 path keys). The URL-facing slug must preserve @ to match the existing /ai/models/@cf/... URL structure that ModelDetailPage also uses for schemaBasePath construction.
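The two slug forms can be sketched as follows. This is an illustration only: `modelPaths` and its field names are hypothetical, and it assumes that stripping every `@` is what the pipeline does to produce its R2 path keys.

```typescript
interface ModelPaths {
  pagePath: string; // URL-facing path, keeps "@"
  r2Slug: string; // R2 path key form, "@" removed
}

// Hypothetical helper: derive both forms from the URL-facing slug.
function modelPaths(urlSlug: string): ModelPaths {
  return {
    pagePath: `/ai/models/${urlSlug}/`,
    r2Slug: urlSlug.replace(/@/g, ""),
  };
}
```

Going the other way (restoring `@` from a stripped slug) is not possible from the slug alone, which is why the URL-facing value has to be carried separately.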

- Add schemaFiles to ResolvedModel type
- Populate schemaFiles in detailToResolved from schema_manifest.files (extracts filenames from full R2 paths, no schema content needed)
- Split hasSchema (Parameters gate) from hasSchemaFiles (API Schemas gate)
- Render API Schemas (Raw) section from schemaFiles list, deriving human-readable labels from filenames (sync-input → Synchronous · Input)
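A rough sketch of the label derivation in the last bullet. Only the `sync-input → Synchronous · Input` mapping is stated in the commit; the other mode labels, the `Schema` label in particular, and the fallback behaviour are assumptions made for illustration.

```typescript
// Mode labels: "Synchronous" comes from the commit message; the rest are assumed.
const MODE_LABELS: Record<string, string> = {
  sync: "Synchronous",
  streaming: "Streaming",
  batch: "Batch",
  schema: "Schema",
};

const DIRECTION_LABELS: Record<string, string> = {
  input: "Input",
  output: "Output",
};

// Turn "sync-input.json" into "Synchronous · Input".
// Unrecognised file names are returned unchanged.
function schemaFileLabel(filename: string): string {
  const m = /^([a-z]+)-(input|output)\.json$/.exec(filename);
  if (!m) return filename;
  const mode = MODE_LABELS[m[1]];
  const direction = DIRECTION_LABELS[m[2]];
  return mode && direction ? `${mode} · ${direction}` : filename;
}
```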

Astro parses <string, string> as JSX inside template expressions. Move label derivation logic to frontmatter as a typed helper function.

In production, schema files are served by the Worker via R2. In local dev the Worker doesn't run, so proxy matching requests to middlecache.

Instead of fetching schema files at build time (which would add ~300 extra HTTP requests), SchemaDisplayLazy fetches the schema client-side when the user scrolls to the Parameters section (IntersectionObserver with 200px margin).
- SchemaDisplayLazy.tsx: React component that fetches a schema URL, runs the same @stoplight/json-schema-tree processing as SchemaDisplay.astro, and renders via SchemaTree/SchemaVariantSelector
- ModelDetailPage.astro: use SchemaDisplayLazy (client:visible) when schema is not in memory but schemaFiles URLs are available from schema_manifest; keep existing SchemaDisplay path for build-time schema

github-actions · 2026-05-01T01:52:23Z

This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:

Pattern	Owners
`/.github/`	`@cloudflare/content-engineering`, `@kodster28`, `@mvvmm`, `@colbywhite`, `@ahaywood`, `@MohamedH1998`
`*`	`@cloudflare/pcx-technical-writing`, `@cloudflare/product-owners`
`*.ts`	`@cloudflare/content-engineering`, `@kodster28`
`*.js`	`@cloudflare/content-engineering`, `@kodster28`
`/bin/fetch-catalog-models.ts`	`@abhishekkankani`, `@palashgo`, `@thebongy`, `@roerohan`, `@kathayl`, `@mchenco`, `@zeke`, `@superhighfives`, `@bfirsh`, `@mattrothenberg`, `@ethulia`, `@cloudflare/content-engineering`, `@cloudflare/pcx-technical-writing`, `@cloudflare/product-owners`, `@kodster28`
`package.json`	`@cloudflare/content-engineering`
`*.astro`	`@cloudflare/content-engineering`, `@kodster28`
`/src/components/`	`@cloudflare/content-engineering`, `@kodster28`
`/src/content/catalog-models/`	`@abhishekkankani`, `@palashgo`, `@thebongy`, `@roerohan`, `@kathayl`, `@mchenco`, `@zeke`, `@superhighfives`, `@bfirsh`, `@mattrothenberg`, `@ethulia`, `@cloudflare/content-engineering`, `@cloudflare/pcx-technical-writing`, `@cloudflare/product-owners`
`/src/content/workers-ai-models/`	`@craigsdennis`, `@cloudflare/content-engineering`, `@cloudflare/pcx-technical-writing`, `@cloudflare/product-owners`, `@kodster28`

No longer needed — model data is fetched from middlecache at build time.

github-actions · 2026-05-01T02:23:05Z

Preview URL: https://5365f2f2.preview.developers.cloudflare.com
Preview Branch URL: https://middlecache-ai-models.preview.developers.cloudflare.com

- fetchAllModelDetails() downloads all-models-detail.json once via downloadToDotTempIfNotPresent (cached for the build session)
- Returns a slug→ResolvedModel map; detail pages look up their model with a single O(1) map access
- Throws on failure so the build fails loudly instead of silently generating index links that 404
- Both [...name].astro pages now fetch the map once with Promise.all alongside the card collection read
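The map lookup can be pictured roughly like this. The `ResolvedModel` shape is trimmed to two fields, and `indexBySlug` and `requireModel` are hypothetical names. Throwing on a missing slug is an assumption about how "fail loudly" could extend to lookups; the commit itself only says the download throws on failure.

```typescript
// Trimmed-down stand-in for the real ResolvedModel type.
interface ResolvedModel {
  slug: string;
  name: string;
}

// Build the slug -> model map once per build.
function indexBySlug(models: ResolvedModel[]): Map<string, ResolvedModel> {
  const bySlug = new Map<string, ResolvedModel>();
  for (const model of models) bySlug.set(model.slug, model);
  return bySlug;
}

// O(1) lookup that throws instead of returning undefined, so a missing
// model stops the build rather than producing a page that 404s later.
function requireModel(bySlug: Map<string, ResolvedModel>, slug: string): ResolvedModel {
  const model = bySlug.get(slug);
  if (!model) throw new Error(`No detail data for model "${slug}"`);
  return model;
}
```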

@stoplight/json-schema-merge-allof uses require() at module level; it is CJS-only and breaks when @stoplight/json-schema-tree is imported client-side. Since none of the AI model schemas use allOf, a no-op stub (return schema unchanged) is safe. The real package still runs in Node for build-time SSG usage.

SchemaDisplayLazy now fetches .rows.json (pre-processed SchemaRowData[]) instead of raw schema + running @stoplight/json-schema-tree client-side. The CJS library runs in Node at pipeline time; the browser just receives plain JSON and renders it directly via SchemaTree.
- SchemaDisplayLazy: simplified to fetch RowsPayload and render; no more json-schema-tree dependency in browser bundle
- ModelDetailPage: use .rows.json URLs for lazy Parameters rendering
- worker/index.ts + middleware: extend regex to also proxy .rows.json
- astro.config.ts: remove @stoplight/json-schema-merge-allof alias
- src/util/merge-allof-stub.ts: deleted (no longer needed)

mvvmm · 2026-05-01T13:39:02Z

/bonk please review

ask-bonk

A few issues to address before merge.

Out-of-scope notes (not blocking):

bin/compare-model-sources.ts references the deleted local JSON directories. Since model data now lives in middlecache, this utility is dead code and should be removed or rewritten in a follow-up PR.
AGENTS.md still references the deleted workers-ai-models content collection. Update it to reflect the new ai-catalog and workers-ai-catalog collections.

ask-bonk · 2026-05-01T13:51:36Z

Review complete. No build breakers found (pnpm run check and pnpm run lint both pass). Posted 2 inline suggestions and noted 2 out-of-scope items.

Issues:

HIGH — src/util/model-resolver.ts:66: schema_manifest.files contains both raw .json schema files and pre-processed .rows.json files for lazy UI rendering. Because detailToResolved keeps all basenames, .rows.json files leak into the API Schemas (Raw) download section where schemaFileLabel mislabels them (e.g., "Rows" instead of "Synchronous · Input"). Suggested filtering out .rows.json files when building schemaFiles.
LOW — src/components/models/ModelDetailPage.astro:163: Comment says schemaFiles stores full R2 paths, but model-resolver.ts maps them to basenames only. Comment needs updating.
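The HIGH item's suggested fix could look something like the sketch below. `rawSchemaFiles` is a hypothetical name and the manifest is reduced to a plain list of R2 paths; the actual change in model-resolver.ts is not shown in this thread.

```typescript
// Keep only raw schema basenames; drop the pre-processed .rows.json files
// so they never reach the API Schemas (Raw) section.
function rawSchemaFiles(manifestPaths: string[]): string[] {
  return manifestPaths
    .map((p) => p.split("/").pop() ?? p) // full R2 path -> basename
    .filter((name) => name.endsWith(".json") && !name.endsWith(".rows.json"));
}
```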

Out-of-scope notes (not blocking):

bin/compare-model-sources.ts is dead code referencing the deleted local JSON directories — should be removed or rewritten for middlecache in a follow-up.
AGENTS.md still references the deleted workers-ai-models collection and should be updated to ai-catalog + workers-ai-catalog.

github run

- All schema files now live under /ai/models/{slug}/ for both catalog and legacy models; schemaBasePath uses r2Slug (@ stripped) for all
- parameters.json replaces multiple .rows.json files: single fetch contains pre-processed SchemaRowData for all modes
- SchemaDisplayLazy updated to fetch parameters.json and render both flat (input/output tabs) and multi-mode (primary mode tabs) layouts
- Worker proxy + dev middleware simplified: single regex for /ai/models/<slug>/(parameters.json|{mode}-{dir}.json) → R2 models/<slug>/
- schema_manifest.{parameters_json, raw_files} shape consumed correctly
- No workers-ai URL changes: page URLs unchanged, just schema fetch paths

SchemaDisplayLazy now replicates ModelDetailPage's production rendering:
- Flat models: Input heading + RowsDisplay, Output heading + RowsDisplay
- Multi-mode with shared input: Input once + per-mode output collapsibles (Synchronous, Streaming, Batch) matching the production collapsible layout
- Multi-mode with differing inputs: per-mode collapsibles with Input/Output

Detects shared inputs by comparing JSON serialization of RowsPayload, mirroring the production firstInputJson/allInputsIdentical check.
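The shared-input check described above might be sketched as follows. The function name echoes the `allInputsIdentical` flag mentioned in the commit, but the signature is an assumption and the payloads are typed as `unknown` because the real RowsPayload shape is not shown here.

```typescript
// True when every mode's input payload serializes to the same JSON string.
// Note: JSON.stringify is sensitive to key order, so two payloads with the
// same fields in a different order count as different.
function allInputsIdentical(inputs: unknown[]): boolean {
  if (inputs.length === 0) return true;
  const firstInputJson = JSON.stringify(inputs[0]);
  return inputs.every((input) => JSON.stringify(input) === firstInputJson);
}
```

String comparison is cheap and good enough when all payloads come out of the same pipeline step, since key order is then consistent.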

…meters
- Fetch parameters.json at build time in ModelDetailPage.astro frontmatter
- SchemaDisplayFromRows.astro renders pre-processed SchemaRowData using SchemaTreeView/SchemaVariantSelector (pure Astro, no React component)
- ModelDetailPage uses four clean separate conditional blocks instead of one ternary chain; Starlight Tabs used correctly for flat Input/Output, matching production layout exactly
- Delete SchemaDisplayLazy.tsx (React component no longer needed)

Plain <h2 id="..."> elements inside StarlightPage slots don't pass through the rehype pipeline, so they never get the .heading-wrapper anchor link structure. Replace all section-level h2s with AnchorHeading (same pattern used by WranglerCommand, CompatibilityFlags, etc.).

Replace repeated Open/Download Lucide SVGs in ModelDetailPage with a dedicated SchemaFileLinks.astro component.

Replace Date.now() with a stable hash of the schema JSON so sessionStorage expand/collapse state persists across builds.
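One way to get a build-stable id is a small non-cryptographic hash over the serialized schema. The commit does not say which hash was used; the FNV-1a style function below is a stand-in, and both function names are hypothetical.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units, rendered as 8 hex digits.
function stableHash(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  const hex = (hash >>> 0).toString(16);
  return ("0000000" + hex).slice(-8);
}

// Same schema JSON in, same id out, on every build.
function schemaId(schema: unknown): string {
  return `schema-${stableHash(JSON.stringify(schema))}`;
}
```

Because the id depends only on the schema content, a sessionStorage key built from it keeps pointing at the same entry until the schema itself changes.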

ModelDetailPage.astro now imports ModelInfo.astro (pure display, no interactivity). ModelInfo.tsx kept for use inside ModelCatalog.tsx (client React context).

Fixes any types, fragment-in-tr HTML bug, and locale-sensitive Intl formatting that could differ between build and client.
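On the Intl point: one common way to keep build output and client output identical is to pin the locale rather than rely on the runtime default. Whether the PR did exactly this is not stated; the snippet below is only an illustration, and `formatCount` is a hypothetical name.

```typescript
// Pinning "en-US" means the build machine and the browser format the same way,
// regardless of their default locales.
const countFormatter = new Intl.NumberFormat("en-US");

function formatCount(value: number): string {
  return countFormatter.format(value);
}

// formatCount(1234567) returns "1,234,567"
```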

…ayout Cases 1 and 4 (build-time schema and pre-processed rows) now share the same MultiModeParams component for the shared-input/differing-input collapsible structure.

…psible layout" This reverts commit f373f62.

/workers-ai/glossary/ → /workers-ai/platform/glossary/

mvvmm · 2026-05-01T16:41:11Z

/bonk please review

ask-bonk

A few minor issues to address.

Out-of-scope notes (not blocking):

Cases 1 and 2 in ModelDetailPage.astro (lines 460-561) are now dead code because detailToResolved always sets schema to empty objects and apiModes to undefined. With the middlecache path, only Cases 3 and 4 (parametersFlat / parametersMulti) ever execute. Consider removing the legacy build-time schema paths in a follow-up PR.
bin/compare-model-sources.ts still references deleted local JSON directories -- should be removed or rewritten for middlecache.

ask-bonk · 2026-05-01T16:54:58Z

Review complete. pnpm run check and pnpm run lint both pass. No build breakers found. Posted 3 inline suggestions.

Out-of-scope notes (not blocking):

Cases 1 and 2 in ModelDetailPage.astro (lines 460–561) are dead code now that detailToResolved always sets empty schema and apiModes: undefined. Only Cases 3 and 4 execute with middlecache data — consider removing the legacy build-time schema paths in a follow-up.
bin/compare-model-sources.ts still references deleted local JSON directories — should be removed or rewritten for middlecache.

github run

- model-resolver.ts: filter .rows.json from schemaFiles defensively (worker proxy only serves raw .json schema files)
- ModelDetailPage.astro: clarify Cases 1 and 2 are legacy/unused paths now that detailToResolved always sets empty schema/apiModes
- ModelFeatures.astro: add trailing slash to /workers-ai/function-calling/
- bin/compare-model-sources.ts: delete dead code referencing deleted local JSON directories
- AGENTS.md: replace deleted workers-ai-models collection with new ai-catalog and workers-ai-catalog middlecache-backed collections

- bin/fetch-models.ts: downloads models.tar.gz from middlecache and extracts to .tmp/middlecache/v1/workers-ai-model-catalog/models/ (same pattern as bin/fetch-skills.ts)
- package.json: wire into prebuild and predev hooks
- ModelDetailPage.astro: use downloadToDotTempIfNotPresent for parameters.json; reads from .tmp/ if already extracted by fetch-models.ts (no-op), otherwise fetches from middlecache HTTP

Allows triggering a production rebuild programmatically via the GitHub Actions API without needing to push a commit.
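As a sketch of what such a programmatic trigger looks like, the helper below builds a request for GitHub's "create a workflow dispatch event" REST endpoint. The helper name and the `production` ref in the usage note are assumptions; the repository and workflow file names come from this PR.

```typescript
interface DispatchOptions {
  owner: string;
  repo: string;
  workflowFile: string; // workflow file name or numeric workflow id
  ref: string; // branch or tag to run the workflow on
  token: string; // token allowed to trigger workflows in the repo
}

interface DispatchRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the workflow_dispatch request.
function buildWorkflowDispatch(opts: DispatchOptions): DispatchRequest {
  return {
    url: `https://api.github.com/repos/${opts.owner}/${opts.repo}/actions/workflows/${opts.workflowFile}/dispatches`,
    method: "POST",
    headers: {
      Accept: "application/vnd.github+json",
      Authorization: `Bearer ${opts.token}`,
      "X-GitHub-Api-Version": "2022-11-28",
    },
    body: JSON.stringify({ ref: opts.ref }),
  };
}
```

A caller would pass the result to `fetch(req.url, req)`, for example with `owner: "cloudflare"`, `repo: "cloudflare-docs"`, `workflowFile: "publish-production.yml"`, and a ref such as `"production"` (the branch name is a guess).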

…fetch-models
- getDotTmpPath() in model-resolver.ts resolves .tmp/ relative to src/util/, which reaches the repo root; ModelDetailPage.astro was resolving from src/components/models/, giving src/.tmp/ instead of .tmp/
- fetch-models.ts now clears stale catalog files on each run so all-models-detail.json etc. are always re-fetched after a pipeline graduation

- [fetch-models] and [fetch-skills] blue prefix tags via kleur
- dim text for skip message, cyan for re-fetch hint
- fetch-models: move stale cache cleanup before download so downloadToDotTempIfNotPresent always re-fetches after --force

- ModelCatalog.astro: server-renders all cards as static HTML with data-* attributes; vanilla JS <script> owns filter/sort/URL state
- FilterDropdown.tsx + FilterDropdownWrapper.tsx: extracted from old ModelCatalog.tsx; wrapper self-initialises from URL params on mount and communicates via model-catalog-filter CustomEvents
- SortSelect.tsx + SortSelectWrapper.tsx: same pattern for sort order
- ModelBadges.astro: direct port of ModelBadges.tsx (no interactivity)
- ModelInfo.astro: widened Props to accept ModelCardData | ResolvedModel
- Deleted ModelCatalog.tsx

mvvmm · 2026-05-04T05:12:52Z

/bonk please review

ask-bonk

Review complete. All previously flagged issues have been addressed (pnpm run check, pnpm run lint, and pnpm run format:core:check all pass). Posted 3 inline suggestions for a history-state bug and duplicate event dispatches.

ask-bonk · 2026-05-04T05:21:06Z

Review complete. All previously flagged issues have been addressed (pnpm run check, pnpm run lint, and pnpm run format:core:check all pass). Posted 3 inline suggestions.

Issues:

MEDIUM — src/components/ModelCatalog.astro:385-393: syncUrl unconditionally calls history.pushState, creating duplicate history entries even when the filter state matches the current URL. This breaks browser back navigation. Suggested adding a guard that only pushes when the URL would actually change.
MEDIUM — src/components/models/FilterDropdown.tsx:32-40 and src/components/models/SortSelect.tsx:23-30: Both inner components dispatch catalog filter/sort events, but their wrappers (FilterDropdownWrapper, SortSelectWrapper) already dispatch the same events via onChange. Every user interaction therefore fires the event twice, causing applyFilters (and syncUrl) to run twice. Suggested removing the duplicate dispatches from the presentational components and letting the wrappers own the events.
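The pushState guard suggested in the first item could be sketched like this. The signature is hypothetical: the history object and the current URL are passed in so the logic is easy to follow in isolation, whereas the real syncUrl in ModelCatalog.astro presumably reads them from the browser globals.

```typescript
// Minimal slice of the History API that the guard needs.
interface HistoryLike {
  pushState(data: unknown, unused: string, url: string): void;
}

// Push a new history entry only when the URL would actually change.
// Returns true when an entry was pushed.
function syncUrl(history: HistoryLike, currentUrl: string, nextUrl: string): boolean {
  if (nextUrl === currentUrl) return false; // same state: no duplicate entry
  history.pushState(null, "", nextUrl);
  return true;
}
```

In the browser this would be called as `syncUrl(window.history, location.pathname + location.search, next)`.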

Out-of-scope notes (not blocking):

src/components/models/ModelFeatures.tsx (the old React component) is now unused dead code — the new ModelFeatures.astro has superseded it. Consider removing the .tsx file in a follow-up.

github run

- FilterDropdown: remove CustomEvent dispatch (FilterDropdownWrapper owns it; both were firing on every selection, causing applyFilters to run twice)
- SortSelect: same fix, remove duplicate dispatch
- FilterDropdown: remove unused filterKey prop (only the wrapper needs it)
- ModelCatalog.astro syncUrl: only pushState when URL actually changes (previously pushed duplicate history entries on every filter pass, breaking browser back navigation)
- Delete ModelFeatures.tsx (superseded by ModelFeatures.astro)
- compare-model-sources.ts was already deleted in a prior commit

downloadToDotTempIfNotPresent was fetching from middlecache when the file wasn't found, but individual model parameters.json files aren't served as standalone HTTP endpoints (they only exist in models.tar.gz). In CI this caused every model detail page to get a 404 HTML response and silently render without a Parameters section. parameters.json is always extracted by bin/fetch-models.ts (prebuild); read it directly with readFileSync instead.

Previously ai-catalog.json, workers-ai-catalog.json, and all-models-detail.json were fetched lazily by downloadToDotTempIfNotPresent during the Astro build. Now fetch-models.ts pulls all four files (three JSONs + tarball) in parallel before the build starts, so the entire catalog payload is on disk and nothing depends on HTTP during the build.

…tags
- All log/warn/error messages now start with the blue tag
- Warnings are yellow, errors are red, success is green
- Extracted tag into a const to avoid repetition

import.meta.url in getDotTmpPath() and downloadToDotTempIfNotPresent resolves relative to the *compiled* file location during astro build, which is dist/ — making ../../.tmp resolve to dist/.tmp/ instead of the repo root .tmp/ where fetch-models.ts actually writes files. process.cwd() is always the repo root regardless of where the compiled output lives.
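The fix amounts to anchoring on the working directory instead of the module location. A minimal sketch, assuming the build is always launched from the repo root; `dotTmpPath` is a hypothetical name, not the PR's getDotTmpPath():

```typescript
import * as path from "node:path";

// Resolve a file under the repo-root .tmp/ directory.
// process.cwd() does not move when the module is compiled into dist/,
// unlike a path derived from import.meta.url.
function dotTmpPath(...segments: string[]): string {
  return path.join(process.cwd(), ".tmp", ...segments);
}
```

The trade-off is that this breaks if a script is started from a subdirectory, so it relies on the package.json hooks always running from the root.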

mvvmm added 7 commits April 30, 2026 20:19

Fix: move Record<string,string> type annotation out of JSX template

0530a75

Astro parses <string, string> as JSX inside template expressions. Move label derivation logic to frontmatter as a typed helper function.

Proxy AI model schema JSON files in dev middleware

20f6e7d

In production, schema files are served by the Worker via R2. In local dev the Worker doesn't run, so proxy matching requests to middlecache.

Fix: disable content encoding on schema proxy fetch in dev middleware

c7b4f04

github-actions Bot added the size/xl label May 1, 2026

github-actions Bot assigned kodster28 May 1, 2026

mvvmm added 2 commits April 30, 2026 20:53

Delete local model JSON files and fetch scripts

a56d8c3

No longer needed — model data is fetched from middlecache at build time.

Fix: run prettier on changed files

233bc1d

mvvmm added 4 commits April 30, 2026 21:38

Remove stale fetchModelDetail references in comments

dbb6e5f

ask-bonk Bot reviewed May 1, 2026

Comment thread src/util/model-resolver.ts Outdated

Comment thread src/components/models/ModelDetailPage.astro Outdated

mvvmm added 10 commits May 1, 2026 09:09

Fix: remove unused useCallback import

89f424b

Extract inline SVGs to SchemaFileLinks.astro

a018462

Replace repeated Open/Download Lucide SVGs in ModelDetailPage with a dedicated SchemaFileLinks.astro component.

Fix non-deterministic schemaId in SchemaDisplay.astro

1f6f7a1

Replace Date.now() with a stable hash of the schema JSON so sessionStorage expand/collapse state persists across builds.

Fix invalid <p><ul> nesting in nova-3 pricing Aside

c72181b

Convert ModelInfo from React to Astro for detail page usage

a26277e

ModelDetailPage.astro now imports ModelInfo.astro (pure display, no interactivity). ModelInfo.tsx kept for use inside ModelCatalog.tsx (client React context).

Convert ModelFeatures from React to Astro

ef1c0e6

Fixes any types, fragment-in-tr HTML bug, and locale-sensitive Intl formatting that could differ between build and client.

mvvmm added 3 commits May 1, 2026 11:24

Extract MultiModeParams.astro to deduplicate multi-mode collapsible l…

f373f62

…ayout Cases 1 and 4 (build-time schema and pre-processed rows) now share the same MultiModeParams component for the shared-input/differing-input collapsible structure.

Revert "Extract MultiModeParams.astro to deduplicate multi-mode colla…

13fb3c7

…psible layout" This reverts commit f373f62.

Fix glossary link URL in ModelFeatures

008db28

/workers-ai/glossary/ → /workers-ai/platform/glossary/

ask-bonk Bot reviewed May 1, 2026

Comment thread src/util/model-resolver.ts Outdated

Comment thread src/components/models/ModelDetailPage.astro Outdated

Comment thread src/components/models/ModelFeatures.astro Outdated

ask-bonk Bot reviewed May 1, 2026

Comment thread src/util/model-resolver.ts Outdated

Comment thread src/components/models/ModelDetailPage.astro Outdated

Comment thread src/components/models/ModelFeatures.astro Outdated

ask-bonk Bot reviewed May 1, 2026

Comment thread src/util/model-resolver.ts Outdated

ask-bonk Bot reviewed May 1, 2026

Comment thread src/components/models/ModelDetailPage.astro Outdated

ask-bonk Bot reviewed May 1, 2026

Comment thread src/components/models/ModelFeatures.astro Outdated

mvvmm added 6 commits May 1, 2026 11:58

Add workflow_dispatch to publish-production workflow

226ee16

Allows triggering a production rebuild programmatically via the GitHub Actions API without needing to push a commit.

ask-bonk Bot reviewed May 4, 2026

Comment thread src/components/ModelCatalog.astro Outdated

Comment thread src/components/models/FilterDropdown.tsx

Comment thread src/components/models/SortSelect.tsx

mvvmm added 5 commits May 4, 2026 00:29

Prefix all console output with colored [fetch-models]/[fetch-skills] …

b1e5fa8

…tags
- All log/warn/error messages now start with the blue tag
- Warnings are yellow, errors are red, success is green
- Extracted tag into a const to avoid repetition

2 participants
