PRD: Auto-sync examples from upstream repos via PR-per-change pipeline

## Problem Statement

As the maintainer of the Liquid AI documentation site, I need every example MDX page under `examples/**` to faithfully reflect its upstream source in either `Liquid4All/cookbook` or `Liquid4All/LeapSDK-Examples`. Today this is a manual process: whenever someone updates a cookbook example or adds a new LeapSDK demo, I have to spot the change, open the upstream README, and hand-edit the corresponding MDX page in this repo. The work is tedious and easy to forget, which produces "rotten examples" where the docs and the upstream code disagree (a CLI flag changes, a default model is swapped, a dependency is upgraded, and the docs lie about it).

I want an automated pipeline that opens a PR in this docs repo every time an upstream example's README changes, so I stay in the loop for review and quality but lose the toil.

## Solution

A sync pipeline driven by `repository_dispatch` events. Each upstream repo (`cookbook`, `LeapSDK-Examples`) hosts a small GitHub Actions workflow that fires a webhook into this docs repo whenever a README under its examples directory changes on `main`. The docs repo has a workflow that receives those events. For each event:

1. It looks up which docs MDX file the upstream example maps to (via a manifest committed to this repo).
2. It calls an LLM to produce an updated (or brand-new) MDX file from the README.
3. It opens or appends to a sync PR on a deterministic per-example branch.

The maintainer reviews each PR, tweaks if needed, and merges. The bot maintains one open PR per example at a time. Closing a PR without merging tells the bot not to nag again until upstream advances further.

New examples (folders that don't yet have a manifest entry) land in an `examples/_inbox/` directory with a categorization checklist. A CI guard prevents PRs from merging while any file is still in the inbox, forcing the maintainer to assign a final category and slug during review.

## User Stories

1. As a docs maintainer, I want a PR opened automatically when an upstream example's README changes, so I no longer have to manually scan upstream repos for changes.
2. As a docs maintainer, I want the bot to draft the MDX changes for me, so I only have to review and tweak rather than write from scratch.
3. As a docs maintainer, I want the bot to detect README changes on `main` of either upstream repo, so I don't have to instrument any other branches.
4. As a docs maintainer, I want at most one open sync PR per example at any time, so my review queue isn't flooded with duplicates.
5. As a docs maintainer, I want bursts of upstream commits (e.g., four typo fixes in one merge) to collapse into a single PR, so I'm not spammed.
6. As a docs maintainer, I want subsequent upstream changes during my review to append commits to the existing PR, so I see one stack of bot updates rather than a swarm of PRs.
7. As a docs maintainer, I want appended bot commits to not clobber any review edits I've already made on the PR branch, so my manual tweaks survive.
8. As a docs maintainer, when I close a sync PR without merging, I want the bot to stop opening PRs for that example until upstream advances past the SHA I rejected, so rejection actually means something.
9. As a docs maintainer, when upstream eventually changes the example again, I want the rejection gate to release automatically, so I don't have to remember to re-enable syncing.
10. As a docs maintainer, when an upstream contributor adds a brand-new example folder, I want a PR that creates a draft MDX in an inbox location, so I don't have to do the initial bot-quality drafting myself.
11. As a docs maintainer, on a new-example PR, I want a checklist that reminds me to move the file out of the inbox, choose a category, set the slug, and update the manifest, so I don't accidentally merge an uncategorized file.
12. As a docs maintainer, I want CI to block any PR that still has files in the inbox from being merged, so categorization can never be skipped.
13. As a docs maintainer, I want the bot to learn my house style for example MDX (Card opener, Accordion conventions, frontmatter shape) from a single style guide file plus a few exemplar MDX pages, so PRs feel consistent without me writing the style guide into every prompt.
14. As a docs maintainer, I want few-shot exemplars chosen from the same docs category as the target example (e.g., a Web example syncs against other Web exemplars), so the bot mimics category-specific conventions.
15. As a docs maintainer, I want to mark exemplar MDX files explicitly in the manifest, so the bot doesn't reach for an arbitrary "close enough" file that might itself be in a weird state.
16. As a docs maintainer, I want the bot to receive a list of valid internal docs routes (e.g., `/lfm/models/lfm25-1.2b-thinking`) as part of every prompt, so it cannot hallucinate broken internal links.
17. As a docs maintainer, when the bot updates an existing MDX, I want it to be given the previous upstream README, the new upstream README, and the current MDX, so it can produce a minimal targeted edit rather than a full rewrite.
18. As a docs maintainer, I want the bot to preserve manual edits I've made to an MDX between syncs (Accordions I added, prose I rephrased), so my human-curated content isn't silently overwritten.
19. As a docs maintainer, I want the bot to commit an updated `last_synced_sha` into the same PR as the MDX change, so merging the PR advances the sync state in a single atomic action.
20. As a docs maintainer, I want the sync to be a no-op when the upstream README hasn't actually changed since the last sync, so I never get a PR with an empty diff.
21. As a docs maintainer, I want PR titles to follow a predictable format (`Sync: <docs_path>` for updates, `New example: <slug> (needs categorization)` for inbox PRs), so I can scan my queue quickly.
22. As a docs maintainer, I want each sync PR body to link to the upstream commit that triggered it, so I can see the source change without leaving the PR.
23. As a docs maintainer, I want the bot's commit messages to reference the upstream commit short-SHA, so the per-PR commit log is an audit trail.
24. As a docs maintainer, I want the manifest to live in this docs repo (not in upstream), so I don't have to ask upstream contributors to maintain anything specific to docs.
25. As a docs maintainer, I want the manifest to be a single YAML file I can inspect and hand-edit if needed, so I'm not locked out of recovery when something goes wrong.
26. As a docs maintainer, when an upstream folder is renamed, I want the ability to manually patch the manifest's `path` field, so I don't have to choose between losing history and re-creating the entry.
27. As a docs maintainer, I do not want the bot to validate MDX in v1, so the initial implementation stays simple. CI will surface MDX parse errors and broken links on the PR.
28. As an upstream contributor in `cookbook`, I want the docs-sync workflow in my repo to be small and lightweight, so it doesn't slow down my normal pushes.
29. As an upstream contributor, I do not want to maintain a docs manifest in my repo, so the docs sync remains the docs team's concern.
30. As an upstream contributor, I want my README to be the only file that triggers a docs sync, so I have a clear contract: keep the README in sync with my code, and the docs follow automatically.
31. As a docs maintainer, I want the upstream workflow's PAT to be scoped narrowly (a fine-grained token limited to the docs repo, with only the permission needed to create `repository_dispatch` events), so a leak does minimal damage.
32. As a docs maintainer, I want the upstream workflows to fire one dispatch per changed README (not per push), so each example syncs independently even if a single push touches several.
33. As a docs maintainer, I want the docs-repo workflow to use a `concurrency` group keyed on `(upstream_repo, example_path)` with `cancel-in-progress: true`, so concurrent bursts naturally resolve to the latest state.
34. As a docs maintainer, I want the bot to skip the LLM call when the upstream README's SHA matches the manifest's `last_synced_sha`, so I don't burn API credit on no-ops.
35. As a docs maintainer, when an event arrives for an upstream example with no manifest entry, I want it routed through the new-example flow, so the inbox PR mechanism kicks in.
36. As a docs maintainer, I want the `last_rejected_sha` mechanism to be stored next to each manifest entry, so per-example rejection state is co-located with mapping state.
37. As a docs maintainer, I want the bot to choose a deterministic branch name per example (`sync/<docs-slug>` for updates, `sync/new-<slug>` for inbox PRs), so branch state is always predictable.
38. As a docs maintainer, I want the LLM model used to be configurable via the workflow file, so I can swap Sonnet for Opus (or back) without code changes.
39. As a documentation reader, I want example pages on the live site to accurately reflect the corresponding upstream code, so I'm not misled when I clone the example and run it.
40. As a documentation reader, I want example pages to keep using consistent Mintlify components, so the docs feel coherent across categories.
41. As a docs maintainer, I want the architecture and design rationale captured in a written plan in the repo, so a future contributor (or AI agent) can implement and extend this system without having to reconstruct the design from chat history.

## Implementation Decisions

The decisions below were made in a `/grill-me` design session and are captured in detail in `EXAMPLES_SYNC_AUTOMATION.md` at the repo root.

### Architectural shape

- **Event-driven**, not polling. Each upstream repo hosts a workflow that fires `repository_dispatch` into this docs repo when a README under its example tree changes on `main`.
- **Manifest in docs repo**, not upstream. A single YAML file in this repo holds all mappings and sync state. No instrumentation required in upstream beyond the dispatch workflow.
- **One docs-repo workflow** receives dispatches, runs the sync, opens/updates PRs. Concurrency-grouped on `(upstream_repo, example_path)` with `cancel-in-progress: true` to collapse bursts.
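
The fan-out from one upstream push to per-example dispatches can be sketched as below. This is a minimal sketch under stated assumptions: the `SyncDispatch` field names, the `examples` root default, and the concurrency key format are hypothetical, since the PRD does not fix a payload shape.

```typescript
// Hypothetical payload shape: the PRD does not fix field names.
interface SyncDispatch {
  upstreamRepo: string; // e.g. "Liquid4All/cookbook"
  examplePath: string;  // folder containing the changed README
  headSha: string;      // commit on main that triggered the event
}

const README_SUFFIX = "/README.md";

// Returns one dispatch per example folder whose README changed in the push.
function dispatchesForPush(
  upstreamRepo: string,
  headSha: string,
  changedFiles: string[],
  examplesRoot: string = "examples", // assumed root; may differ per upstream repo
): SyncDispatch[] {
  const folders: string[] = [];
  for (const file of changedFiles) {
    const underRoot = file.indexOf(examplesRoot + "/") === 0;
    const isReadme = file.slice(-README_SUFFIX.length) === README_SUFFIX;
    if (!underRoot || !isReadme) continue;
    const folder = file.slice(0, -README_SUFFIX.length);
    // Skip the root README itself and deduplicate repeated paths.
    if (folder !== examplesRoot && folders.indexOf(folder) === -1) {
      folders.push(folder);
    }
  }
  return folders.map((examplePath) => ({ upstreamRepo, examplePath, headSha }));
}

// Key for the docs-repo workflow's concurrency group (format is illustrative).
function concurrencyKey(d: SyncDispatch): string {
  return "sync-" + d.upstreamRepo + "-" + d.examplePath;
}
```

Because the key depends only on the repo and example path, two dispatches for the same example share a group, while a push touching several examples produces independent runs.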

### Content generation

- **Mode**: LLM transformation. Two distinct prompt modes:
  - **Update mode**: receives previous upstream README, new upstream README, current MDX, exemplars, valid-routes list, style guide. Produces a minimal edit. Designed to preserve human edits in the current MDX.
  - **New-example mode**: receives new upstream README, exemplars, valid-routes list, style guide. Produces a complete MDX from scratch.
- **Source surface**: upstream `README.md` only. Code files are not watched. The contract with upstream owners is that they update their README when behavior or commands change.
- **Style guidance**: a maintained `STYLE.md` (system prompt) + 1 to 2 category-scoped exemplar MDX files marked `exemplar: true` in the manifest.
- **Internal link integrity**: the prompt includes the list of valid internal docs routes (enumerated from `lfm/`, `leap/`, `examples/`, `deployment/`), preventing hallucinated paths.
- **Model**: Claude Sonnet 4.6 as default. Configurable. Opus 4.7 as upgrade path if quality is insufficient. Per-sync cost estimated under $0.20.
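
The two prompt modes differ only in which inputs they carry, which a discriminated union makes explicit. The sketch below is illustrative: the type names, section titles, and task wording are assumptions, not the prompts specified in `EXAMPLES_SYNC_AUTOMATION.md`.

```typescript
// Hypothetical shapes: field names and section titles are illustrative.
interface SharedInputs {
  styleGuide: string;    // contents of STYLE.md, sent as the system prompt
  exemplars: string[];   // 1 to 2 category-scoped exemplar MDX files
  validRoutes: string[]; // e.g. "/lfm/models/lfm25-1.2b-thinking"
  newReadme: string;
}
interface UpdateInputs extends SharedInputs {
  mode: "update";
  previousReadme: string;
  currentMdx: string;
}
interface NewInputs extends SharedInputs {
  mode: "new";
}
type PromptInputs = UpdateInputs | NewInputs;

interface Prompt {
  system: string;
  user: string;
}

function section(title: string, body: string): string {
  return "## " + title + "\n\n" + body.trim() + "\n";
}

function buildPrompt(inputs: PromptInputs): Prompt {
  const parts: string[] = [];
  inputs.exemplars.forEach((mdx, i) => {
    parts.push(section("Exemplar " + (i + 1), mdx));
  });
  parts.push(section("Valid internal routes", inputs.validRoutes.join("\n")));
  if (inputs.mode === "update") {
    // Update mode sees old README, new README, and the current MDX.
    parts.push(section("Previous upstream README", inputs.previousReadme));
    parts.push(section("New upstream README", inputs.newReadme));
    parts.push(section("Current MDX", inputs.currentMdx));
    parts.push(section("Task", "Apply the minimal edit that reflects the README change. Preserve everything else in the current MDX."));
  } else {
    // New-example mode has no prior state to preserve.
    parts.push(section("New upstream README", inputs.newReadme));
    parts.push(section("Task", "Write a complete MDX page for this example."));
  }
  return { system: inputs.styleGuide, user: parts.join("\n") };
}
```

Keeping prompt assembly a pure function of its inputs is what lets tests assert on prompt shape without calling the model.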

### Manifest schema

Prototype-derived YAML shape (kept for precision):

```yaml
mappings:
  - upstream: Liquid4All/cookbook
    path: examples/flight-search-assistant
    docs_path: examples/laptop-examples/flight-search-assistant.mdx
    last_synced_sha: <sha-of-upstream-README-at-last-merged-sync>
    last_rejected_sha: <sha-at-last-closed-unmerged-PR, or null>
    exemplar: false
```

`last_synced_sha` drives update mode: it lets the bot fetch the previous version of the README to compare against the new one. `last_rejected_sha` gates re-opening PRs after a maintainer rejection.
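
How the two SHA fields gate a sync can be sketched as a small decision function. The field names come from the schema above; the decision labels, the check order, and the assumption that the incoming SHA is directly comparable to the stored ones are illustrative.

```typescript
// Field names follow the manifest schema; everything else is a hypothetical sketch.
interface ManifestEntry {
  upstream: string;
  path: string;
  docs_path: string;
  last_synced_sha: string;
  last_rejected_sha: string | null;
  exemplar: boolean;
}

type SyncDecision = "NEW_EXAMPLE" | "SKIP_UNCHANGED" | "SKIP_REJECTED" | "UPDATE";

// incomingSha is assumed to identify the upstream README state in the dispatch.
function decide(entry: ManifestEntry | undefined, incomingSha: string): SyncDecision {
  if (entry === undefined) return "NEW_EXAMPLE"; // no mapping yet: inbox flow
  if (entry.last_synced_sha === incomingSha) return "SKIP_UNCHANGED"; // no LLM call
  if (entry.last_rejected_sha === incomingSha) return "SKIP_REJECTED"; // maintainer said no
  return "UPDATE"; // upstream advanced past both markers
}
```

The rejection gate releases on its own: once upstream advances, the incoming SHA no longer equals `last_rejected_sha` and the function falls through to `UPDATE`.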

### PR lifecycle

- **Deterministic branch name** per example: `sync/<docs-slug>` for updates, `sync/new-<slug>` for new examples. The MDX path's slug is the source of truth.
- **One open PR per example at a time.** Before opening, the workflow checks whether a PR is already open from the example's branch. If so, it appends a commit instead of opening a duplicate.
- **Append, not force-push.** Human edits made to the PR branch during review are preserved by appending bot commits on top.
- **PR includes the manifest bump.** Merging the PR atomically advances `last_synced_sha`. Without this, the bot would loop indefinitely on the same diff.
- **Rejected-PR semantics.** A closed-without-merge PR is detected by a companion workflow that listens for the `pull_request` `closed` event on `sync/*` branches. That workflow commits `last_rejected_sha` to the manifest directly on `main`. Future dispatches carrying the same SHA are skipped; once upstream advances, the gate releases.
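
The lifecycle rules above reduce to a pure decision over the branch's current state. The sketch below assumes a hypothetical `BranchState` representation; the action names mirror the outcomes listed for the PR orchestrator, and the real module would gather this state through `gh` calls.

```typescript
// Hypothetical types: the PRD names the outcomes but not their representation.
type BranchState =
  | { kind: "no-branch" }
  | { kind: "open-pr" }
  | { kind: "closed-unmerged"; rejectedSha: string }
  | { kind: "merged" };

type PrAction =
  | "create-pr"
  | "append-commit"
  | "skip-rejected"
  | "fresh-pr-after-advance";

// Branch names are deterministic per example.
function branchName(mode: "update" | "new", slug: string): string {
  return mode === "update" ? "sync/" + slug : "sync/new-" + slug;
}

function nextAction(state: BranchState, incomingSha: string): PrAction {
  // An open PR always receives an appended commit, never a force-push.
  if (state.kind === "open-pr") return "append-commit";
  if (state.kind === "closed-unmerged") {
    // Same SHA as the rejected one: stay quiet. Otherwise upstream advanced.
    return state.rejectedSha === incomingSha
      ? "skip-rejected"
      : "fresh-pr-after-advance";
  }
  // No branch yet, or the previous PR merged: start from a clean slate.
  return "create-pr";
}
```

Separating the decision from the side effects keeps the state machine testable with plain values rather than a mocked CLI.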

### New-example flow

- Dispatch arrives for an upstream example with no manifest entry. Bot enters new-example mode.
- MDX is written to `examples/_inbox/<slug>.mdx`. A provisional manifest entry is appended with `docs_path` pointing at the inbox.
- PR body includes a categorization checklist: move file, rename slug, update title, update manifest `docs_path`.
- **Inbox CI guard**: a workflow that runs on pull requests and on pushes to `main` fails if any file exists under `examples/_inbox/`. This prevents an uncategorized inbox file from being merged.
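
The guard itself can be very small. A minimal sketch, assuming the CI script receives the list of tracked files (for example from `git ls-files`) and exits non-zero when a message is returned; the function names and message wording are hypothetical.

```typescript
const INBOX_PREFIX = "examples/_inbox/";

// Returns the tracked files that still live under the inbox.
function inboxViolations(trackedFiles: string[]): string[] {
  return trackedFiles.filter((file) => file.indexOf(INBOX_PREFIX) === 0);
}

// Returns a failure message for CI, or null when the inbox is empty.
function guardMessage(violations: string[]): string | null {
  if (violations.length === 0) return null;
  return (
    "Inbox is not empty; categorize these files before merging:\n" +
    violations.map((file) => "  - " + file).join("\n")
  );
}
```

A new-example PR fails this check by construction until the maintainer moves the file, which is the intended forcing function.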

### Module breakdown

The implementation should be carved into deep modules with simple interfaces:

- **Manifest store**: loads, queries, mutates the manifest YAML. Methods: `find(repo, path)`, `upsert(entry)`, `mark_rejected(entry, sha)`, `bump_synced(entry, sha)`, `exemplars_for_category(category)`. Hides serialization. Replaceable.
- **PR orchestrator**: executes the idempotency state machine. Given (branch, files, metadata), decides between create-PR, append-commit, skip-rejected, fresh-PR-after-advance. Hides all `gh` CLI calls.
- **LLM content generator**: two methods, `generate_update(...)` and `generate_new(...)`. Hides prompt construction, Anthropic SDK, transient-error retries.
- **Sync state machine**: the workflow entry point. Receives the dispatch payload, decides UPDATE vs NEW_EXAMPLE vs SKIP, composes the modules above.
- **Upstream fetcher** (shallow): wraps `gh api` for README fetches at specific SHAs.
- **Routes enumerator** (shallow): walks the docs filesystem to produce the valid internal routes list.
- **Inbox guard** (shallow): CI script that fails if `examples/_inbox/*` is non-empty.
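
As an illustration of a deep module behind a simple interface, the manifest store might look like the in-memory sketch below. The method names are camelCase renderings of those listed above; YAML loading and saving is omitted, and deriving the category from the second segment of `docs_path` is an assumption based on the schema example.

```typescript
// Field names follow the manifest schema; the class is a hypothetical sketch.
interface ManifestEntry {
  upstream: string;
  path: string;
  docs_path: string;
  last_synced_sha: string;
  last_rejected_sha: string | null;
  exemplar: boolean;
}

class ManifestStore {
  private entries: ManifestEntry[];

  constructor(entries: ManifestEntry[]) {
    this.entries = entries.map((e) => ({ ...e })); // defensive copy
  }

  find(repo: string, path: string): ManifestEntry | undefined {
    return this.entries.filter((e) => e.upstream === repo && e.path === path)[0];
  }

  // Inserts the entry, replacing any existing one with the same (upstream, path).
  upsert(entry: ManifestEntry): void {
    const others = this.entries.filter(
      (e) => !(e.upstream === entry.upstream && e.path === entry.path),
    );
    this.entries = others.concat([{ ...entry }]);
  }

  markRejected(entry: ManifestEntry, sha: string): void {
    this.patch(entry, { last_rejected_sha: sha });
  }

  bumpSynced(entry: ManifestEntry, sha: string): void {
    this.patch(entry, { last_synced_sha: sha });
  }

  // Assumes docs_path looks like "examples/<category>/<slug>.mdx".
  exemplarsForCategory(category: string): ManifestEntry[] {
    return this.entries.filter(
      (e) => e.exemplar && e.docs_path.split("/")[1] === category,
    );
  }

  // Applies changes on top of the stored entry so a stale copy cannot clobber state.
  private patch(key: ManifestEntry, changes: Partial<ManifestEntry>): void {
    const current = this.find(key.upstream, key.path) || key;
    this.upsert({ ...current, ...changes });
  }
}
```

Callers never see how entries are stored, so swapping the array for a YAML-backed file only touches the constructor and a save method.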

### Validation

- **Skipped in v1.** Mintlify CI build and the existing `link-snapshot.yaml` check catch MDX errors at PR-check time. If failure rates prove too high, v2 will add a pre-PR validation step with a retry loop (LLM is re-prompted with the specific error).

### Secrets

- `ANTHROPIC_API_KEY` in this docs repo's secrets.
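- `DOCS_REPO_PAT` in each upstream repo's secrets: a fine-grained token restricted to `Liquid4All/docs`, with only the repository permission that GitHub requires for creating `repository_dispatch` events (Contents: write at the time of writing). Annual rotation.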

### Implementation phases

Recommended build order (each phase independently shippable):

1. Manifest scaffold: generate `.examples-sync.yaml` from current state of `examples/`, manually verify, commit.
2. Style guide: write `.examples-sync/STYLE.md` distilled from existing MDX conventions and the repo's `CLAUDE.md`.
3. Sync script (local-only): build the orchestrator and modules. Run it manually against one example, selected via command-line arguments.
4. Docs-repo workflow: add the GitHub Actions workflow. Configure `ANTHROPIC_API_KEY`. Test with manual dispatch.
5. Inbox guard CI.
6. Upstream workflows in both upstream repos, with `DOCS_REPO_PAT` secret in each.
7. End-to-end test with a deliberate upstream README change.
8. Closed-PR listener that maintains `last_rejected_sha`.

## Testing Decisions

### Test philosophy

Tests should exercise external behavior of each module, not its implementation. The contract that matters is the module's interface, not how it gets to the answer. A test like "calling `manifest.find(repo, path)` after `upsert` returns the upserted entry" tests behavior. A test like "the YAML is serialized with two-space indents" tests implementation and would block valid refactors.

For the LLM content generator specifically: tests should replay recorded LLM responses (cassettes) and verify that the prompt is shaped correctly (correct sections, correct exemplars, correct routes list), rather than verifying that the model produces specific MDX content. We don't own the model's output; we own the prompt.

### Modules to test

| Module | Priority | Test style |
|---|---|---|
| Manifest store | high | Unit. Fixture YAML in, behavior assertions out. Tests cover lookup, upsert, mark_rejected, bump_synced, exemplar selection. |
| PR orchestrator | high | Unit with mocked `gh` CLI. Cover all five state transitions: no-branch creates PR, open-PR appends, closed-unmerged-same-SHA skips, closed-unmerged-advanced-SHA creates fresh PR, merged-PR (clean slate) creates new. |
| LLM content generator | medium | Cassette-style. Record an Anthropic response, replay in tests. Assert that the prompt sent to the API contains the right sections and ordering. |
| Sync state machine | medium | Integration-style with all sub-modules mocked. Cover the three paths: UPDATE, NEW_EXAMPLE, SKIP_REJECTED. |
| Upstream fetcher | skip | Thin wrapper; covered transitively. |
| Routes enumerator | skip | Thin wrapper; covered transitively. |
| Inbox guard | skip | CI test itself is the regression check. |

### Prior art

The repo already uses `tsx`-run scripts under `scripts/` (e.g., `scripts/generateLinkSnapshot.ts`). New scripts should follow the same `tsx scripts/<file>.ts` pattern, per `CLAUDE.md`. The link-snapshot system (`link-snapshot.yaml` + `scripts/generateLinkSnapshot.ts` + `.github/workflows/check-link-snapshot.yaml`) is a strong reference: append-only ledger, CI enforcement, pre-commit hook. The examples-sync system has a similar shape (manifest + CI enforcement) and should reuse the same conventions.

## Out of Scope

- **Watching code files in upstream.** Only `README.md` is watched. If upstream changes a default model in code but doesn't update the README, docs won't auto-sync. This is explicitly delegated to upstream owners.
- **HF Spaces as a third upstream source.** Mentioned as an existing source of examples but excluded from v1 by the maintainer.
- **Pre-PR validation with retry loop.** Deferred to v2. v1 skips validation and relies on CI to surface failures.
- **Automatic categorization of new examples.** Inbox PR + checklist is the chosen UX. LLM-based category guessing is explicitly rejected because category names are not perfectly self-describing.
- **Manifest-in-upstream.** Considered and rejected. The decision to keep the manifest in docs accepts the tradeoff that upstream folder renames will surface as "new example detected" PRs.
- **Real-time sub-minute latency.** Dispatch latency is acceptable. Daily safety-net cron is also out of scope for v1 but listed as a v2 enhancement.
- **Replacing maintainer review.** The bot drafts; the maintainer always merges. No auto-merge.
- **Multi-context monorepo configuration.** This is a single-context docs repo. No `CONTEXT-MAP.md` is required.

## Further Notes

- The detailed architecture, mermaid flow diagram, schema specifics, prompt structures, and phase plan are captured in `EXAMPLES_SYNC_AUTOMATION.md` at the repo root. That document is the reference for implementers; this PRD is the user-story and decision-record view.
- The chosen "social contract" with upstream owners is: keep your README in sync with your code. This contract is intentional and should be documented in each upstream repo's `CONTRIBUTING.md` as part of the rollout. Optionally, an upstream CI check that fails when non-README files in an example folder change without the README being touched would harden this contract; that check is a separate piece of work.
- The `last_rejected_sha` field is a deliberate design choice to make rejection durable without making it permanent. Permanent opt-out is achievable by setting `last_rejected_sha` to a future-proof value or by adding a `sync: false` field; this is a v2 extension.
- The system is designed to gracefully degrade. If the LLM API is down, the dispatch fails and can be retried (via re-dispatching from upstream, or by manually running the workflow). If GitHub webhook delivery fails, a v2 daily safety-net cron is the planned remedy.
- Cost estimate: at Sonnet 4.6 rates, with ~25k input tokens (READMEs + exemplars + style guide + routes list) and ~5k output tokens per sync, each sync costs roughly $0.10 to $0.20. At expected upstream change volume, monthly cost is well under $20.
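
The cost estimate can be reproduced with a few lines of arithmetic. The per-token rates below are assumptions (3 USD per million input tokens, 15 USD per million output tokens) and are not taken from this PRD; substitute the current published rates before relying on the numbers.

```typescript
// ASSUMED rates in USD per million tokens; not from the PRD. Check current pricing.
const INPUT_USD_PER_MTOK = 3;
const OUTPUT_USD_PER_MTOK = 15;

function costPerSync(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens * INPUT_USD_PER_MTOK + outputTokens * OUTPUT_USD_PER_MTOK) / 1e6
  );
}

// Token counts from the estimate above: ~25k input, ~5k output per sync.
const perSyncUsd = costPerSync(25000, 5000); // about 0.15 USD under the assumed rates

// Number of syncs per month that fit inside a 20 USD budget.
const syncsWithinBudget = Math.floor(20 / perSyncUsd);
```

Under these assumed rates the midpoint lands inside the stated $0.10 to $0.20 range, and a 20 USD monthly budget covers on the order of a hundred syncs.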

