docs: backport 26.05 doc fixes (#2074, #2082, #2088) to main #2091

Merged: sosahi merged 5 commits into NVIDIA:main from kheiss-uwzoo:kheiss/backport-docs-26.05-to-main on May 21, 2026

@kheiss-uwzoo
Collaborator

Summary

Cherry-picks documentation merged on 26.05 onto main so the live docs site (published from main) matches the release-line fixes.

| Source PR | Status on main before this PR | Backported |
| --- | --- | --- |
| #2074: captioning/chart extraction vs Helm NIM topology (NVBugs 6195023, 6195296) | Not present | Yes |
| #2082: air-gapped deployment (NVBugs 6195103, PR #2052) | Not present | Yes |
| #2088: experimental retriever CLI subcommands (NVBugs 6199005, 6198526) | Not present | Yes |
| #2067: Helm NIM defaults, GA VL embedder | Already on main | N/A (skipped) |
| #2062: telemetry page removal, Omni caption NIM | Already on main | N/A (skipped) |

Note: The listed PRs were merged into 26.05, not 26.03. This branch uses focused cherry-picks from 26.05 to main rather than a full 26.05-to-main merge PR.

Files changed (9, docs only)

  • docs/docs/extraction/audio-video.md
  • docs/docs/extraction/deployment-options.md
  • docs/docs/extraction/multimodal-extraction.md
  • docs/docs/extraction/prerequisites-support-matrix.md
  • docs/docs/extraction/troubleshoot.md
  • nemo_retriever/README.md
  • nemo_retriever/docs/cli/README.md
  • nemo_retriever/docs/cli/benchmarking.md
  • nemo_retriever/helm/README.md

Adaptations for main

  • GitHub/doc links use blob/main (not 26.05).
  • Image captioning anchor is #image-captioning (not #image-captioning-2605).
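
The two adaptations above amount to a mechanical rewrite of release-line references. A minimal Python sketch of that rewrite follows; it is an illustration only, not the script used for this PR. The function name `adapt_for_main` and the sample URL are hypothetical, and the sketch assumes the release ref appears in links as `/blob/26.05/`.

```python
import re

# Values taken from the "Adaptations for main" notes above.
RELEASE_REF = "26.05"
RELEASE_ANCHOR = "#image-captioning-2605"
MAIN_ANCHOR = "#image-captioning"


def adapt_for_main(markdown: str) -> str:
    """Rewrite release-line links and anchors so they point at main (hypothetical helper)."""
    # /blob/26.05/... becomes /blob/main/...
    text = re.sub(rf"/blob/{re.escape(RELEASE_REF)}/", "/blob/main/", markdown)
    # Replace the release-specific anchor only when it is the whole anchor,
    # so a longer anchor that merely starts with it is left untouched.
    text = re.sub(rf"{re.escape(RELEASE_ANCHOR)}(?![\w-])", MAIN_ANCHOR, text)
    return text
```

Running this over each cherry-picked file before committing would leave everything else unchanged, which keeps the backport diff easy to compare against the 26.05 originals.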

Test plan

  • Review MkDocs build / link check on main after merge
  • Confirm air-gap section in deployment-options.md and Helm README cross-links resolve
  • Spot-check optional NIM table and Omni caption footnote in prerequisites support matrix
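
The cross-link checks in the test plan can be approximated offline before the MkDocs build runs. The sketch below is a rough Python approximation under stated assumptions: heading slugs are computed with a simplified rule (lowercase, punctuation dropped, whitespace turned into hyphens) that may differ from the slugifier the docs site actually uses, fenced code blocks are not skipped, and the helper names and sample paths are hypothetical.

```python
import posixpath
import re
from typing import Dict, List, Set, Tuple

HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)\s*$", re.MULTILINE)
ATTR_RE = re.compile(r"\{#([\w-]+)\}")                 # explicit {#anchor} attribute
HTML_ID_RE = re.compile(r'<a\s+(?:id|name)="([^"]+)"')  # explicit <a id="anchor">
LINK_RE = re.compile(r"\]\(([^)#\s]*)#([^)\s]+)\)")     # [text](target.md#anchor)


def slugify(heading: str) -> str:
    """Simplified slug rule; an assumption, not the site's exact slugifier."""
    slug = re.sub(r"[^\w\s-]", "", heading.strip().lower())
    return re.sub(r"\s+", "-", slug)


def anchors_in(markdown: str) -> Set[str]:
    """Collect explicit anchors plus slugs derived from headings."""
    found = set(HTML_ID_RE.findall(markdown)) | set(ATTR_RE.findall(markdown))
    for heading in HEADING_RE.findall(markdown):
        found.add(slugify(ATTR_RE.sub("", heading)))
    return found


def broken_anchor_links(pages: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (source page, link) pairs whose anchor is missing in a known target page."""
    anchors = {path: anchors_in(text) for path, text in pages.items()}
    broken = []
    for path, text in pages.items():
        for target, anchor in LINK_RE.findall(text):
            dest = posixpath.normpath(posixpath.join(posixpath.dirname(path), target)) if target else path
            if dest in anchors and anchor not in anchors[dest]:
                broken.append((path, f"{target}#{anchor}"))
    return broken
```

Links to pages outside the supplied `pages` mapping are ignored rather than reported, so this complements the full link check on main instead of replacing it.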

@kheiss-uwzoo kheiss-uwzoo requested review from a team as code owners May 21, 2026 22:06
@kheiss-uwzoo kheiss-uwzoo requested a review from jdye64 May 21, 2026 22:06
@greptile-apps
Contributor

greptile-apps Bot commented May 21, 2026

Greptile Summary

This is a documentation-only backport of three 26.05 release fixes to main, covering air-gapped deployment guidance, corrections to the charts/infographics pipeline description, and experimental-subcommand disclaimers for the retriever CLI.

  • Air-gapped deployment (deployment-options.md, audio-video.md, troubleshoot.md, helm/README.md): New dedicated section with a container-image mirror table, Helm registry-override YAML, and a Dockerfile snippet for embedding ffmpeg; existing FFmpeg troubleshooting steps are reorganised to split connected vs. disconnected paths and cross-link to the new section.
  • Charts/captioning corrections (multimodal-extraction.md, prerequisites-support-matrix.md, helm/README.md): States that charts use layout detection + OCR rather than a graphic_elements NIM; converts the optional-NIM bullet list to a keyed table with the correct Helm flag names; adds an explicit #image-captioning anchor; and updates the Omni caption NIM image tag from latest to a pinned version string.
  • CLI experimental-subcommand disclaimers (nemo_retriever/docs/cli/README.md, benchmarking.md): Adds a "Supported vs development / experimental subcommands" section and updates quick-start examples to remove deprecated --graphic-elements-invoke-url and --ocr-version v1 flags.
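
A reviewer who wants to confirm that no example still carries the deprecated flags named above could scan the changed files with something like the following Python sketch. The flag strings come from the summary; the helper name, the acceptance of an `--ocr-version=v1` spelling, and the placeholder command in the usage note are assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Flags the summary above describes as removed from the quick-start examples.
DEPRECATED_PATTERNS = {
    "--graphic-elements-invoke-url": re.compile(r"--graphic-elements-invoke-url\b"),
    # Matching both "--ocr-version v1" and "--ocr-version=v1" is an assumption.
    "--ocr-version v1": re.compile(r"--ocr-version[ =]v1\b"),
}


def find_deprecated_flags(text: str) -> List[Tuple[int, str]]:
    """Return (line number, flag label) for every deprecated flag found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in DEPRECATED_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```

For example, passing the text `some-command --ocr-version v1` (a placeholder, not a real retriever invocation) reports line 1 with the label `--ocr-version v1`, while text without either flag yields an empty list.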

Confidence Score: 4/5

Safe to merge with one item to confirm: the 1.7.0-variant image tag introduced in the air-gapped mirror table should be validated against the live NGC catalog before the docs publish.

All nine changed files are documentation only. The cross-links between the new air-gapped section and the files that reference it are internally consistent, section anchors were added where needed, and the CLI example commands have the deprecated flags removed. The one item that warrants a double-check is the -variant suffix on the pinned NIM image tags in the Helm air-gapped mirror table and in prerequisites footnote ³ — if those tags don't exist on NGC, anyone following the mirroring guide will hit an opaque pull failure.

nemo_retriever/helm/README.md — the new air-gapped image mirror table and the Dockerfile USER instruction; docs/docs/extraction/prerequisites-support-matrix.md footnote ³ for the same tag string.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/docs/extraction/audio-video.md | Adds air-gapped cross-reference at page top and in the Helm ffmpeg section; refines wording around FFmpeg runtime install and Parakeet Helm setup. |
| docs/docs/extraction/deployment-options.md | Adds new "Air-gapped and disconnected deployment" section with audio/video warning block and offline captioning guidance; updates related-links list. |
| docs/docs/extraction/multimodal-extraction.md | Corrects charts/infographics description (layout detection + OCR, not graphic-elements NIM); simplifies captioning section and adds link to prerequisites matrix. |
| docs/docs/extraction/prerequisites-support-matrix.md | Adds anchor to optional NIMs section, converts bullet list to table, adds Image captioning subsection; footnote ³ changes the Omni NIM tag from latest to 1.7.0-variant, which is an unusual tag suffix worth verifying. |
| docs/docs/extraction/troubleshoot.md | Adds explicit anchor to FFmpeg section; restructures troubleshooting steps to split connected vs. air-gapped paths; some prescriptive custom-image steps now delegated to deployment-options. |
| nemo_retriever/README.md | Removes --graphic-elements-invoke-url and --ocr-version v1 from CLI examples; updates OCR URL from nemoretriever-ocr-v1 to nemotron-ocr-v1. |
| nemo_retriever/docs/cli/README.md | Adds "Supported vs development / experimental subcommands" disclaimer section; updates caption model name and removes deprecated CLI flags from quick-start examples. |
| nemo_retriever/docs/cli/benchmarking.md | Adds experimental disclaimer at page top; relabels harness section heading and updates launcher guidance to reflect experimental status. |
| nemo_retriever/helm/README.md | Adds comprehensive air-gapped deployment section (image mirror table, registry override values, Dockerfile snippet); updates directory listing and NIM count; introduces 1.7.0-variant tag string for Nemotron Parse and Omni caption that may need verification against the actual published image. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Deployment start] --> B{Cluster type?}
    B -->|Connected| C[Standard install\nHelm chart defaults]
    B -->|Air-gapped| D[Mirror images\nto private registry]
    D --> E[Override nimOperator.*\nimage in values.yaml]
    C --> F{Audio/Video\nneeded?}
    E --> F
    F -->|Yes + Connected| G[service.installFfmpeg=true\nRuntime apt install]
    F -->|Yes + Air-gapped| H[Build custom service image\nwith ffmpeg pre-installed]
    F -->|No| I[Deploy core pipeline\npage_elements + ocr + vlm_embed]
    G --> I
    H --> I
    I --> J{Image captioning?}
    J -->|Yes| K[Enable nemotron_3_nano_omni\nin nimOperator]
    J -->|No| L[Pipeline ready]
    K --> L
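
The decision flow in the flowchart can also be read as a small function. The Python sketch below mirrors the chart's branches one for one; the function name and the boolean parameters are hypothetical, and the step strings are copied from the chart's node labels rather than from any real tooling.

```python
from typing import List


def deployment_steps(air_gapped: bool, audio_video: bool, captioning: bool) -> List[str]:
    """Return the ordered steps the flowchart prescribes for one deployment scenario."""
    steps = []
    if air_gapped:
        steps.append("Mirror images to private registry")
        steps.append("Override nimOperator.* image in values.yaml")
    else:
        steps.append("Standard install with Helm chart defaults")
    if audio_video:
        if air_gapped:
            # No runtime package install is possible without network access.
            steps.append("Build custom service image with ffmpeg pre-installed")
        else:
            steps.append("Set service.installFfmpeg=true (runtime apt install)")
    steps.append("Deploy core pipeline: page_elements + ocr + vlm_embed")
    if captioning:
        steps.append("Enable nemotron_3_nano_omni in nimOperator")
    steps.append("Pipeline ready")
    return steps
```

Calling `deployment_steps(True, True, False)` walks the air-gapped branch with audio/video enabled: mirror, override, custom ffmpeg image, core pipeline, ready.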

Comments Outside Diff (2)

  1. nemo_retriever/helm/README.md, line 538-539 (link)

    P1 Verify 1.7.0-variant tag before publishing

    The tags nvcr.io/nim/nvidia/nemotron-parse-v1.2:1.7.0-variant and nvcr.io/nim/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:1.7.0-variant use a -variant suffix that is uncommon on NGC container tags. If this string was copied from a staging registry or a release-candidate naming scheme rather than the final published tag, users following the air-gapped mirroring steps will get a docker pull failure with an opaque "manifest unknown" error. The same string appears in the footnote ³ of prerequisites-support-matrix.md. Please confirm against the live NGC catalog before the docs go live.
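
    One way to shortlist tags for that manual check is to flag any image reference whose tag is not a plain `MAJOR.MINOR.PATCH` version. The Python sketch below does only that; treating a plain three-part version as the expected shape is an assumption, the helper names are hypothetical, digest references (`@sha256:...`) are not handled, and the result says nothing about whether a tag actually exists on NGC.

```python
import re
from typing import Iterable, List, Tuple

# Assumed "expected" tag shape: exactly three dot-separated numbers.
PLAIN_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def split_image_ref(ref: str) -> Tuple[str, str]:
    """Split 'registry/path/name:tag' into (repository, tag)."""
    name, sep, tag = ref.rpartition(":")
    if not sep or "/" in tag:
        # No tag present; a colon, if any, belonged to a registry port.
        return ref, "latest"
    return name, tag


def suspicious_tags(refs: Iterable[str]) -> List[str]:
    """Return the references whose tag is not a plain three-part version."""
    flagged = []
    for ref in refs:
        _, tag = split_image_ref(ref)
        if not PLAIN_VERSION.match(tag):
            flagged.append(ref)
    return flagged
```

    Anything this returns still needs confirmation against the registry, for example with `docker manifest inspect <image>:<tag>` from a machine that has NGC access.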

  2. nemo_retriever/helm/README.md, line 458-463 (link)

    P2 USER nemo may not exist in the base image

    The Dockerfile snippet ends with USER nemo, but the base image nemo-retriever-service may not define a nemo user — if the service runs as a numeric UID or a differently-named user, this instruction will fail with unable to find user nemo. Consider using the exact user name/UID from the base image, or showing a USER 1000 fallback. Since this is a copy-paste template users will run verbatim, a silent build-time failure here would be frustrating to debug.
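
    To pick the right value for the `USER` line, the base image's `/etc/passwd` can be inspected and the instruction chosen from what is actually defined there. The Python sketch below assumes the passwd text has already been extracted from the image by some other means; the helper names, the sample account, and the choice of 1000 as the fallback UID (taken from the suggestion above) are illustrative, not facts about the `nemo-retriever-service` image.

```python
from typing import Optional


def find_uid(passwd_text: str, name: str) -> Optional[int]:
    """Return the numeric UID for name in passwd-formatted text, or None if absent."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # passwd format: name:password:UID:GID:comment:home:shell
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2])
    return None


def user_instruction(passwd_text: str, preferred: str, fallback_uid: int = 1000) -> str:
    """Build a Dockerfile USER line, using the named user only if the image defines it."""
    if find_uid(passwd_text, preferred) is not None:
        return f"USER {preferred}"
    # A numeric UID does not require a matching passwd entry.
    return f"USER {fallback_uid}"
```

    With a passwd file that has no `nemo` entry, `user_instruction(passwd_text, "nemo")` returns `USER 1000` instead of a name the image cannot resolve.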


Reviews (1): Last reviewed commit: "docs: fix image captioning cross-link on..."

@sosahi sosahi merged commit 56db1ec into NVIDIA:main May 21, 2026
7 of 10 checks passed