docs: backport 26.05 doc fixes (#2074, #2082, #2088) to main #2091
…195023, 6195296 (NVIDIA#2074) Co-authored-by: Randy Gelhausen <rgelhau@gmail.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…(NVBugs 6199005, 6198526) (NVIDIA#2088)
Greptile Summary
This is a documentation-only backport of three 26.05 release fixes to
| Filename | Overview |
|---|---|
| docs/docs/extraction/audio-video.md | Adds air-gapped cross-reference at page top and in the Helm ffmpeg section; refines wording around FFmpeg runtime install and Parakeet Helm setup. |
| docs/docs/extraction/deployment-options.md | Adds new "Air-gapped and disconnected deployment" section with audio/video warning block and offline captioning guidance; updates related-links list. |
| docs/docs/extraction/multimodal-extraction.md | Corrects charts/infographics description (layout detection + OCR, not graphic-elements NIM); simplifies captioning section and adds link to prerequisites matrix. |
| docs/docs/extraction/prerequisites-support-matrix.md | Adds anchor to optional NIMs section, converts bullet list to table, adds Image captioning subsection; footnote ³ changes the Omni NIM tag from latest to 1.7.0-variant, which is an unusual tag suffix worth verifying. |
| docs/docs/extraction/troubleshoot.md | Adds explicit anchor to FFmpeg section; restructures troubleshooting steps to split connected vs. air-gapped paths; some prescriptive custom-image steps now delegated to deployment-options. |
| nemo_retriever/README.md | Removes --graphic-elements-invoke-url and --ocr-version v1 from CLI examples; updates OCR URL from nemoretriever-ocr-v1 to nemotron-ocr-v1. |
| nemo_retriever/docs/cli/README.md | Adds "Supported vs development / experimental subcommands" disclaimer section; updates caption model name and removes deprecated CLI flags from quick-start examples. |
| nemo_retriever/docs/cli/benchmarking.md | Adds experimental disclaimer at page top; relabels harness section heading and updates launcher guidance to reflect experimental status. |
| nemo_retriever/helm/README.md | Adds comprehensive air-gapped deployment section (image mirror table, registry override values, Dockerfile snippet); updates directory listing and NIM count; introduces 1.7.0-variant tag string for Nemotron Parse and Omni caption that may need verification against the actual published image. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Deployment start] --> B{Cluster type?}
B -->|Connected| C[Standard install\nHelm chart defaults]
B -->|Air-gapped| D[Mirror images\nto private registry]
D --> E[Override nimOperator.*\nimage in values.yaml]
C --> F{Audio/Video\nneeded?}
E --> F
F -->|Yes + Connected| G[service.installFfmpeg=true\nRuntime apt install]
F -->|Yes + Air-gapped| H[Build custom service image\nwith ffmpeg pre-installed]
F -->|No| I[Deploy core pipeline\npage_elements + ocr + vlm_embed]
G --> I
H --> I
I --> J{Image captioning?}
J -->|Yes| K[Enable nemotron_3_nano_omni\nin nimOperator]
J -->|No| L[Pipeline ready]
K --> L
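The decision flow in the flowchart can be restated as a small Python function. This is only an illustrative sketch: the function name `plan_deployment` and its three boolean parameters are hypothetical, and the step strings simply repeat the flowchart node labels. Nothing here is part of the Helm chart or the CLI.

```python
def plan_deployment(air_gapped: bool, audio_video: bool, captioning: bool) -> list[str]:
    """Return the ordered flowchart steps for one deployment scenario."""
    steps = []

    # Cluster type: connected clusters use chart defaults, air-gapped ones mirror first.
    if air_gapped:
        steps.append("Mirror images to private registry")
        steps.append("Override nimOperator.* image in values.yaml")
    else:
        steps.append("Standard install with Helm chart defaults")

    # Audio/video: ffmpeg comes from a runtime install or a pre-built image.
    if audio_video:
        if air_gapped:
            steps.append("Build custom service image with ffmpeg pre-installed")
        else:
            steps.append("Set service.installFfmpeg=true (runtime apt install)")

    steps.append("Deploy core pipeline: page_elements + ocr + vlm_embed")

    # Image captioning is optional and enabled last.
    if captioning:
        steps.append("Enable nemotron_3_nano_omni in nimOperator")

    steps.append("Pipeline ready")
    return steps


if __name__ == "__main__":
    for step in plan_deployment(air_gapped=True, audio_video=True, captioning=False):
        print(step)
```

Keeping the branches in the same order as the flowchart makes it easy to compare the two when the documentation changes.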
Comments Outside Diff (2)
- nemo_retriever/helm/README.md, line 538-539: **Verify `1.7.0-variant` tag before publishing**

  The tags `nvcr.io/nim/nvidia/nemotron-parse-v1.2:1.7.0-variant` and `nvcr.io/nim/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:1.7.0-variant` use a `-variant` suffix that is uncommon on NGC container tags. If this string was copied from a staging registry or a release-candidate naming scheme rather than the final published tag, users following the air-gapped mirroring steps will get a `docker pull` failure with an opaque "manifest unknown" error. The same string appears in the footnote ³ of `prerequisites-support-matrix.md`. Please confirm against the live NGC catalog before the docs go live.
- nemo_retriever/helm/README.md, line 458-463: **`USER nemo` may not exist in the base image**

  The Dockerfile snippet ends with `USER nemo`, but the base image `nemo-retriever-service` may not define a `nemo` user. If the service runs as a numeric UID or a differently-named user, this instruction will fail with `unable to find user nemo`. Consider using the exact user name/UID from the base image, or showing a `USER 1000` fallback. Since this is a copy-paste template users will run verbatim, a silent build-time failure here would be frustrating to debug.
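One way to act on the first comment before mirroring is to check each image reference up front. The sketch below is hypothetical and not part of the repository: `split_image_ref`, `is_plain_version`, and `manifest_exists` are illustrative names, and the "plain dotted version" rule is an assumption about what a typical tag looks like, not an NGC policy. `manifest_exists` shells out to `docker manifest inspect`, so it needs the Docker CLI, network access, and registry credentials.

```python
import re
import subprocess


def split_image_ref(ref: str) -> tuple[str, str]:
    """Split 'registry/path/name:tag' into (repository, tag).

    Assumes a tag-style reference; digest references ('name@sha256:...')
    are not handled by this sketch.
    """
    # The tag separator is the last ':' that appears after the last '/',
    # so a registry port such as 'host:5000/name' is not mistaken for a tag.
    slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > slash:
        return ref[:colon], ref[colon + 1:]
    return ref, "latest"  # Docker's default when no tag is given


def is_plain_version(tag: str) -> bool:
    """True for dotted numeric tags such as '1.7.0'; False for anything else."""
    return re.fullmatch(r"\d+(\.\d+)*", tag) is not None


def manifest_exists(ref: str) -> bool:
    """Ask the registry whether the reference resolves (requires docker and login)."""
    result = subprocess.run(
        ["docker", "manifest", "inspect", ref],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    ref = "nvcr.io/nim/nvidia/nemotron-parse-v1.2:1.7.0-variant"
    repository, tag = split_image_ref(ref)
    if not is_plain_version(tag):
        print(f"{repository}: tag '{tag}' is not a plain version; verify it in the catalog")
```

The string check only flags tags worth a second look. Whether a tag is actually published can only be confirmed against the registry itself, which is what `manifest_exists` is for.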
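For the `USER nemo` comment, the user configured in the base image can be read from the image metadata instead of guessed. The following is a minimal sketch under stated assumptions: the helper names `base_image_user` and `restore_user_line` are hypothetical, the base image is assumed to be available locally, and the Dockerfile template is assumed to switch users for the package install and then switch back. It relies on `docker image inspect --format '{{.Config.User}}'`, which prints an empty string when the image sets no user.

```python
import subprocess


def base_image_user(image: str) -> str:
    """Return the USER configured in a locally available image ('' if unset)."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Config.User}}", image],
        capture_output=True,
        text=True,
        check=True,  # raise if the image is missing or docker is unavailable
    )
    return result.stdout.strip()


def restore_user_line(config_user: str) -> str:
    """Build the final Dockerfile line that restores the base image's user."""
    config_user = config_user.strip()
    if not config_user:
        # No USER in the base image means it runs as root, so there is nothing to restore.
        return "# base image sets no USER (runs as root); nothing to restore"
    # Works for names ('nemo'), numeric UIDs ('1000'), and 'uid:gid' pairs alike.
    return f"USER {config_user}"


if __name__ == "__main__":
    print(restore_user_line("1000"))
```

Generating the line from the inspected value avoids hard-coding a user name that the base image may not define.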
Reviews (1): Last reviewed commit: "docs: fix image captioning cross-link on..."
Summary
Cherry-picks documentation merged on `26.05` onto `main` so the live docs site (published from `main`) matches the release-line fixes.

Files changed (9, docs only)
- `docs/docs/extraction/audio-video.md`
- `docs/docs/extraction/deployment-options.md`
- `docs/docs/extraction/multimodal-extraction.md`
- `docs/docs/extraction/prerequisites-support-matrix.md`
- `docs/docs/extraction/troubleshoot.md`
- `nemo_retriever/README.md`
- `nemo_retriever/docs/cli/README.md`
- `nemo_retriever/docs/cli/benchmarking.md`
- `nemo_retriever/helm/README.md`

Adaptations for `main`
- `blob/main` (not `26.05`).
- `#image-captioning` (not `#image-captioning-2605`).

Test plan
- `main` after merge
- `deployment-options.md` and Helm README cross-links resolve