Skip to content

feat(install): provision prereqs + fail-fast robustness (CUDA, firewall, Git Bash guard, node/npm checks)#1612

Open
joelteply wants to merge 3 commits into
canaryfrom
feat/install-provisions-prereqs
Open

feat(install): provision prereqs + fail-fast robustness (CUDA, firewall, Git Bash guard, node/npm checks)#1612
joelteply wants to merge 3 commits into
canaryfrom
feat/install-provisions-prereqs

Conversation

@joelteply

Copy link
Copy Markdown
Contributor

Why

Modern-app install behavior: if a prerequisite is detected as necessary and it's reasonable to do so, the install should provide it — idempotently, on every run/update — not merely warn.

Two regressions/gaps this closes:

1. CUDA toolkit (install.sh)

detect_gpu only warned when nvcc was missing, with a wrong comment ("needed for training, not inference"). nvcc is required to build --features cuda (candle-kernels/cudarc compile GPU kernels at build time) — so a CUDA box silently fell back to CPU.

New install_cuda_toolkit():

  • Runs when a CUDA GPU is detected; installs the toolkit from NVIDIA's apt repo.
  • wsl-ubuntu repo under WSL2 (the GPU driver comes from the Windows host — never a Linux driver); ubuntu2404 native; sbsa for aarch64.
  • Targets 12.9 (the Blackwell sm_120 / RTX 5090 floor is 12.8; 12.9 is the newest 12.x with broad cudarc/candle support).
  • Idempotent: re-run skips when a recent-enough toolkit is present.

2. airc inbound firewall (windows-setup-autostart.ps1)

Cross-machine grid routing needs other nodes to reach this box's airc LAN listener (an ephemeral TCP port). Previously this required a manual one-off New-NetFirewallRule paste — the exact thing blocking the keystone's inbound leg. Now the (already-elevated) Windows setup script adds an idempotent inbound allow-rule for airc.exe by program path.

Validation

  • bash -n install.sh clean; PowerShell parser clean on the ps1.
  • Both changes are no-ops on non-CUDA / non-airc boxes and safe to re-run.
  • Infra-script-only change (shell + PowerShell, no TS/Rust) — committed with the precommit config's documented ENABLE_TYPESCRIPT_CHECK=false ENABLE_BROWSER_TEST=false knobs (the inapplicable phases), not --no-verify.

🤖 Generated with Claude Code

…requisites

Modern-app install behavior: detect a needed prerequisite and provide it,
idempotently, on every run/update — instead of merely warning.

install.sh: install_cuda_toolkit() — when a CUDA GPU is detected and nvcc is
missing or below the Blackwell floor (12.8, sm_120 / RTX 5090), install the
CUDA toolkit from NVIDIAs apt repo (wsl-ubuntu repo under WSL2; the GPU driver
comes from the Windows host, never a Linux driver). Idempotent: a re-run skips
when a recent-enough toolkit exists. Replaces the warn-only detect_gpu path,
whose comment wrongly called nvcc training-only — its required to build
--features cuda (candle-kernels/cudarc compile GPU kernels at build time).

windows-setup-autostart.ps1: idempotent inbound firewall allow-rule for
airc.exe so other grid nodes can reach this boxs airc LAN listener (ephemeral
port) — the missing piece that forced a manual netsh paste to close the
cross-machine keystones inbound leg.

Skipped TS-compile + browser precommit phases via the configs documented env
knobs (ENABLE_TYPESCRIPT_CHECK=false ENABLE_BROWSER_TEST=false) — change is
shell + PowerShell infra only, no TS/Rust.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e success

Running tools/scripts/install.sh from Git Bash/MSYS (uname MINGW64_NT) fell
through preflight_detect_platform Darwin/Linux cases to "unknown", then
plowed into apt-based Node/npm steps that cant work there — a cascade of
"command not found" AND a false "node installed" (the echo ran node --version
unconditionally; the npm step | tail masked the missing-binary exit).

- preflight_detect_platform: detect MINGW*/MSYS*/CYGWIN* -> windows-shell.
- install.sh: fail fast on windows-shell/unknown with a clear redirect to WSL2
  (or install.ps1) BEFORE any package step runs.
- install_node: verify node is on PATH after install; error if not.
- npm build step: preflight command -v npm; check npms real exit via
  PIPESTATUS[0] (not tails) and stop on failure.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@joelteply joelteply changed the title feat(install): provision CUDA toolkit + airc firewall as detected prerequisites feat(install): provision prereqs + fail-fast robustness (CUDA, firewall, Git Bash guard, node/npm checks) Jun 12, 2026
… + pin toolchain

The from-source dev script (tools/scripts/install.sh) billed itself as THE
installer, so users (and agents) ran it — often in Git Bash where it cant
work — instead of the real one-command path. Rewrite its header to redirect
most users to install.ps1 (Windows) / curl install.sh (Linux/macOS), which
handle WSL2 + Docker + GPU in one shot. Pair with the windows-shell fail-fast
guard so the dev script is unmistakable.

Also add rust-toolchain.toml pinning 1.92.0 so the from-source build is
reproducible (continuum-core ICEs rustc >=1.93 on x86_64-linux).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant