fix(remote): unblock audio-separator-remote for typical files #288

Merged. beveradb merged 2 commits into main from feat/sess-20260518-1532-cli-gcs-upload on May 18, 2026.
@beveradb (Collaborator)

Summary

audio-separator-remote was unusable end-to-end against the GCP Cloud Run deployment for two independent reasons. This PR fixes both:

  1. 413 on real audio files — Cloud Run caps request bodies at 32 MiB; the CLI only did multipart upload, so anything >32 MiB failed. Now it auto-uploads to GCS and uses the existing gcs_uri server path (same one karaoke-gen has been using all along).
  2. Server silently ran on CPU instead of GPU: pip install ".[gpu]" only swaps in onnxruntime-gpu, while the default torch>=2.3 constraint pulls PyPI's CPU-only wheel. Result: ~10× slowdown (the vocal_balanced preset took ~50 min instead of ~5 min). karaoke-gen's Dockerfile.gpu-base already documents this gotcha; this PR mirrors that fix.
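
The size-based routing described in item 1 can be sketched roughly as follows. This is an illustration only, not the CLI's actual code: `needs_gcs_upload`, `choose_mode`, and the constant names are hypothetical, and the 30 MiB threshold is the figure given in the Changes section.

```python
import os

MIB = 1024 * 1024
CLOUD_RUN_BODY_LIMIT = 32 * MIB  # request body cap cited in this PR
GCS_UPLOAD_THRESHOLD = 30 * MIB  # hypothetical name; assumed to sit below the cap on purpose


def needs_gcs_upload(size_bytes: int, threshold: int = GCS_UPLOAD_THRESHOLD) -> bool:
    """Return True when a file is too large to send as a multipart request body."""
    return size_bytes > threshold


def choose_mode(path: str) -> str:
    """Pick the request mode for a local file: 'gcs_uri' or 'multipart'."""
    return "gcs_uri" if needs_gcs_upload(os.path.getsize(path)) else "multipart"
```

Keeping the threshold slightly below the hard cap means a file near the limit goes through GCS rather than risking a 413 once request framing is added to the body.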

Changes

  • audio_separator/remote/cli.py: auto-upload to GCS for files >30 MiB, pass gcs_uri instead of file_path. New --gcs-bucket flag + AUDIO_SEPARATOR_GCS_INPUT_BUCKET env var (default: nomadkaraoke-audio-separator-outputs, separator SA already has objectAdmin). try/finally cleanup after each job; bucket's 1-day lifecycle is the safety net. google-cloud-storage lazy-imported with a clear install hint.
  • Dockerfile.cloudrun: install torch==2.6.0+cu126 + torchvision==0.21.0+cu126 from the cu126 index before pip install ".[gpu]", so audio-separator[gpu] sees torch as already satisfied. Includes inline comment explaining why.
  • audio_separator/remote/README.md: documents the >30 MiB upload behavior and --gcs-bucket option.
  • tests/unit/test_remote_cli.py: 7 new tests + 4 updated. Covers GCS helpers directly (blob path format, URI parsing, error handling, missing-lib hint), bucket resolution priority, and handle_separate_command paths (large/small file, cleanup on failure, upload failure).
  • pyproject.toml: 0.44.1 → 0.44.2
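
The try/finally cleanup mentioned for cli.py follows a common pattern, sketched below under stated assumptions. The function `separate_via_gcs` and its three callables are hypothetical stand-ins for the real upload, API-submit, and delete steps; whether the actual CLI swallows delete errors the way this sketch does is not confirmed by the PR.

```python
from typing import Callable


def separate_via_gcs(
    local_path: str,
    upload: Callable[[str], str],
    submit: Callable[[str], dict],
    delete: Callable[[str], None],
) -> dict:
    """Upload a file, submit the job by URI, and always attempt cleanup."""
    # If the upload itself raises, there is no URI to clean up yet.
    gcs_uri = upload(local_path)
    try:
        return submit(gcs_uri)
    finally:
        try:
            delete(gcs_uri)
        except Exception as exc:
            # Best effort only: the bucket's lifecycle rule is the fallback.
            print(f"warning: could not delete {gcs_uri}: {exc}")
```

Because the delete runs in `finally`, the uploaded object is removed whether the job succeeds or fails, and a failed delete does not mask the job's own result or exception.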

Testing

  • Unit tests pass (267 existing + 7 new, all green)
  • CodeRabbit CLI review: no findings
  • Verified the CLI's GCS upload path end-to-end against the current Cloud Run service (server received from GCS, started separation, cleanup ran on failure)
  • Post-deploy: verify Cloud Run logs show "CUDA is available in Torch" instead of "No hardware acceleration could be configured", and re-run audio-separator-remote separate --preset vocal_balanced to confirm the 50 min → ~5 min speedup
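
The post-deploy log check could be automated along these lines. The two marker strings are the ones quoted in the bullet above; the function name `detect_acceleration` is hypothetical, and how the log lines are obtained (for example, an export from Cloud Logging) is left out as an assumption.

```python
from typing import Iterable, Optional

GPU_MARKER = "CUDA is available in Torch"
CPU_MARKER = "No hardware acceleration could be configured"


def detect_acceleration(log_lines: Iterable[str]) -> Optional[str]:
    """Return 'gpu' or 'cpu' for the first marker found, or None if neither appears."""
    for line in log_lines:
        if GPU_MARKER in line:
            return "gpu"
        if CPU_MARKER in line:
            return "cpu"
    return None
```

A result of None means the startup lines were not in the sample, which should be treated as inconclusive rather than as a pass.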

Review

  • Local CodeRabbit CLI review completed (no findings)
  • Tests written alongside code

Deploy

This touches Dockerfile.cloudrun, which triggers .github/workflows/deploy-to-cloudrun.yml on merge to main. CI will Cloud Build the new image and update the audio-separator Cloud Run service automatically.

@coderabbitai ignore

🤖 Generated with Claude Code

beveradb and others added 2 commits May 18, 2026 17:49
Two independent issues prevented `audio-separator-remote` from working
end-to-end against the GCP Cloud Run deployment:

1. **CLI hits Cloud Run's 32 MiB request body limit on real audio files.**
   The underlying AudioSeparatorAPIClient already supports a `gcs_uri`
   mode where the server fetches from GCS (used by karaoke-gen), but the
   CLI only exposed the multipart upload path. Now the CLI detects files
   >30 MiB and auto-uploads to GCS, passes `gcs_uri` to the API, and
   cleans up the GCS object in a `finally` block (the bucket's 1-day
   lifecycle is the safety net). Bucket configurable via `--gcs-bucket`
   or `AUDIO_SEPARATOR_GCS_INPUT_BUCKET`; defaults to the existing
   `nomadkaraoke-audio-separator-outputs` (separator SA already has
   objectAdmin, no infra change needed). `google-cloud-storage` is
   lazy-imported with a clear install hint if missing.

2. **Cloud Run server silently runs in CPU mode, not GPU.** The image
   relied on `pip install ".[gpu]"` for GPU support, which only swaps in
   `onnxruntime-gpu` — the `torch>=2.3` constraint pulls PyPI's default
   CPU-only PyTorch wheel. Result: `torch.cuda.is_available()` returns
   False, Separator falls back to CPU, jobs run ~10x slower (50 min
   instead of 5 min for the vocal_balanced preset). karaoke-gen's
   audio-separation-job image already documents this gotcha in
   `Dockerfile.gpu-base:100-106`; mirroring that pattern here:
   install `torch==2.6.0+cu126` from the cu126 index first so
   audio-separator[gpu] sees torch as already satisfied.
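
A quick way to tell which wheel ended up in an image is to look at the local version tag and at `torch.cuda.is_available()`. The sketch below is illustrative: the helper names are hypothetical, and it relies on the convention that wheels from the PyTorch download indexes carry tags such as `+cu126` or `+cpu`. A version string with no tag is inconclusive on its own.

```python
from typing import Optional


def torch_build_tag(version: str) -> Optional[str]:
    """Return the local version tag of a torch version string, e.g. 'cu126' or 'cpu'."""
    _, sep, local = version.partition("+")
    return local if sep else None


def looks_like_cuda_build(version: str) -> bool:
    """True only when the version string carries an explicit CUDA tag."""
    tag = torch_build_tag(version)
    return tag is not None and tag.startswith("cu")


def report() -> str:
    """Summarize the installed torch build; torch is imported lazily."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    return f"torch {torch.__version__}, cuda available: {torch.cuda.is_available()}"
```

Running `report()` inside the built container gives a direct answer; the string helpers are useful where torch cannot be imported, such as when inspecting `pip freeze` output.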

Tests: 7 new unit tests covering GCS upload helpers (blob path format,
URI parsing, error handling), bucket resolution priority (--flag > env >
default), and the integration into handle_separate_command (large/small
file, cleanup on failure, upload failure).
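
The bucket resolution priority and URI parsing that these tests cover can be sketched as below. The function names `resolve_bucket` and `parse_gcs_uri` are hypothetical and may not match the helpers in cli.py; the environment variable and default bucket are the ones named in this commit message.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_BUCKET = "nomadkaraoke-audio-separator-outputs"
ENV_VAR = "AUDIO_SEPARATOR_GCS_INPUT_BUCKET"


def resolve_bucket(flag_value: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """Resolve the bucket with priority: --gcs-bucket flag > env var > default."""
    if flag_value:
        return flag_value
    return env.get(ENV_VAR) or DEFAULT_BUCKET


def parse_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split 'gs://bucket/path/to/blob' into (bucket, blob path)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a GCS URI: {uri!r}")
    bucket, _, blob = uri[len(prefix):].partition("/")
    if not bucket or not blob:
        raise ValueError(f"GCS URI must include bucket and object: {uri!r}")
    return bucket, blob
```

Passing the environment as a mapping keeps the priority logic testable without mutating `os.environ`.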

Bumps version to 0.44.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Missed this call site when updating the function signature. Unit tests
in tests/unit/test_remote_cli.py were updated, but integration test
test_cli_separate_command_integration still passed only 3 args.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@beveradb beveradb merged commit fca0cf7 into main May 18, 2026
24 of 26 checks passed
@beveradb beveradb deleted the feat/sess-20260518-1532-cli-gcs-upload branch May 18, 2026 23:23