fix(remote): unblock audio-separator-remote for typical files #288
Merged
Two independent issues prevented `audio-separator-remote` from working end-to-end against the GCP Cloud Run deployment:

1. **CLI hits Cloud Run's 32 MiB request body limit on real audio files.** The underlying `AudioSeparatorAPIClient` already supports a `gcs_uri` mode where the server fetches from GCS (used by karaoke-gen), but the CLI only exposed the multipart upload path. Now the CLI detects files >30 MiB, auto-uploads them to GCS, passes `gcs_uri` to the API, and cleans up the GCS object in a `finally` block (the bucket's 1-day lifecycle is the safety net). The bucket is configurable via `--gcs-bucket` or `AUDIO_SEPARATOR_GCS_INPUT_BUCKET` and defaults to the existing `nomadkaraoke-audio-separator-outputs` (the separator SA already has objectAdmin, so no infra change is needed). `google-cloud-storage` is lazy-imported with a clear install hint if missing.

2. **Cloud Run server silently runs in CPU mode, not GPU.** The image relied on `pip install ".[gpu]"` for GPU support, which only swaps in `onnxruntime-gpu`; the `torch>=2.3` constraint pulls PyPI's default CPU-only PyTorch wheel. Result: `torch.cuda.is_available()` returns False, Separator falls back to CPU, and jobs run ~10x slower (50 min instead of 5 min for the vocal_balanced preset). karaoke-gen's audio-separation-job image already documents this gotcha in `Dockerfile.gpu-base:100-106`; this change mirrors that pattern: install `torch==2.6.0+cu126` from the cu126 index first so audio-separator[gpu] sees torch as already satisfied.

Tests: 7 new unit tests covering GCS upload helpers (blob path format, URI parsing, error handling), bucket resolution priority (`--flag` > env > default), and the integration into `handle_separate_command` (large/small file, cleanup on failure, upload failure).

Bumps version to 0.44.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
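The large-file flow described in this commit can be sketched roughly as follows. This is an illustration only, not the code in `audio_separator/remote/cli.py`: the function names (`resolve_bucket`, `needs_gcs_upload`, `separate_file`, `load_storage_module`) and the injected `upload`/`submit`/`delete` callables are hypothetical. The threshold, env var name, and default bucket are the values quoted above.

```python
import os

# Values quoted in the commit message; every function name below is hypothetical.
GCS_UPLOAD_THRESHOLD_BYTES = 30 * 1024 * 1024  # 30 MiB, below Cloud Run's 32 MiB body limit
BUCKET_ENV_VAR = "AUDIO_SEPARATOR_GCS_INPUT_BUCKET"
DEFAULT_BUCKET = "nomadkaraoke-audio-separator-outputs"


def load_storage_module():
    """Import google-cloud-storage lazily so small-file users do not need it."""
    try:
        from google.cloud import storage
    except ImportError as exc:
        raise RuntimeError(
            "Large files require google-cloud-storage: pip install google-cloud-storage"
        ) from exc
    return storage


def resolve_bucket(flag_value=None, env=None):
    """Bucket priority: --gcs-bucket flag, then the env var, then the default."""
    env = os.environ if env is None else env
    return flag_value or env.get(BUCKET_ENV_VAR) or DEFAULT_BUCKET


def needs_gcs_upload(size_bytes):
    """True when the file is too large for the multipart upload path."""
    return size_bytes > GCS_UPLOAD_THRESHOLD_BYTES


def separate_file(path, size_bytes, bucket, upload, submit, delete):
    """Route a file through multipart upload or the gcs_uri path.

    upload(path, bucket) -> gcs_uri, submit(**kwargs) -> result, and
    delete(gcs_uri) are injected so the control flow can run without GCS.
    """
    if not needs_gcs_upload(size_bytes):
        return submit(file_path=path)
    gcs_uri = upload(path, bucket)
    try:
        return submit(gcs_uri=gcs_uri)
    finally:
        # Runs on success and on failure; the bucket lifecycle rule is the backstop.
        delete(gcs_uri)
```

Injecting the three callables keeps the sketch free of network calls, which is also one plausible way to unit-test the cleanup-on-failure path.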
Missed this call site when updating the function signature. Unit tests in `tests/unit/test_remote_cli.py` were updated, but the integration test `test_cli_separate_command_integration` still passed only 3 args.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
`audio-separator-remote` was unusable end-to-end against the GCP Cloud Run deployment for two independent reasons. This PR fixes both:

1. The CLI hit Cloud Run's 32 MiB request body limit on real audio files. Files over 30 MiB are now uploaded to GCS and sent through the existing `gcs_uri` server path (same one karaoke-gen has been using all along).
2. The Cloud Run server was silently running on CPU instead of GPU. `pip install ".[gpu]"` only swaps in `onnxruntime-gpu`; the default `torch>=2.3` constraint pulls PyPI's CPU-only wheel. Result: ~10× slowdown (vocal_balanced preset took ~50 min instead of ~5 min). karaoke-gen's `Dockerfile.gpu-base` already documents this gotcha; mirroring that fix here.

Changes
- `audio_separator/remote/cli.py`: auto-upload to GCS for files >30 MiB, pass `gcs_uri` instead of `file_path`. New `--gcs-bucket` flag + `AUDIO_SEPARATOR_GCS_INPUT_BUCKET` env var (default: `nomadkaraoke-audio-separator-outputs`, separator SA already has objectAdmin). `try`/`finally` cleanup after each job; bucket's 1-day lifecycle is the safety net. `google-cloud-storage` lazy-imported with a clear install hint.
- `Dockerfile.cloudrun`: install `torch==2.6.0+cu126` + `torchvision==0.21.0+cu126` from the cu126 index before `pip install ".[gpu]"`, so audio-separator[gpu] sees torch as already satisfied. Includes inline comment explaining why.
- `audio_separator/remote/README.md`: documents the >30 MiB upload behavior and `--gcs-bucket` option.
- `tests/unit/test_remote_cli.py`: 7 new tests + 4 updated. Covers GCS helpers directly (blob path format, URI parsing, error handling, missing-lib hint), bucket resolution priority, and `handle_separate_command` paths (large/small file, cleanup on failure, upload failure).
- `pyproject.toml`: 0.44.1 → 0.44.2

Testing
- Confirm the server reports `CUDA is available in Torch` instead of `No hardware acceleration could be configured`, and re-run `audio-separator-remote separate --preset vocal_balanced` to confirm the 50 min → ~5 min speedup

Review
Deploy
This touches `Dockerfile.cloudrun`, which triggers `.github/workflows/deploy-to-cloudrun.yml` on merge to main. CI will build the new image with Cloud Build and update the audio-separator Cloud Run service automatically.

@coderabbitai ignore
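Once the new image is deployed, the GPU fix can be spot-checked from a Python shell inside the container. The sketch below is an assumption about how one might do that and is not part of this PR; `describe_torch_acceleration` and `local_build_tag` are hypothetical helper names. It relies only on `torch.cuda.is_available()`, `torch.__version__`, and `torch.version.cuda`. The local build tag (for example `cu126`) is only a hint about which wheel was installed; `torch.cuda.is_available()` is the check that matters.

```python
import importlib.util


def local_build_tag(version):
    """Return the part after '+' in a version string such as '2.6.0+cu126', or ''."""
    _, _, tag = version.partition("+")
    return tag


def describe_torch_acceleration():
    """Summarize whether the installed torch build can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return f"CUDA available (torch {torch.__version__}, CUDA {torch.version.cuda})"
    return f"CPU only (torch {torch.__version__})"


if __name__ == "__main__":
    print(describe_torch_acceleration())
```

A "CPU only" result on a GPU-backed Cloud Run instance would suggest the CPU wheel was resolved again.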
🤖 Generated with Claude Code