feat(operator): add OpenSandboxOperator lifecycle backend (Phase 1) #1203

Open
zpzjzj wants to merge 6 commits into alibaba:master from zpzjzj:feat/opensandbox-operator

zpzjzj commented Jul 2, 2026

What & why

Phase 1 of adding OpenSandbox as a Rock backend (Option B: delegate both sandbox lifecycle and command/file execution to OpenSandbox via its official Python SDK). This PR delivers the lifecycle seam only; the proxy-layer exec/file seam is a follow-up PR.

refs #1202

Changes

  • OpenSandboxConfig (rock/config.py) — endpoint / api_key / protocol / runtime / namespace / use_server_proxy / default_timeout; wired into RockConfig and from_env yaml parsing. Enables runtime.operator_type=opensandbox.
  • OpenSandboxClient (rock/sandbox/operator/opensandbox/client.py) — async facade over opensandbox.Sandbox with lazy SDK import (optional opensandbox extra) and exception translation to Rock errors.
  • OpenSandboxOperator (rock/sandbox/operator/opensandbox/operator.py) — implements AbstractOperator:
    • submit → Sandbox.create, stop → pause, restart → resume, delete → kill (semantics locked in Phase 0).
    • OpenSandbox↔Rock state mapping; docker→k8s memory (8g→8Gi) and cpu normalization.
    • stores backend + opensandbox_id in SandboxInfo.extended_params.
  • OperatorFactory — dispatch operator_type=opensandbox; OperatorContext gains opensandbox_config; admin/main.py wires rock_config.opensandbox.
  • pyproject.toml: opensandbox optional extra (opensandbox>=0.1.13).
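
For readers unfamiliar with the docker→k8s memory normalization listed above, here is a minimal sketch of the idea. The function name `normalize_memory` and the exact set of accepted suffixes are assumptions for illustration; the actual helper in this PR may differ. The conversion matters because a bare `m` suffix means "milli" in Kubernetes quantities, so a Docker-style `512m` cannot be passed through unchanged.

```python
import re

# Docker-style unit suffix -> Kubernetes binary suffix (plain bytes need no suffix).
# Hypothetical table for illustration only.
_DOCKER_TO_K8S = {"": "", "b": "", "k": "Ki", "m": "Mi", "g": "Gi"}
_DOCKER_MEMORY = re.compile(r"(\d+)([bkmg]?)")


def normalize_memory(value: str) -> str:
    """Convert a Docker-style memory string such as '8g' to a k8s quantity such as '8Gi'.

    Values that do not look like Docker-style memory (for example '8Gi')
    are returned unchanged, on the assumption that they are already
    Kubernetes quantities.
    """
    match = _DOCKER_MEMORY.fullmatch(value.strip().lower())
    if match is None:
        return value.strip()
    amount, unit = match.groups()
    return f"{amount}{_DOCKER_TO_K8S[unit]}"
```

Passing already-normalized values through unchanged keeps the helper idempotent, which is convenient if a spec is normalized more than once on its way to `Sandbox.create`.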

Testing

  • 20 new unit tests (config, client, operator, factory).
  • tests/unit/sandbox/operator/ + tests/unit/test_config.py: 187 passed, no regressions.
  • ruff check clean.
  • Client calls verified against the installed opensandbox==0.1.13 SDK signatures (Sandbox.create/connect/resume/pause/kill, ConnectionConfig fields, info.status.state, sandbox.id) — not just mocks.

Known limitation (tracked for follow-up)

submit does not yet forward container env vars to Sandbox.create (env=None); Rock's runtime-env injection for the OpenSandbox backend is deferred to the Phase 2 exec/file seam. Documented in docs/plans/opensandbox-sdk-contract.md §5.

Follow-up

Phase 2 (proxy exec/file seam: extract SandboxRuntimeBackend, add OpenSandboxBackend, route by extended_params["backend"]) will land in a separate PR.
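
As a rough illustration of what routing by `extended_params["backend"]` could look like, here is a minimal sketch. Everything except the `extended_params["backend"]` key is hypothetical: the `select_backend` helper, the registry shape, and the `"rock"` default name are assumptions, not the Phase 2 design.

```python
from typing import Any, Mapping, Optional

# Hypothetical name for the existing (non-OpenSandbox) runtime path.
DEFAULT_BACKEND = "rock"


def select_backend(
    extended_params: Optional[Mapping[str, Any]],
    registry: Mapping[str, Any],
    default: str = DEFAULT_BACKEND,
) -> Any:
    """Pick a runtime backend from a registry using extended_params['backend'].

    Sandboxes created before the key existed carry no 'backend' entry,
    so they fall back to the default backend.
    """
    name = (extended_params or {}).get("backend", default)
    try:
        return registry[name]
    except KeyError:
        raise ValueError(f"unknown sandbox backend: {name!r}") from None
```

Falling back to a default keeps sandboxes that predate the `backend` key routable, while an unknown name fails loudly instead of silently using the wrong backend.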

🤖 Generated with Claude Code

zpzjzj and others added 6 commits July 2, 2026 15:06
Phase 0 deliverables for adding OpenSandboxOperator as a Rock backend
(Option B: delegate lifecycle + exec/file ops to OpenSandbox via its Python SDK).

refs alibaba#1202

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds OpenSandboxConfig dataclass (endpoint/api_key/protocol/runtime/
namespace/use_server_proxy/default_timeout), wires it into RockConfig and
RockConfig.from_env yaml parsing. Enables runtime.operator_type=opensandbox.

refs alibaba#1202

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
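
A minimal sketch of what such a config dataclass might look like. The field names come from the commit message above; the types, defaults, and the `from_dict` helper are assumptions for illustration and may not match rock/config.py.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass
class OpenSandboxConfig:
    # Field names follow the commit message; types and defaults are assumed.
    endpoint: str = ""
    api_key: Optional[str] = None
    protocol: str = "http"
    runtime: Optional[str] = None
    namespace: Optional[str] = None
    use_server_proxy: bool = False
    default_timeout: int = 300

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "OpenSandboxConfig":
        """Build a config from a parsed yaml mapping, ignoring unknown keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in raw.items() if key in known})
```

For example, `OpenSandboxConfig.from_dict({"endpoint": "localhost:8080", "namespace": "rock"})` yields a config with those two fields set and every other field left at its default.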
Implements Option B Phase 1 (lifecycle seam) for using OpenSandbox as a Rock
backend via its Python SDK:

- OpenSandboxClient: async facade over opensandbox.Sandbox with lazy SDK
  import (optional 'opensandbox' extra) and exception translation.
- OpenSandboxOperator(AbstractOperator): submit->create, stop->pause,
  restart->resume, delete->kill; state mapping and docker->k8s memory/cpu
  normalization; stores backend + opensandbox_id in extended_params.
- OperatorFactory: dispatch operator_type=opensandbox; OperatorContext gains
  opensandbox_config; admin main wires rock_config.opensandbox.
- pyproject: opensandbox optional extra.

Verified client calls against the installed opensandbox==0.1.13 SDK signatures.

refs alibaba#1202

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
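
The lazy SDK import with exception translation described above can be sketched roughly as follows. The error class `SandboxBackendUnavailableError` and the `load_sdk` helper are hypothetical names, not Rock's actual error types; only the pattern is the point.

```python
import importlib
from types import ModuleType


class SandboxBackendUnavailableError(RuntimeError):
    """Hypothetical Rock-side error raised when an optional SDK is missing."""


def load_sdk(module_name: str = "opensandbox") -> ModuleType:
    """Import the optional SDK on first use instead of at module import time.

    Deferring the import lets installations without the optional extra
    start normally; the failure only surfaces when this backend is used,
    and it surfaces as a Rock-level error rather than a bare ImportError.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise SandboxBackendUnavailableError(
            f"'{module_name}' is not installed; install the optional extra to use this backend"
        ) from exc
```

Chaining with `from exc` preserves the original ImportError in the traceback, which helps when the import fails for a reason other than a missing package.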
refs alibaba#1202

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The opensandbox backend delegates the full sandbox lifecycle to an external
OpenSandbox service, so admin no longer needs a local Ray cluster to boot with
operator_type=opensandbox. Adds operator_requires_ray() and gates
RayService.init() on it; GemManager already tolerates ray_service=None and the
lifecycle path dispatches through the operator. ray/k8s behavior unchanged.

refs alibaba#1202

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
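
A minimal sketch of the gating described above. `operator_requires_ray` is named in the commit message, but its signature and body here are assumptions; `maybe_init_ray` and the `init_ray` callable are hypothetical stand-ins for the call to `RayService.init()`.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Operator types whose lifecycle is handled by an external service (assumed set).
_EXTERNAL_LIFECYCLE_OPERATORS = frozenset({"opensandbox"})


def operator_requires_ray(operator_type: str) -> bool:
    """Return True unless the operator delegates lifecycle to an external service."""
    return operator_type.strip().lower() not in _EXTERNAL_LIFECYCLE_OPERATORS


def maybe_init_ray(operator_type: str, init_ray: Callable[[], T]) -> Optional[T]:
    """Run the Ray initializer only when the operator needs a local Ray cluster."""
    if operator_requires_ray(operator_type):
        return init_ray()
    return None  # callers must tolerate a missing Ray service
```

Expressing the rule as "everything except the external operators" leaves the existing ray and k8s paths untouched, which matches the stated goal of not changing their behavior.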
Real end-to-end run against a live OpenSandbox server surfaced this: get_state
connected with the default health check, so a *paused* sandbox failed the check
and blocked ~ready_timeout, making get_status report the sandbox as gone (None)
instead of STOPPED. get_state/pause/kill only need a handle, so connect() now
passes skip_health_check=True. Verified e2e: paused sandbox now reads STOPPED.

refs alibaba#1202

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
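
The failure mode and the fix can be reproduced with a small simulation. `FakeSandbox` below is a stand-in written for this sketch, not the real `opensandbox.Sandbox`; its state names, the `STATE_MAP` table, and the timeout behavior are assumptions that only mirror the description above.

```python
import asyncio
from typing import Optional


class FakeSandbox:
    """Stand-in for an SDK handle (not the real opensandbox.Sandbox)."""

    _states = {"sb-1": "Paused"}  # simulated server-side state

    def __init__(self, sandbox_id: str) -> None:
        self.id = sandbox_id
        self.state = self._states[sandbox_id]

    @classmethod
    async def connect(cls, sandbox_id: str, skip_health_check: bool = False) -> "FakeSandbox":
        handle = cls(sandbox_id)
        # A paused sandbox cannot pass a health check, so the default path fails.
        if not skip_health_check and handle.state != "Running":
            raise TimeoutError("health check did not pass before ready timeout")
        return handle


# Assumed mapping from backend state to Rock state, for illustration only.
STATE_MAP = {"Running": "RUNNING", "Paused": "STOPPED"}


async def get_status(sandbox_id: str, skip_health_check: bool) -> Optional[str]:
    """Return the mapped state, or None if no handle could be obtained."""
    try:
        handle = await FakeSandbox.connect(sandbox_id, skip_health_check=skip_health_check)
    except TimeoutError:
        return None
    return STATE_MAP.get(handle.state)
```

With the health check enabled, the paused sandbox looks as if it were gone (`None`); with `skip_health_check=True`, the same sandbox reads `STOPPED`, because reading state only needs a handle and not a live, healthy sandbox.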