Problem Statement
OpenShell's Rust Kubernetes client stack is significantly behind current upstream releases. The workspace currently pins kube and kube-runtime to 0.90, and k8s-openapi to 0.21.1 with the Kubernetes v1_26 generated API feature. Current upstream releases are kube/kube-runtime/kube-client/kube-core 4.0.0 and k8s-openapi 0.28.0, so updating is a multi-major compatibility task rather than a mechanical lockfile refresh.
The goal of this spike is to define the scope and risks for updating the Kubernetes client dependencies while preserving OpenShell's Kubernetes gateway behavior, sandbox lifecycle management, service-account bootstrap authentication, certificate generation, and Kubernetes e2e coverage.
Technical Context
The Kubernetes dependency surface is intentionally small but security-sensitive. openshell-driver-kubernetes uses kube-rs to construct in-cluster or inferred clients, create/list/get/delete/watch Agent Sandbox CRs, read Kubernetes Events, and list Nodes for GPU capacity checks. openshell-server uses kube-rs for the generate-certs Kubernetes Secret workflow and for the in-cluster ServiceAccount TokenReview bootstrap authenticator.
Upstream version checks performed during this spike:
cargo info kube@4.0.0: latest kube is 4.0.0, released with Rust MSRV 1.88.0; this matches OpenShell's workspace rust-version = "1.88".
cargo info kube-runtime@4.0.0, kube-client@4.0.0, kube-core@4.0.0: related kube-rs crates are aligned at 4.0.0.
cargo info k8s-openapi@0.28.0: latest k8s-openapi is 0.28.0; available Kubernetes feature flags are v1_32 through v1_36, with latest = v1_36.
kube-rs 4.0.0 release notes: adds Kubernetes v1_36 support via k8s-openapi 0.28, enables regular client retries by default, changes timeout behavior, makes client tracing opt-in, and preserves the prior ErrorResponse to Status migration from the 3.x line.
k8s-openapi v0.28.0 release notes: adds v1_36, drops support for Kubernetes 1.31, and lists corresponding API server versions v1.32.13 through v1.36.2.
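Version Target Summary
| kube / kube-runtime / kube-client | k8s-openapi | Kubernetes feature flags | Kubernetes EOL date per feature | YAML parser | Notes |
| --- | --- | --- | --- | --- | --- |
| 0.90.0 | 0.21.1; workspace selects v1_26 | v1_24-v1_29; OpenShell currently selects v1_26 | v1_24: 2023-07-28; v1_25: 2023-10-28; v1_26: 2024-02-28; v1_27: 2024-07-16; v1_28: 2024-10-22; v1_29: 2025-02-28 | kube-client depends on serde_yaml; current lockfile resolves serde_yaml 0.9.34+deprecated | The v1_26 API feature is long past EOL and the stack keeps the deprecated YAML parser. |
| 2.0.1 | 0.26.0 | v1_30-v1_34 | v1_30: 2025-07-15; v1_31: 2025-11-11; v1_32: 2026-02-28; v1_33: 2026-06-28; v1_34: 2026-10-27 | Still depends on serde_yaml; does not move to serde_saphyr | |
| 3.1.0 | 0.27.0 | v1_31-v1_35 | v1_31: 2025-11-11; v1_32: 2026-02-28; v1_33: 2026-06-28; v1_34: 2026-10-27; v1_35: 2027-02-28 | Still depends on serde_yaml; does not move to serde_saphyr | Strongest intermediate target before 4.0. It absorbs major API migrations while avoiding the kube 4.0 default retry/read-timeout/tracing behavior changes. |
| 4.0.0 | 0.28.0 | v1_32-v1_36 | v1_32: 2026-02-28; v1_33: 2026-06-28; v1_34: 2026-10-27; v1_35: 2027-02-28; v1_36: 2027-06-28 | Removes serde_yaml and uses serde-saphyr/serde_saphyr through kube-client | Cleanest dependency posture and the first target that removes the deprecated YAML parser, but includes kube 4.0 behavior changes around retries, timeouts, and tracing. |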
The version choice depends on whether this work is primarily a compatibility modernization or a dependency cleanup. If the goal is to de-risk the kube-rs API migration, 3.1.0 is the best initial target before 4.0. If removing serde_yaml is in scope for this spike, 4.0.0 is the first version that actually replaces it with serde_saphyr.
Required validation: do not treat the k8s-openapi feature range as the OpenShell runtime support matrix by itself. k8s-openapi selects the generated Rust Kubernetes API schema, while OpenShell's documented runtime minimum is Kubernetes 1.29+ with RBAC enabled. The implementation must validate the selected kube/k8s-openapi target against that documented minimum, or explicitly update the docs and release notes if maintainers decide to raise the minimum supported Kubernetes version.
Affected Components
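| Component | Key Files | Role |
| --- | --- | --- |
| Workspace dependency pins | Cargo.toml, Cargo.lock | Pins kube, kube-runtime, and k8s-openapi; selects the generated Kubernetes API feature. |
| | crates/openshell-driver-kubernetes/src/driver.rs, crates/openshell-driver-kubernetes/Cargo.toml | Creates and watches Agent Sandbox CRs, lists Nodes for GPU validation, maps Kubernetes Events to platform events, and converts sandbox specs into Kubernetes JSON. |
| | crates/openshell-server/src/auth/k8s_sa.rs, crates/openshell-server/src/lib.rs | |
| | crates/openshell-server/src/certgen.rs | |
| | docs/reference/sandbox-compute-drivers.mdx, crates/openshell-driver-kubernetes/README.md, .github/workflows/e2e-kubernetes-test.yml, tasks/test.toml | Documents Kubernetes behavior and runs Kind-based Kubernetes e2e coverage. |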
Technical Investigation
Architecture Overview
The gateway selects the Kubernetes compute runtime through ComputeRuntime::new_kubernetes, after parsing [openshell.drivers.kubernetes] and applying gateway defaults. The driver constructs a normal kube client plus a separate watch client with no read timeout, then creates dynamic Api<DynamicObject> handles for the Agent Sandbox CRD. Lifecycle RPCs call get, list, create, delete, and watcher::watcher; Kubernetes Events are watched in parallel and translated into progress/platform events.
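The watch handling described above can be kept behind a small adapter so that a change in the kube-rs watcher event shape touches one place. The following is a minimal sketch under stated assumptions: `WatchEvent`, `SandboxCache`, and the `(name, phase)` payload are hypothetical stand-ins, not kube-rs or OpenShell types. The `Init`/`InitApply`/`InitDone` variants mirror the relist shape that newer kube-rs releases are understood to use in place of `Restarted`; confirm the variant names against the chosen version.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a watcher event; NOT the kube-rs type.
/// Payloads are (sandbox name, phase) pairs for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum WatchEvent {
    Apply(String, String),
    Delete(String),
    Init,                      // a relist has started
    InitApply(String, String), // one object seen during the relist
    InitDone,                  // the relist is complete
}

/// Hypothetical cache of the driver's current view of sandboxes.
#[derive(Debug, Default)]
struct SandboxCache {
    live: BTreeMap<String, String>,
    pending: Option<BTreeMap<String, String>>,
}

impl SandboxCache {
    fn apply(&mut self, event: WatchEvent) {
        match event {
            WatchEvent::Apply(name, phase) => {
                self.live.insert(name, phase);
            }
            WatchEvent::Delete(name) => {
                self.live.remove(&name);
            }
            // Buffer relisted objects so readers never see a half-built view.
            WatchEvent::Init => self.pending = Some(BTreeMap::new()),
            WatchEvent::InitApply(name, phase) => {
                if let Some(pending) = self.pending.as_mut() {
                    pending.insert(name, phase);
                }
            }
            // Swap in the relisted set atomically; objects that were not
            // relisted disappear, which matches the old "restarted" semantics.
            WatchEvent::InitDone => {
                if let Some(pending) = self.pending.take() {
                    self.live = pending;
                }
            }
        }
    }
}
```

Buffering until `InitDone` keeps the previous view intact during a relist, which is the behavior the existing `Restarted` handling presumably relies on.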
The ServiceAccount bootstrap path is constructed only when the gateway is running in-cluster and has a sandbox JWT issuer. It uses kube::Client::try_default(), Api<TokenReview>, Api<Pod>, and Api<DynamicObject> to verify a sandbox pod's projected token, live pod UID, ownerReference, owning Sandbox CR UID, and sandbox-id label before minting a gateway JWT.
The certgen path uses Client::try_default() and typed Api<Secret> operations to implement idempotent TLS/JWT Secret creation in Helm hook contexts.
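As an illustration of the idempotent read-then-create pattern, here is a minimal sketch against an in-memory store. `SecretStore`, `InMemoryStore`, `ensure_secret`, and the error and outcome enums are hypothetical; in OpenShell the equivalent calls go through `Api<Secret>` with `get_opt` and `create`. Treating a create conflict as reuse is an assumption about how a concurrent Helm hook race should be handled, not a statement about the current certgen code.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum CreateError {
    AlreadyExists,
    Other(String),
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Created,
    Reused,
}

/// Hypothetical abstraction over the two Secret operations certgen needs.
trait SecretStore {
    fn get_opt(&self, name: &str) -> Option<Vec<u8>>;
    fn create(&mut self, name: &str, data: Vec<u8>) -> Result<(), CreateError>;
}

#[derive(Default)]
struct InMemoryStore {
    secrets: HashMap<String, Vec<u8>>,
}

impl SecretStore for InMemoryStore {
    fn get_opt(&self, name: &str) -> Option<Vec<u8>> {
        self.secrets.get(name).cloned()
    }

    fn create(&mut self, name: &str, data: Vec<u8>) -> Result<(), CreateError> {
        if self.secrets.contains_key(name) {
            return Err(CreateError::AlreadyExists);
        }
        self.secrets.insert(name.to_string(), data);
        Ok(())
    }
}

/// Creates the secret only if it is absent. Key material is generated lazily
/// so an existing secret is never regenerated or overwritten.
fn ensure_secret<S: SecretStore>(
    store: &mut S,
    name: &str,
    generate: impl FnOnce() -> Vec<u8>,
) -> Result<Outcome, CreateError> {
    if store.get_opt(name).is_some() {
        return Ok(Outcome::Reused);
    }
    match store.create(name, generate()) {
        Ok(()) => Ok(Outcome::Created),
        // Assumption: another hook run won the race, so reuse its secret.
        Err(CreateError::AlreadyExists) => Ok(Outcome::Reused),
        Err(other) => Err(other),
    }
}
```

Running `ensure_secret` twice with the same name creates the secret once and reuses it the second time, which is the property the upgrade should preserve.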
Code References
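| Location | Description |
| --- | --- |
| Cargo.toml:111 | Workspace pins kube = { version = "0.90", features = ["runtime", "derive"] }. |
| Cargo.toml:112 | Workspace pins kube-runtime = "0.90". |
| Cargo.toml:113 | Workspace pins k8s-openapi = { version = "0.21.1", features = ["v1_26"] }. |
| crates/openshell-driver-kubernetes/Cargo.toml:26 | Kubernetes driver depends on workspace kube. |
| crates/openshell-driver-kubernetes/Cargo.toml:27 | Kubernetes driver depends on workspace kube-runtime. |
| crates/openshell-driver-kubernetes/Cargo.toml:28 | Kubernetes driver depends on workspace k8s-openapi. |
| crates/openshell-server/Cargo.toml:31 | … kube. |
| crates/openshell-server/Cargo.toml:32 | … k8s-openapi. |
| crates/openshell-driver-kubernetes/src/driver.rs:57 | … KubeError::Api(api).code == 409 to AlreadyExists; kube-rs 3.x/4.x prefers Status helpers such as conflict/not-found predicates. |
| crates/openshell-driver-kubernetes/src/driver.rs:211 | |
| crates/openshell-driver-kubernetes/src/driver.rs:219 | |
| crates/openshell-driver-kubernetes/src/driver.rs:225 | … read_timeout = None. kube 4.0 changed default timeout behavior, so this should be revalidated. |
| crates/openshell-driver-kubernetes/src/driver.rs:259 | … GroupVersionKind and ApiResource. |
| crates/openshell-driver-kubernetes/src/driver.rs:271 | … Node resources to validate GPU capacity. |
| crates/openshell-driver-kubernetes/src/driver.rs:302 | |
| crates/openshell-driver-kubernetes/src/driver.rs:338 | |
| crates/openshell-driver-kubernetes/src/driver.rs:381 | |
| crates/openshell-driver-kubernetes/src/driver.rs:469 | |
| crates/openshell-driver-kubernetes/src/driver.rs:510 | … api.get. |
| crates/openshell-driver-kubernetes/src/driver.rs:525 | … watcher::watcher. |
| crates/openshell-driver-kubernetes/src/driver.rs:729 | |
| crates/openshell-driver-kubernetes/src/driver.rs:747 | Converts Event timestamps with t.0.timestamp_millis(); kube/k8s-openapi changed timestamp internals from chrono to jiff in later releases. |
| crates/openshell-server/src/lib.rs:315 | Enables K8s SA bootstrap only in-cluster and only when sandbox JWT issuing is enabled. |
| crates/openshell-server/src/lib.rs:321 | Builds default kube client for the K8s SA bootstrap authenticator. |
| crates/openshell-server/src/lib.rs:729 | Constructs the Kubernetes compute runtime. |
| crates/openshell-server/src/auth/k8s_sa.rs:158 | Builds typed TokenReview/Pod APIs and dynamic Sandbox CR API. |
| crates/openshell-server/src/auth/k8s_sa.rs:195 | Creates TokenReview objects through the Kubernetes API. |
| crates/openshell-server/src/auth/k8s_sa.rs:223 | Reads the sandbox pod with get_opt. |
| crates/openshell-server/src/auth/k8s_sa.rs:261 | Reads the owning Sandbox CR with get_opt. |
| crates/openshell-server/src/auth/k8s_sa.rs:371 | Validates pod ownerReferences against supported Agent Sandbox apiVersion/kind. |
| crates/openshell-server/src/auth/k8s_sa.rs:403 | Validates owning Sandbox CR UID and sandbox-id label. |
| crates/openshell-server/src/certgen.rs:123 | Constructs kube client and typed Secret API for Kubernetes cert generation. |
| crates/openshell-server/src/certgen.rs:140 | Reads existing JWT Secret with get_opt. |
| crates/openshell-server/src/certgen.rs:161 | Creates JWT Secret with typed Api<Secret>::create. |
| crates/openshell-server/src/certgen.rs:181 | Reads existing TLS Secrets with get_opt. |
| crates/openshell-server/src/certgen.rs:244 | Creates TLS and JWT Secrets in Kubernetes mode. |
| .github/workflows/e2e-kubernetes-test.yml:115 | Runs the Kind-backed Kubernetes e2e workflow. |
| docs/reference/sandbox-compute-drivers.mdx:286 | Published docs currently describe the Agent Sandbox CR integration. Update if supported Kubernetes/API assumptions change. |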
Current Behavior
OpenShell currently compiles against kube-rs 0.90 and k8s-openapi 0.21.1. The generated Kubernetes API surface is selected with k8s-openapi feature v1_26. Kube API errors are matched directly through KubeError::Api(api).code, and Kubernetes Event timestamps are treated as chrono-like values with timestamp_millis().
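One way to limit the blast radius of the error-type migration is to classify status codes in a single function rather than matching on the client error at each call site. The sketch below is illustrative only: `ApiStatus`, `DriverError`, and `classify` are hypothetical names, and `ApiStatus` is a stand-in for whatever status payload the chosen kube-rs version exposes, not the kube-rs type itself.

```rust
/// Hypothetical stand-in for the API status payload; NOT the kube-rs type.
#[derive(Debug, Clone)]
struct ApiStatus {
    code: u16,
    reason: String,
}

/// Hypothetical driver-level error categories.
#[derive(Debug, PartialEq, Eq)]
enum DriverError {
    AlreadyExists,
    NotFound,
    Upstream { code: u16, reason: String },
}

/// Single place where HTTP status codes become driver errors, so a change in
/// the client's error representation only needs an adapter into `ApiStatus`.
fn classify(status: &ApiStatus) -> DriverError {
    match status.code {
        409 => DriverError::AlreadyExists,
        404 => DriverError::NotFound,
        code => DriverError::Upstream {
            code,
            reason: status.reason.clone(),
        },
    }
}
```

With this shape, unit tests can pin the 409 and 404 mappings without constructing a real client error.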
The driver has explicit 30-second API operation timeouts and a watch client with no read timeout. kube 4.0 changes default timeout/retry behavior upstream, so the implementation should verify that OpenShell's explicit timeout strategy still bounds non-watch calls and does not unintentionally multiply retries around the gateway's existing tokio::time::timeout wrappers.
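To reason about how client-level retries interact with an outer deadline, a rough budget calculation can help. This is a sketch with assumed inputs: the per-attempt timeout, the doubling backoff, and the attempt cap are hypothetical parameters, not kube-rs defaults, and `attempts_started_before` is an illustrative helper rather than existing OpenShell code.

```rust
use std::time::Duration;

/// Counts how many attempts can start before an outer deadline fires, in the
/// worst case where every attempt runs for its full timeout and the backoff
/// doubles after each failure. All inputs are assumptions for illustration.
fn attempts_started_before(
    deadline: Duration,
    per_attempt: Duration,
    base_backoff: Duration,
    max_attempts: u32,
) -> u32 {
    let mut elapsed = Duration::ZERO;
    let mut backoff = base_backoff;
    let mut started = 0;
    while started < max_attempts && elapsed < deadline {
        started += 1;
        // Worst case: the attempt times out, then the client sleeps.
        elapsed += per_attempt + backoff;
        backoff *= 2;
    }
    started
}
```

For example, with a 30-second outer deadline, a 10-second per-attempt timeout, and a 1-second initial backoff, three attempts start before the wrapper fires. If the per-attempt timeout equals the outer deadline, only one attempt starts and the retries never take effect.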
What Would Need to Change
Update workspace dependency pins in Cargo.toml and refresh Cargo.lock for kube, kube-runtime, and k8s-openapi. Do not update only one of these; kube-rs release notes explicitly say to upgrade k8s-openapi with kube to avoid conflicts.
Choose the k8s-openapi Kubernetes feature target deliberately. 0.28.0 supports v1_32 through v1_36; the current workspace uses v1_26, so this may change the documented minimum tested Kubernetes API surface even if OpenShell only uses stable core resources.
Replace direct KubeError::Api(api).code matching with the current kube-rs Status-based helpers or equivalent non-deprecated checks. Affected cases include 409 conflict and 404 not found handling in the Kubernetes driver, and any API-version probing/fallback code if feat(kubernetes): support agent-sandbox v1beta1 #2009 lands first.
Update Kubernetes Event timestamp handling for k8s-openapi's chrono to jiff transition. map_kube_event_to_platform should continue producing millisecond Unix timestamps.
Recheck kube::Config timeout fields and Client::try_from/Client::try_default construction against kube 4.0. Explicit OpenShell timeouts should remain intentional after kube's default read-timeout changes.
Revalidate watcher behavior. The current code assumes watcher::watcher(...).try_next() yields Event::Applied, Event::Deleted, and Event::Restarted variants for both Sandbox CRs and Events.
Confirm typed Kubernetes resources still compile and serialize as expected: Node, Event, Pod, TokenReview, TokenReviewSpec, TokenReviewStatus, UserInfo, Secret, ObjectMeta, and ByteString.
Update crate README and published docs only if the dependency update changes the supported Kubernetes minor range, the recommended Agent Sandbox install, or runtime behavior visible to operators.
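For the timestamp item in the list above, the millisecond contract can be pinned independently of the time library. The helper below is a hypothetical sketch: `unix_millis` is not an existing OpenShell function, and it assumes the timestamp can be read as whole seconds plus non-negative sub-second nanoseconds, which is how chrono represents instants. jiff's `Timestamp` is understood to offer its own millisecond accessor, so the real change may be a one-line call; verify against the selected k8s-openapi version.

```rust
/// Converts whole Unix seconds plus non-negative sub-second nanoseconds into
/// Unix milliseconds, truncating below one millisecond.
/// Returns `None` if the result does not fit in an `i64`.
fn unix_millis(seconds: i64, subsec_nanos: u32) -> Option<i64> {
    let whole = seconds.checked_mul(1_000)?;
    whole.checked_add(i64::from(subsec_nanos / 1_000_000))
}
```

A unit test that feeds a fixed instant through the real conversion and compares it with this reference value would catch a silent change in units or rounding during the chrono to jiff move.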
Alternative Approaches Considered
Jump directly to latest: kube/kube-runtime 4.0.0 and k8s-openapi 0.28.0. This is the cleanest dependency posture and matches OpenShell's current Rust 1.88 MSRV, but it requires resolving all API breaks and deciding whether the Kubernetes feature target should be v1_32, v1_36, or another supported minor.
Stage through an intermediate major, such as kube 1.x or 2.x. This may make compile errors easier to isolate, but it creates extra dependency churn and still leaves OpenShell behind current kube-rs.
Update only k8s-openapi or only kube-rs. This is not recommended; kube-rs release notes repeatedly warn to upgrade k8s-openapi with kube to avoid conflicts.
Patterns to Follow
Keep Kubernetes dependencies centralized in workspace dependencies, as they are today.
Preserve explicit operation timeouts around Kubernetes API calls in the driver. The code currently wraps get/list/create/delete calls with tokio::time::timeout and uses a separate no-read-timeout watch client.
Keep Kubernetes API use contained to openshell-driver-kubernetes and the narrow server paths that already require it: certgen and ServiceAccount bootstrap auth.
Keep unit coverage close to the affected code, following existing tests in crates/openshell-driver-kubernetes/src/driver.rs, crates/openshell-server/src/auth/k8s_sa.rs, and crates/openshell-server/src/certgen.rs.
Run Kubernetes e2e for behavioral validation, not only compile/unit tests. The risk is integration behavior against a real API server and Agent Sandbox controller.
Proposed Approach
Update the kube-rs stack together in one branch, targeting the current upstream major unless maintainers choose an intermediate version for compatibility reasons. Start by changing workspace dependency pins and selecting a k8s-openapi feature target, then fix compile breaks in the Kubernetes driver, K8s SA authenticator, and certgen paths. Treat kube-rs retry/timeout changes as behavior changes to verify, not just compile fallout. Once unit tests pass, run the Kubernetes e2e path against Kind and confirm sandbox create/watch/delete, ServiceAccount bootstrap, certgen hook behavior, and Kubernetes Event progress mapping.
Scope Assessment
Complexity: Medium
Confidence: Medium - dependency graph is small, but the update crosses multiple kube-rs major releases and changes generated Kubernetes API versions.
Estimated files to change: 5-10
Issue type: chore
Risks & Open Questions
Which k8s-openapi feature should OpenShell use after the update: v1_32, v1_36, or another supported minor? This determines the generated API surface and may affect documented Kubernetes compatibility.
Does OpenShell need to continue claiming support for clusters older than the k8s-openapi 0.28 feature range? If yes, maintainers may need to choose an older kube-rs target or document the new tested minimum.
The current published Kubernetes setup docs require Kubernetes 1.29+ with RBAC enabled. Any target whose generated API feature no longer includes v1_29 must be validated against a Kubernetes 1.29 API server, or the documented minimum must be raised intentionally.
kube 4.0 enables regular client retries by default. Confirm this does not interact badly with OpenShell's explicit 30-second tokio::time::timeout wrappers or produce surprising latency under API server failures.
k8s-openapi timestamp internals changed through the chrono to jiff migration. Ensure Kubernetes Event progress timestamps remain correct.
Coordinate with PR feat(kubernetes): support agent-sandbox v1beta1 #2009 if it lands first, because that PR touches the same dynamic Sandbox API construction and API-error fallback paths for Agent Sandbox v1beta1/v1alpha1 support.
LSM impact: none expected. This dependency update does not touch process identity, /proc, file labels, binary execution, or inter-process visibility. SELinux/AppArmor-sensitive behavior should be limited to existing sandbox runtime/e2e environments, not kube client calls themselves.
Test Considerations
Run mise exec -- cargo check -p openshell-driver-kubernetes and mise exec -- cargo check -p openshell-server after dependency changes to catch API breaks quickly.
Run mise exec -- cargo test -p openshell-driver-kubernetes --lib for driver conversion, event mapping, GPU validation, and Kubernetes spec rendering tests.
Run mise exec -- cargo test -p openshell-server auth::k8s_sa --lib for ServiceAccount bootstrap logic.
Run mise exec -- cargo test -p openshell-server certgen --lib or the relevant server unit subset for Kubernetes Secret generation logic.
Run the Kubernetes e2e path with mise run e2e:kubernetes, or rely on the test:e2e-kubernetes CI workflow if local Kind/k3d is unavailable.
Validate the chosen dependency target against the documented Kubernetes minimum, currently Kubernetes 1.29+ with RBAC enabled. At minimum, render the Helm chart for Kubernetes 1.29 and run a Kubernetes e2e smoke test against a 1.29-compatible API server, covering sandbox create/watch/delete, TokenReview bootstrap, Secret certgen, Node listing, Event watching, and Agent Sandbox CR access.
Validate supervisor sideload behavior across the version boundary: Kubernetes < 1.35 should render/use init-container; Kubernetes >= 1.35 should render/use image-volume unless explicitly overridden. If testing image-volume on 1.33 or 1.34, the ImageVolume feature gate must be enabled.
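The sideload version boundary in the last item can be expressed as a small decision function, which makes the cases to test explicit. This is a sketch only: `SideloadMode` and `select_sideload_mode` are hypothetical names, and the real selection may live in Helm templates rather than Rust.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SideloadMode {
    InitContainer,
    ImageVolume,
}

/// Picks the supervisor sideload mode for a cluster version.
/// An explicit override always wins; otherwise Kubernetes 1.35 and newer
/// default to image-volume and older versions to init-container.
/// Note: overriding to ImageVolume on 1.33 or 1.34 still requires the
/// ImageVolume feature gate on the cluster, which this function cannot check.
fn select_sideload_mode(
    major: u32,
    minor: u32,
    override_mode: Option<SideloadMode>,
) -> SideloadMode {
    if let Some(mode) = override_mode {
        return mode;
    }
    if (major, minor) >= (1, 35) {
        SideloadMode::ImageVolume
    } else {
        SideloadMode::InitContainer
    }
}
```

The boundary cases worth covering are 1.34 and 1.35 without an override, plus one override in each direction.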
No docs/reference/gateway-config.mdx update is expected unless the update changes gateway TOML fields, driver-specific config options, defaults, or Helm rendering. Update docs/reference/sandbox-compute-drivers.mdx, crates/openshell-driver-kubernetes/README.md, or Kubernetes setup docs if minimum supported Kubernetes/API assumptions change.
Created by spike investigation. Use build-from-issue to plan and implement.