feat (auto scale) : add StackableScaler CRD rollout and admission webhook #411
Open
soenkeliebau wants to merge 5 commits into main
Add a mutating admission webhook that guards StackableScaler replicas changes during active scaling operations, preventing state machine corruption from concurrent HPA updates. Includes CRD YAML, RBAC roles, webhook registration, and generated Cargo/Nix files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the inline pattern match against individual ScalerStage variants with the new is_scaling_in_progress() method from operator-rs. This removes the ScalerStage import and ensures the webhook stays correct if new stages are added to the state machine. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove cluster-kind label injection (no longer needed with .owns()). Narrow to UPDATE operations only. The handler returns allow/deny without patches, making it functionally a validating webhook. The MutatingWebhook framework is retained because stackable-webhook does not yet provide a ValidatingWebhook type. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Part of the ReplicasConfig rewrite: the label-injection webhook is no longer needed (replaced by owner references and .owns()), so the webhook description and code are updated to reflect validation-only scope.
- Update CLI help text: remove label-injection reference, describe webhook as rejecting spec.replicas changes during active scaling.
- Simplify deny logic in scaler_admission_handler: use `if let` with `filter()` instead of `is_some_and()` + separate `stage_str` variable, removing the unnecessary "unknown" fallback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Rolls out the `StackableScaler` CRD (defined in operator-rs) and adds a validation webhook that rejects `spec.replicas` changes on StackableScaler resources while a scaling operation is in progress.
fixes stackabletech/issues#667
- `extra/crds.yaml` with full schema (spec: `replicas`, `clusterRef`, `role`, `roleGroup`; status: `currentState` with stage enum, `replicas`, `desiredReplicas`, `selector`) and `/scale` + `/status` subresources
- Admission webhook (`scaler-admission.stackable.tech`) -- targets UPDATE operations on `stackablescalers.autoscaling.stackable.tech/v1alpha1`. On `spec.replicas` change, fetches the live object to inspect `status.current_state.stage` (Kubernetes strips `.status` from admission review `oldObject` for CRDs with a status subresource). Denies the update if scaling is in progress (any stage other than Idle or Failed). Failure policy is `Fail`.
- webhook for future multi-version support
- `StackableScaler::merged_crd()` added to the `crd` subcommand
- `--disable-scaler-admission-webhook` to skip the webhook (matches existing `--disable-restarter-mutating-webhook` pattern)
- `get`, `list`, `watch` on `stackablescalers` in `autoscaling.stackable.tech` for the webhook's live-object fetch

Motivation
The StackableScaler state machine assumes `spec.replicas` remains stable while a scaling operation is in progress. If the HPA writes a new value mid-flight, the operator would see conflicting desired replica counts and the state machine's `previous_replicas`/`desired_replicas` bookkeeping breaks down. The admission webhook enforces this invariant at the API server level, before the write reaches etcd.
The webhook is validation-only (no mutations) but is implemented as a MutatingWebhook
because Kubernetes evaluates mutating webhooks before validating webhooks -- this ensures
the check runs before any other validating webhook that might depend on the replica count.
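
The deny rule described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not the code in this PR: the `ScalerStage` enum here is a local stand-in listing the five stages named in this description (the real type lives in operator-rs and may look different), `admit_replicas_update` is a hypothetical helper, and treating a missing status as "allow" is an assumption based on the `if let` + `filter()` shape mentioned in the commit message.

```rust
/// Local stand-in for the stages named in this PR description.
/// Assumption: the real enum in operator-rs may carry data or more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ScalerStage {
    Idle,
    PreScaling,
    Scaling,
    PostScaling,
    Failed,
}

impl ScalerStage {
    /// True for every stage other than Idle or Failed.
    pub fn is_scaling_in_progress(self) -> bool {
        !matches!(self, ScalerStage::Idle | ScalerStage::Failed)
    }
}

/// Hypothetical decision helper: `Ok(())` admits the update, `Err(reason)` denies it.
/// `live_stage` stands for the stage read from the live object; `None` means
/// the object has no status yet (assumed here to be allowed).
pub fn admit_replicas_update(
    old_replicas: i32,
    new_replicas: i32,
    live_stage: Option<ScalerStage>,
) -> Result<(), String> {
    // Updates that leave spec.replicas untouched are always admitted.
    if old_replicas == new_replicas {
        return Ok(());
    }
    // Deny only when a stage is known and it counts as "in progress".
    if let Some(stage) = live_stage.filter(|s| s.is_scaling_in_progress()) {
        return Err(format!(
            "spec.replicas cannot change from {} to {} while scaling is in progress (stage: {:?})",
            old_replicas, new_replicas, stage
        ));
    }
    Ok(())
}
```

For example, `admit_replicas_update(3, 5, Some(ScalerStage::Scaling))` returns an `Err`, while the same change with `Some(ScalerStage::Idle)` or `Some(ScalerStage::Failed)` returns `Ok(())`. Expressing the check as "not Idle and not Failed" rather than listing the in-progress stages means a newly added stage would be treated as in progress by default, which matches the intent stated in the commit that introduced `is_scaling_in_progress()`.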
Dependencies
- `ScalerStage::is_scaling_in_progress()` (see feat(auto scaling): add StackableScaler CRD, state machine, ReplicasConfig, and scaling hooks operator-rs#1181)
- `../operator-rs/crates/stackable-operator` (development dependency, will be updated to git tag before merge)
Test plan
- `cargo test --all-features` passes
- `cargo clippy --all-targets --all-features -- -D warnings` clean
- Webhook denies `spec.replicas` update when scaler is in PreScaling/Scaling/PostScaling stage
- Webhook allows `spec.replicas` update when scaler is Idle or Failed
- Webhook allows updates that do not change `spec.replicas` regardless of stage
- `--disable-scaler-admission-webhook` flag prevents webhook registration

Author
Reviewer
Acceptance
- `type/deprecation` label & add to the deprecation schedule
- `type/experimental` label & add to the experimental features tracker