feat (auto scale) : add StackableScaler CRD rollout and admission webhook by soenkeliebau · Pull Request #411 · stackabletech/commons-operator

soenkeliebau · 2026-03-25T11:53:39Z

Summary

Rolls out the StackableScaler CRD (defined in operator-rs) and adds a validation webhook
that rejects spec.replicas changes on StackableScaler resources while a scaling operation
is in progress.

Fixes stackabletech/issues#667

CRD rollout -- StackableScaler CRD definition added to extra/crds.yaml with full
schema (spec: replicas, clusterRef, role, roleGroup; status: currentState with
stage enum, replicas, desiredReplicas, selector) and /scale + /status
subresources
Admission webhook (scaler-admission.stackable.tech) -- targets UPDATE operations on
stackablescalers.autoscaling.stackable.tech/v1alpha1. On spec.replicas change, fetches
the live object to inspect status.current_state.stage (Kubernetes strips .status from
admission review oldObject for CRDs with a status subresource). Denies the update if
scaling is in progress (any stage other than Idle or Failed). Failure policy is Fail.
Conversion webhook registration -- StackableScaler added to the existing conversion
webhook for future multi-version support
CRD schema output -- StackableScaler::merged_crd() added to the crd subcommand
CLI flag -- --disable-scaler-admission-webhook to skip the webhook (matches existing
--disable-restarter-mutating-webhook pattern)
RBAC -- grants get, list, watch on stackablescalers in
autoscaling.stackable.tech for the webhook's live-object fetch
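
The admission rule described above can be sketched in plain Rust. This is a minimal illustration, not the code in this PR: the `ScalerStage` enum below is a hypothetical mirror of the stages named in the test plan (the real type lives in operator-rs and may differ), and `Decision` and `admit_update` are made-up names. Treating a missing stage as "allow" is also an assumption.

```rust
/// Hypothetical mirror of the scaler stages named in this PR.
/// The real `ScalerStage` is defined in operator-rs and may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ScalerStage {
    Idle,
    PreScaling,
    Scaling,
    PostScaling,
    Failed,
}

impl ScalerStage {
    /// Any stage other than `Idle` or `Failed` counts as in progress.
    pub fn is_scaling_in_progress(self) -> bool {
        !matches!(self, ScalerStage::Idle | ScalerStage::Failed)
    }
}

/// Illustrative outcome type, not the webhook's actual response type.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny(String),
}

/// Decide whether an UPDATE may pass.
///
/// `live_stage` stands for the stage read from the live object, since the
/// admission review does not carry `.status`. `None` models a scaler that
/// has no status yet; allowing that case is an assumption of this sketch.
pub fn admit_update(
    old_replicas: i32,
    new_replicas: i32,
    live_stage: Option<ScalerStage>,
) -> Decision {
    // Updates that leave spec.replicas untouched are always allowed.
    if old_replicas == new_replicas {
        return Decision::Allow;
    }
    match live_stage {
        Some(stage) if stage.is_scaling_in_progress() => Decision::Deny(format!(
            "spec.replicas cannot change while scaling is in progress (stage: {stage:?})"
        )),
        _ => Decision::Allow,
    }
}
```

For example, `admit_update(3, 5, Some(ScalerStage::Scaling))` yields a `Deny`, while the same change with `Some(ScalerStage::Idle)` or `Some(ScalerStage::Failed)` yields `Allow`.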

Motivation

The StackableScaler state machine assumes spec.replicas remains stable while a scaling
operation is in progress. If the HPA writes a new value mid-flight, the operator sees
conflicting desired replica counts and the state machine's previous_replicas /
desired_replicas bookkeeping breaks down. The admission webhook enforces this invariant
at the API server level, before the write reaches etcd.

The webhook is validation-only (no mutations) but is implemented as a MutatingWebhook
because Kubernetes evaluates mutating webhooks before validating webhooks -- this ensures
the check runs before any other validating webhook that might depend on the replica count.

Dependencies

operator-rs: StackableScaler CRD definition and ScalerStage::is_scaling_in_progress()
(see feat(auto scaling): add StackableScaler CRD, state machine, ReplicasConfig, and scaling hooks operator-rs#1181)
Currently uses a local path patch to ../operator-rs/crates/stackable-operator (development
dependency; will be switched to a git tag before merge)

Test plan

cargo test --all-features passes
cargo clippy --all-targets --all-features -- -D warnings clean
Webhook correctly denies spec.replicas update when scaler is in PreScaling/Scaling/
PostScaling stage
Webhook allows spec.replicas update when scaler is Idle or Failed
Webhook allows updates that don't change spec.replicas regardless of stage
--disable-scaler-admission-webhook flag prevents webhook registration
CRD schema output includes StackableScaler definition
Integration: HPA updates are blocked during active scaling, unblocked after completion

Author

Changes are OpenShift compatible
CRD changes approved
CRD documentation for all fields, following the style guide.
Helm chart can be installed and deployed operator works
Integration tests passed (for non-trivial changes)
Changes need to be "offline" compatible
Links to generated (nightly) docs added
Release note snippet added

Reviewer

Code contains useful comments
Code contains useful logging statements
(Integration-)Test cases added
Documentation added or updated. Follows the style guide.
Changelog updated
Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

Feature Tracker has been updated
Proper release label has been added
Links to generated (nightly) docs added
Release note snippet added
Add type/deprecation label & add to the deprecation schedule
Add type/experimental label & add to the experimental features tracker

Add a mutating admission webhook that guards StackableScaler replicas changes during active scaling operations, preventing state machine corruption from concurrent HPA updates. Includes CRD YAML, RBAC roles, webhook registration, and generated Cargo/Nix files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace the inline pattern match against individual ScalerStage variants with the new is_scaling_in_progress() method from operator-rs. This removes the ScalerStage import and ensures the webhook stays correct if new stages are added to the state machine.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove cluster-kind label injection (no longer needed with .owns()). Narrow to UPDATE operations only. The handler returns allow/deny without patches, making it functionally a validating webhook. The MutatingWebhook framework is retained because stackable-webhook does not yet provide a ValidatingWebhook type.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Part of the ReplicasConfig rewrite: the label-injection webhook is no longer needed (replaced by owner references and .owns()), so the webhook description and code are updated to reflect validation-only scope.
- Update CLI help text: remove label-injection reference, describe webhook as rejecting spec.replicas changes during active scaling.
- Simplify deny logic in scaler_admission_handler: use `if let` with `filter()` instead of `is_some_and()` + separate `stage_str` variable, removing the unnecessary "unknown" fallback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
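
The `if let` plus `filter()` simplification mentioned in that commit can be illustrated with a small standalone sketch. The actual handler code is not shown on this page, so both functions below are guesses at the general shape, using a hypothetical `Stage` enum and made-up function names. The point is only that binding the stage inside `if let` makes the "unknown" fallback unreachable, so it can be dropped.

```rust
/// Hypothetical stand-in for the scaler stage type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Stage {
    Idle,
    Scaling,
    Failed,
}

impl Stage {
    fn in_progress(self) -> bool {
        matches!(self, Stage::Scaling)
    }
}

/// Guess at the earlier shape: a boolean test via `is_some_and()` plus a
/// separately computed label that needs an "unknown" fallback.
fn deny_reason_before(stage: Option<Stage>) -> Option<String> {
    let stage_str = stage
        .map(|s| format!("{s:?}"))
        .unwrap_or_else(|| "unknown".to_string());
    if stage.is_some_and(|s| s.in_progress()) {
        Some(format!("scaling in progress (stage: {stage_str})"))
    } else {
        None
    }
}

/// Simplified shape: `filter()` keeps the stage only when it is in progress,
/// and `if let` binds it, so no fallback label is needed.
fn deny_reason_after(stage: Option<Stage>) -> Option<String> {
    if let Some(s) = stage.filter(|s| s.in_progress()) {
        Some(format!("scaling in progress (stage: {s:?})"))
    } else {
        None
    }
}
```

Both functions return the same result for every input; the second just avoids computing a label that is only ever used when a stage is present.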

soenkeliebau and others added 5 commits March 11, 2026 09:39

Added scaling functionality for trino-operator

43604f8

soenkeliebau mentioned this pull request Mar 25, 2026

feat(auto scaling): implement NiFi auto-scaling with graceful node decommissioning stackabletech/nifi-operator#915

Open

30 tasks

soenkeliebau wants to merge 5 commits into main from feat/autoscale
