feat(auto scaling): add StackableScaler CRD, state machine, ReplicasConfig, and scaling hooks#1181

Open
soenkeliebau wants to merge 9 commits into main from feat/autoscale

Conversation

@soenkeliebau (Member)

Summary

Introduces the StackableScaler CRD and supporting framework for HPA-integrated scaling
with operator-controlled lifecycle hooks. The HPA targets the StackableScaler's /scale
subresource instead of the StatefulSet directly, giving product operators the opportunity
to run pre/post-scale hooks (e.g. NiFi node offloading, Trino graceful shutdown) before
replica changes propagate.

fixes stackabletech/issues#667

Key additions:

  • StackableScaler CRD (autoscaling.stackable.tech/v1alpha1) with /scale subresource,
    5-stage state machine (Idle -> PreScaling -> Scaling -> PostScaling -> Idle, plus Failed),
    and annotation-based retry recovery
  • ReplicasConfig enum -- Fixed(u16), Hpa(HpaConfig), Auto(AutoConfig),
    ExternallyScaled -- with flexible deserialization (bare integer, string, or typed object)
    and validation
  • ScalingHooks trait with pre_scale(), post_scale(), and on_failure() lifecycle
    hooks, ScalingContext with direction-aware helpers (removed_ordinals(),
    added_ordinals()), and HookOutcome (Done/InProgress) for async hook execution
  • reconcile_scaler() -- state machine reconciler that drives stage transitions, calls
    hooks, handles failures, and returns ScalingResult with requeue action + ScalingCondition
    for cluster CR status propagation
  • build_scaler() / build_hpa_from_user_spec() -- builders for StackableScaler and HPA
    objects with deterministic naming, standard app.kubernetes.io labels, and owner references
  • initialize_scaler_status() -- seeds scaler status on first reconcile to prevent
    scale-to-zero (status defaults to replicas: 0 before initialization)
  • JobTracker -- idempotent Kubernetes Job lifecycle manager for hook execution with
    DNS-safe name generation and automatic cleanup
  • ClusterResource / DeepMerge impls for StackableScaler, enabling ClusterResources
    lifecycle management and orphan cleanup
  • Versioned macro extension -- #[versioned(k8s(scale(...)))] attribute support for
    generating CRDs with the /scale subresource
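The hook surface from the list above can be sketched roughly as follows. The names (`ScalingHooks`, `HookOutcome`, `removed_ordinals()`, `added_ordinals()`) come from the summary, but the concrete signatures are assumptions — the real trait presumably takes a Kubernetes client and is async:

```rust
/// Outcome of a single hook invocation. Hooks may span several reconcile
/// cycles (e.g. waiting for a NiFi offload Job), so `InProgress` signals
/// the reconciler to requeue until the work completes.
#[derive(Debug, PartialEq)]
pub enum HookOutcome {
    Done,
    InProgress,
}

/// Direction-aware view of a pending replica change.
pub struct ScalingContext {
    pub previous_replicas: i32,
    pub desired_replicas: i32,
}

impl ScalingContext {
    /// Ordinals of pods removed on scale-down (empty on scale-up);
    /// StatefulSets remove the highest ordinals first.
    pub fn removed_ordinals(&self) -> Vec<i32> {
        (self.desired_replicas..self.previous_replicas).collect()
    }

    /// Ordinals of pods added on scale-up (empty on scale-down).
    pub fn added_ordinals(&self) -> Vec<i32> {
        (self.previous_replicas..self.desired_replicas).collect()
    }
}

/// Lifecycle hooks a product operator implements to intercept scaling.
pub trait ScalingHooks {
    fn pre_scale(&self, ctx: &ScalingContext) -> HookOutcome;
    fn post_scale(&self, ctx: &ScalingContext) -> HookOutcome;
    fn on_failure(&self, ctx: &ScalingContext);
}
```

Scaling 5 → 3 would hand `pre_scale()` a context whose `removed_ordinals()` is `[3, 4]`, letting a product operator drain exactly those pods before the StatefulSet shrinks.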

Motivation

Kubernetes HPAs can target StatefulSets directly, but this bypasses the operator -- there is
no interception point for product-specific lifecycle tasks before pods are terminated. For
NiFi, this means flowfile data is not offloaded before node removal, causing data loss. For
Trino, active queries are killed mid-execution. The StackableScaler provides that
interception point as a generic, reusable framework.

Notable design choices

  • Spec contains only replicas: i32 -- identity is derived from owner references and
    app.kubernetes.io labels. This keeps the CRD minimal and avoids redundant fields that
    could drift from the cluster CR.
  • Failed is a terminal trap state -- the state machine cannot leave Failed without an
    explicit autoscaling.stackable.tech/retry annotation. This prevents infinite retry loops
    on persistent hook failures and requires human acknowledgment.
  • ReplicasConfig custom deserialization -- a single field accepts replicas: 3 (Fixed),
    replicas: externallyScaled, or replicas: { hpa: { maxReplicas: 10, ... } },
    maintaining backward compatibility with existing integer-based configs.
  • HpaConfig wraps HorizontalPodAutoscalerSpec -- users provide standard HPA fields
    (maxReplicas, minReplicas, metrics, behavior); scaleTargetRef is overwritten
    internally to point at the StackableScaler.
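The acceptance rules behind that flexible `replicas` field can be illustrated with a plain parser over the scalar forms. This is a sketch only — the PR implements it as a custom serde `Deserialize` impl, and the object form (`{ hpa: ... }`, `{ auto: ... }`) is handled there as well:

```rust
// Illustrative subset of ReplicasConfig; the Hpa(HpaConfig) and
// Auto(AutoConfig) variants are reached via the typed object form
// in the real Deserialize impl and are omitted here.
#[derive(Debug, PartialEq)]
enum ReplicasConfig {
    Fixed(u16),
    ExternallyScaled,
}

fn parse_replicas_scalar(raw: &str) -> Result<ReplicasConfig, String> {
    // Bare integer -> Fixed, keeping old `replicas: 3` configs valid.
    if let Ok(n) = raw.parse::<u16>() {
        return Ok(ReplicasConfig::Fixed(n));
    }
    match raw {
        // Bare keyword string -> marker variant with no payload.
        "externallyScaled" => Ok(ReplicasConfig::ExternallyScaled),
        other => Err(format!("invalid replicas value: {other}")),
    }
}
```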

Test plan

  • cargo test --all-features passes -- unit tests cover state machine transitions,
    ReplicasConfig deserialization/validation, builder output, job naming, hook direction
    derivation, and serialization round-trips
  • cargo clippy --all-targets --all-features -- -D warnings runs clean
  • cargo doc --no-deps --all-features produces no warnings
  • Verify StackableScaler CRD schema generates correctly with scale subresource paths
  • Integration test with commons-operator and nifi-operator PRs: HPA targets
    StackableScaler, state machine drives hooks through full scale-up/scale-down cycle

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Integration tests passed (for non-trivial changes)
  • Changes need to be "offline" compatible

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added

soenkeliebau and others added 9 commits March 11, 2026 09:38
Add the StackableScaler custom resource and a generic reconciler that
drives a multi-stage scaling state machine (Idle → PreScaling → Scaling →
PostScaling → Idle). Operators implement the ScalingHooks trait to plug
in product-specific logic (e.g. data offload before scale-down).

Key components:
- CRD types with serde/JsonSchema support for status subresource
- Hook trait with pre_scale, post_scale, and on_failure callbacks
- Reconciler that advances stages, patches status, and handles failures
- JobTracker for coordinating async hook operations
- ScalingContext helpers for direction detection and ordinal computation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…knowledge

The admission webhook in commons-operator was pattern-matching on
ScalerStage variants to determine whether a scaling operation blocks
HPA writes. This duplicated the "which stages are active" logic,
creating a maintenance risk: adding a new stage to the state machine
would require updating the webhook's match arm in a separate crate.

Move this knowledge into a single method on ScalerStage so both the
reconciler and the webhook can query it without enumerating variants.
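The centralization this commit describes might look like the sketch below. The stage names come from the PR; the method name and the exact set of "active" stages are assumptions:

```rust
/// Stages of the StackableScaler state machine, per the PR summary.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ScalerStage {
    Idle,
    PreScaling,
    Scaling,
    PostScaling,
    Failed,
}

impl ScalerStage {
    /// True while a scaling operation is in flight, i.e. while the
    /// admission webhook should reject concurrent HPA writes. Adding a
    /// new stage now only requires updating this one match.
    fn blocks_hpa_writes(self) -> bool {
        matches!(self, Self::PreScaling | Self::Scaling | Self::PostScaling)
    }
}
```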

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The design doc (ADR Decision 9) specified that Failed is a terminal trap
state with annotation-based recovery, but the implementation was missing.

When the operator sees `autoscaling.stackable.tech/retry: "true"` on a
StackableScaler in the Failed stage, it now:
1. Strips the annotation via merge patch
2. Resets status.currentState.stage to Idle
3. Clears desired_replicas and previous_replicas
4. Requeues so the next reconcile can start a fresh scaling attempt

Usage:
  kubectl annotate stackablescaler <name> autoscaling.stackable.tech/retry=true
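The recovery steps can be captured as pure logic, separate from the API calls. The annotation key comes from the commit message; the struct shapes and function name here are illustrative assumptions:

```rust
use std::collections::BTreeMap;

const RETRY_ANNOTATION: &str = "autoscaling.stackable.tech/retry";

#[derive(Debug, PartialEq)]
enum Stage {
    Idle,
    Failed,
}

/// Assumed shape of status.currentState for this sketch.
struct CurrentState {
    stage: Stage,
    desired_replicas: Option<i32>,
    previous_replicas: Option<i32>,
}

/// Returns true when a retry was requested and the state was reset;
/// the caller then strips the annotation via merge patch and requeues
/// so the next reconcile starts a fresh scaling attempt.
fn recover_if_requested(
    annotations: &BTreeMap<String, String>,
    state: &mut CurrentState,
) -> bool {
    let retry_requested =
        annotations.get(RETRY_ANNOTATION).map(String::as_str) == Some("true");
    if state.stage != Stage::Failed || !retry_requested {
        return false;
    }
    state.stage = Stage::Idle;
    state.desired_replicas = None;
    state.previous_replicas = None;
    true
}
```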

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces ReplicasConfig with Fixed, Hpa, Auto, and ExternallyScaled
variants to replace the simple `replicas: Option<u16>` on role groups.
Includes custom Deserialize impl (bare integer, string, tagged object),
validation via snafu, JsonSchema, and comprehensive tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove cluster_ref, role, and role_group from StackableScalerSpec since
identity is now conveyed via owner references and labels set by callers.
Also remove the UnknownClusterRef struct which was only used by the spec.

The reconcile_scaler() function now accepts role_group_name as an
explicit parameter instead of reading it from the spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Autoscaler

Enable StackableScaler and HPA to be managed through ClusterResources.add(),
providing label validation and orphan cleanup. Adds DeepMerge implementations
for StackableScaler and its status types, and registers both resource types
in delete_orphaned_resources().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ler objects

Provides a shared helper that product operators use to construct
StackableScaler resources with the required labels (name, instance,
managed-by, component, role-group) and owner reference, ensuring
ClusterResources.add() validation passes consistently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three public helpers to the scaler module:

- `scale_target_ref()`: builds a CrossVersionObjectReference pointing
  at a StackableScaler, for use as an HPA's scaleTargetRef.
- `build_hpa_from_user_spec()`: constructs a HorizontalPodAutoscaler
  from a user-provided spec, overwriting scaleTargetRef to target the
  correct StackableScaler and applying the standard 5-label set.
- `initialize_scaler_status()`: patches a freshly created scaler's
  status subresource with the current replica count and Idle stage,
  preventing the scale-to-zero edge case on first reconcile.

Also makes BuildScalerError's snafu context selectors pub(super) so
sibling modules can reuse them via .context().
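A minimal sketch of `scale_target_ref()`, assuming the CRD group/version from this PR; the local struct stands in for k8s-openapi's autoscaling/v2 `CrossVersionObjectReference`, and the example scaler name is hypothetical:

```rust
/// Stand-in for k8s-openapi's CrossVersionObjectReference.
#[derive(Debug, PartialEq)]
struct CrossVersionObjectReference {
    api_version: Option<String>,
    kind: String,
    name: String,
}

/// Builds the reference an HPA's scaleTargetRef uses to point at a
/// StackableScaler instead of the StatefulSet.
fn scale_target_ref(scaler_name: &str) -> CrossVersionObjectReference {
    CrossVersionObjectReference {
        api_version: Some("autoscaling.stackable.tech/v1alpha1".to_string()),
        kind: "StackableScaler".to_string(),
        name: scaler_name.to_string(),
    }
}
```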

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Implement Shared AutoScaling Hook Functionality