feat(auto scaling): add StackableScaler CRD, state machine, ReplicasConfig, and scaling hooks#1181

Open
soenkeliebau wants to merge 9 commits into main from feat/autoscale

Conversation

@soenkeliebau (Member)

Summary

Introduces the StackableScaler CRD and supporting framework for HPA-integrated scaling
with operator-controlled lifecycle hooks. The HPA targets the StackableScaler's /scale
subresource instead of the StatefulSet directly, giving product operators the opportunity
to run pre/post-scale hooks (e.g. NiFi node offloading, Trino graceful shutdown) before
replica changes propagate.

fixes stackabletech/issues#667

Key additions:

  • StackableScaler CRD (autoscaling.stackable.tech/v1alpha1) with /scale subresource,
    5-stage state machine (Idle -> PreScaling -> Scaling -> PostScaling -> Idle, plus Failed),
    and annotation-based retry recovery
  • ReplicasConfig enum -- Fixed(u16), Hpa(HpaConfig), Auto(AutoConfig),
    ExternallyScaled -- with flexible deserialization (bare integer, string, or typed object)
    and validation
  • ScalingHooks trait with pre_scale(), post_scale(), and on_failure() lifecycle
    hooks, ScalingContext with direction-aware helpers (removed_ordinals(),
    added_ordinals()), and HookOutcome (Done/InProgress) for async hook execution
  • reconcile_scaler() -- state machine reconciler that drives stage transitions, calls
    hooks, handles failures, and returns ScalingResult with requeue action + ScalingCondition
    for cluster CR status propagation
  • build_scaler() / build_hpa_from_user_spec() -- builders for StackableScaler and HPA
    objects with deterministic naming, standard app.kubernetes.io labels, and owner references
  • initialize_scaler_status() -- seeds scaler status on first reconcile to prevent
    scale-to-zero (status defaults to replicas: 0 before initialization)
  • JobTracker -- idempotent Kubernetes Job lifecycle manager for hook execution with
    DNS-safe name generation and automatic cleanup
  • ClusterResource / DeepMerge impls for StackableScaler, enabling ClusterResources
    lifecycle management and orphan cleanup
  • Versioned macro extension -- #[versioned(k8s(scale(...)))] attribute support for
    generating CRDs with the /scale subresource
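The hook surface from the list above can be sketched roughly as follows. The names (`ScalingHooks`, `HookOutcome`, `removed_ordinals()`, `added_ordinals()`) come from the summary, but the concrete signatures are assumptions — the real trait presumably takes a Kubernetes client and is async:

```rust
/// Outcome of a single hook invocation. Hooks may span several reconcile
/// cycles (e.g. waiting for a NiFi offload Job), so `InProgress` signals
/// the reconciler to requeue until the work completes.
#[derive(Debug, PartialEq)]
pub enum HookOutcome {
    Done,
    InProgress,
}

/// Direction-aware view of a pending replica change.
pub struct ScalingContext {
    pub previous_replicas: i32,
    pub desired_replicas: i32,
}

impl ScalingContext {
    /// Ordinals of pods removed on scale-down (empty on scale-up);
    /// StatefulSets remove the highest ordinals first.
    pub fn removed_ordinals(&self) -> Vec<i32> {
        (self.desired_replicas..self.previous_replicas).collect()
    }

    /// Ordinals of pods added on scale-up (empty on scale-down).
    pub fn added_ordinals(&self) -> Vec<i32> {
        (self.previous_replicas..self.desired_replicas).collect()
    }
}

/// Lifecycle hooks a product operator implements to intercept scaling.
pub trait ScalingHooks {
    fn pre_scale(&self, ctx: &ScalingContext) -> HookOutcome;
    fn post_scale(&self, ctx: &ScalingContext) -> HookOutcome;
    fn on_failure(&self, ctx: &ScalingContext);
}
```

Scaling 5 → 3 would hand `pre_scale()` a context whose `removed_ordinals()` is `[3, 4]`, letting a product operator drain exactly those pods before the StatefulSet shrinks.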

Motivation

Kubernetes HPAs can target StatefulSets directly, but this bypasses the operator -- there is
no interception point for product-specific lifecycle tasks before pods are terminated. For
NiFi, this means flowfile data is not offloaded before node removal, causing data loss. For
Trino, active queries are killed mid-execution. The StackableScaler provides that
interception point as a generic, reusable framework.

Notable design choices

  • Spec contains only replicas: i32 -- identity is derived from owner references and
    app.kubernetes.io labels. This keeps the CRD minimal and avoids redundant fields that
    could drift from the cluster CR.
  • Failed is a terminal trap state -- the state machine cannot leave Failed without an
    explicit autoscaling.stackable.tech/retry annotation. This prevents infinite retry loops
    on persistent hook failures and requires human acknowledgment.
  • ReplicasConfig custom deserialization -- a single field accepts replicas: 3 (Fixed),
    replicas: externallyScaled, or replicas: { hpa: { maxReplicas: 10, ... } },
    maintaining backward compatibility with existing integer-based configs.
  • HpaConfig wraps HorizontalPodAutoscalerSpec -- users provide standard HPA fields
    (maxReplicas, minReplicas, metrics, behavior); scaleTargetRef is overwritten
    internally to point at the StackableScaler.
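The acceptance rules behind that flexible `replicas` field can be illustrated with a plain parser over the scalar forms. This is a sketch only — the PR implements it as a custom serde `Deserialize` impl, and the object form (`{ hpa: ... }`, `{ auto: ... }`) is handled there as well:

```rust
// Illustrative subset of ReplicasConfig; the Hpa(HpaConfig) and
// Auto(AutoConfig) variants are reached via the typed object form
// in the real Deserialize impl and are omitted here.
#[derive(Debug, PartialEq)]
enum ReplicasConfig {
    Fixed(u16),
    ExternallyScaled,
}

fn parse_replicas_scalar(raw: &str) -> Result<ReplicasConfig, String> {
    // Bare integer -> Fixed, keeping old `replicas: 3` configs valid.
    if let Ok(n) = raw.parse::<u16>() {
        return Ok(ReplicasConfig::Fixed(n));
    }
    match raw {
        // Bare keyword string -> marker variant with no payload.
        "externallyScaled" => Ok(ReplicasConfig::ExternallyScaled),
        other => Err(format!("invalid replicas value: {other}")),
    }
}
```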

Test plan

  • cargo test --all-features passes -- unit tests cover state machine transitions,
    ReplicasConfig deserialization/validation, builder output, job naming, hook direction
    derivation, and serialization round-trips
  • cargo clippy --all-targets --all-features -- -D warnings runs clean
  • cargo doc --no-deps --all-features produces no warnings
  • Verify StackableScaler CRD schema generates correctly with scale subresource paths
  • Integration test with commons-operator and nifi-operator PRs: HPA targets
    StackableScaler, state machine drives hooks through full scale-up/scale-down cycle

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Integration tests passed (for non-trivial changes)
  • Changes need to be "offline" compatible

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added

soenkeliebau and others added 9 commits March 11, 2026 09:38
Add the StackableScaler custom resource and a generic reconciler that
drives a multi-stage scaling state machine (Idle → PreScaling → Scaling →
PostScaling → Idle). Operators implement the ScalingHooks trait to plug
in product-specific logic (e.g. data offload before scale-down).

Key components:
- CRD types with serde/JsonSchema support for status subresource
- Hook trait with pre_scale, post_scale, and on_failure callbacks
- Reconciler that advances stages, patches status, and handles failures
- JobTracker for coordinating async hook operations
- ScalingContext helpers for direction detection and ordinal computation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…knowledge

The admission webhook in commons-operator was pattern-matching on
ScalerStage variants to determine whether a scaling operation blocks
HPA writes. This duplicated the "which stages are active" logic,
creating a maintenance risk: adding a new stage to the state machine
would require updating the webhook's match arm in a separate crate.

Move this knowledge into a single method on ScalerStage so both the
reconciler and the webhook can query it without enumerating variants.
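The centralization this commit describes might look like the sketch below. The stage names come from the PR; the method name and the exact set of "active" stages are assumptions:

```rust
/// Stages of the StackableScaler state machine, per the PR summary.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ScalerStage {
    Idle,
    PreScaling,
    Scaling,
    PostScaling,
    Failed,
}

impl ScalerStage {
    /// True while a scaling operation is in flight, i.e. while the
    /// admission webhook should reject concurrent HPA writes. Adding a
    /// new stage now only requires updating this one match.
    fn blocks_hpa_writes(self) -> bool {
        matches!(self, Self::PreScaling | Self::Scaling | Self::PostScaling)
    }
}
```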

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The design doc (ADR Decision 9) specified that Failed is a terminal trap
state with annotation-based recovery, but the implementation was missing.

When the operator sees `autoscaling.stackable.tech/retry: "true"` on a
StackableScaler in the Failed stage, it now:
1. Strips the annotation via merge patch
2. Resets status.currentState.stage to Idle
3. Clears desired_replicas and previous_replicas
4. Requeues so the next reconcile can start a fresh scaling attempt

Usage:
  kubectl annotate stackablescaler <name> autoscaling.stackable.tech/retry=true
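The recovery steps can be captured as pure logic, separate from the API calls. The annotation key comes from the commit message; the struct shapes and function name here are illustrative assumptions:

```rust
use std::collections::BTreeMap;

const RETRY_ANNOTATION: &str = "autoscaling.stackable.tech/retry";

#[derive(Debug, PartialEq)]
enum Stage {
    Idle,
    Failed,
}

/// Assumed shape of status.currentState for this sketch.
struct CurrentState {
    stage: Stage,
    desired_replicas: Option<i32>,
    previous_replicas: Option<i32>,
}

/// Returns true when a retry was requested and the state was reset;
/// the caller then strips the annotation via merge patch and requeues
/// so the next reconcile starts a fresh scaling attempt.
fn recover_if_requested(
    annotations: &BTreeMap<String, String>,
    state: &mut CurrentState,
) -> bool {
    let retry_requested =
        annotations.get(RETRY_ANNOTATION).map(String::as_str) == Some("true");
    if state.stage != Stage::Failed || !retry_requested {
        return false;
    }
    state.stage = Stage::Idle;
    state.desired_replicas = None;
    state.previous_replicas = None;
    true
}
```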

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces ReplicasConfig with Fixed, Hpa, Auto, and ExternallyScaled
variants to replace the simple `replicas: Option<u16>` on role groups.
Includes custom Deserialize impl (bare integer, string, tagged object),
validation via snafu, JsonSchema, and comprehensive tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove cluster_ref, role, and role_group from StackableScalerSpec since
identity is now conveyed via owner references and labels set by callers.
Also remove the UnknownClusterRef struct which was only used by the spec.

The reconcile_scaler() function now accepts role_group_name as an
explicit parameter instead of reading it from the spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Autoscaler

Enable StackableScaler and HPA to be managed through ClusterResources.add(),
providing label validation and orphan cleanup. Adds DeepMerge implementations
for StackableScaler and its status types, and registers both resource types
in delete_orphaned_resources().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ler objects

Provides a shared helper that product operators use to construct
StackableScaler resources with the required labels (name, instance,
managed-by, component, role-group) and owner reference, ensuring
ClusterResources.add() validation passes consistently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three public helpers to the scaler module:

- `scale_target_ref()`: builds a CrossVersionObjectReference pointing
  at a StackableScaler, for use as an HPA's scaleTargetRef.
- `build_hpa_from_user_spec()`: constructs a HorizontalPodAutoscaler
  from a user-provided spec, overwriting scaleTargetRef to target the
  correct StackableScaler and applying the standard 5-label set.
- `initialize_scaler_status()`: patches a freshly created scaler's
  status subresource with the current replica count and Idle stage,
  preventing the scale-to-zero edge case on first reconcile.

Also makes BuildScalerError's snafu context selectors pub(super) so
sibling modules can reuse them via .context().
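A minimal sketch of `scale_target_ref()`, assuming the CRD group/version from this PR; the local struct stands in for k8s-openapi's autoscaling/v2 `CrossVersionObjectReference`, and the example scaler name is hypothetical:

```rust
/// Stand-in for k8s-openapi's CrossVersionObjectReference.
#[derive(Debug, PartialEq)]
struct CrossVersionObjectReference {
    api_version: Option<String>,
    kind: String,
    name: String,
}

/// Builds the reference an HPA's scaleTargetRef uses to point at a
/// StackableScaler instead of the StatefulSet.
fn scale_target_ref(scaler_name: &str) -> CrossVersionObjectReference {
    CrossVersionObjectReference {
        api_version: Some("autoscaling.stackable.tech/v1alpha1".to_string()),
        kind: "StackableScaler".to_string(),
        name: scaler_name.to_string(),
    }
}
```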

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Implement Shared AutoScaling Hook Functionality