fix: populate Status.Selector in CacheEngine for worker pod discovery#6064

Open
adityaupasani2 wants to merge 1 commit into
fluid-cloudnative:masterfrom
adityaupasani2:fix/cache-engine-selector

@adityaupasani2

Contributor

Ⅰ. Describe what this PR does

CacheEngine never populated Status.Selector on the CacheRuntime resource. Every other runtime (JindoCache, JuiceFS, Vineyard, EFC) calls getWorkerSelectors() during master setup to set this field. Without it, CacheRuntime.Status.Selector is always empty, breaking HPA and any tooling that reads this field to discover worker pods.

Two TODOs marked this gap:

  • pkg/ddc/cache/engine/master.go: // TODO(cache runtime): figure out how to use this selector
  • pkg/ddc/cache/engine/status.go: // TODO(cache runtime): set the CacheRuntime Status left fields: Selector

This PR resolves both.

Ⅱ. Does this pull request fix one issue?

Fixes #6063

Ⅲ. List the added test cases

Added 4 tests for getWorkerSelectors() in util_test.go:

  • Returns a non-empty selector string
  • Selector contains the runtime name label and value
  • Selector contains the worker component name label and value
  • Different runtime names produce different selectors

Ⅳ. Describe how to verify it

After creating a CacheRuntime, check that status.selector is populated with a valid label selector string matching the worker pods.

Ⅴ. Special notes for reviews

getWorkerSelectors() uses common.LabelCacheRuntimeName and common.LabelCacheRuntimeComponentName — the same labels set by getCommonLabelsFromComponent() in the component manager, which are used as the StatefulSet selector. This matches the pattern used by JindoCache, JuiceFS, Vineyard and EFC.
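As a rough illustration of the selector string this produces, the sketch below builds an equality-based selector from two labels using only the standard library. The label keys and the worker component name are hypothetical placeholders (the real keys are the values of common.LabelCacheRuntimeName and common.LabelCacheRuntimeComponentName, which are not shown in this PR), and equalitySelector only mimics the comma-separated, key-sorted key=value form that, to my understanding, an apimachinery selector built from plain MatchLabels renders to. It is not the PR's implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Hypothetical placeholder label keys (assumption). The real keys are the
// values of common.LabelCacheRuntimeName and
// common.LabelCacheRuntimeComponentName in the Fluid code base.
const (
	labelRuntimeName   = "example.io/runtime-name"
	labelComponentName = "example.io/component-name"
)

// equalitySelector renders a label set as comma-separated key=value pairs,
// sorted by key so the output is deterministic.
func equalitySelector(set map[string]string) string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+set[k])
	}
	return strings.Join(parts, ",")
}

// workerSelector is a stand-in for getWorkerSelectors(): it selects pods that
// carry both the runtime name label and the worker component label.
func workerSelector(runtimeName, workerComponent string) string {
	return equalitySelector(map[string]string{
		labelRuntimeName:   runtimeName,
		labelComponentName: workerComponent,
	})
}

func main() {
	// "runtime-a-worker" is a made-up component name for illustration.
	fmt.Println(workerSelector("runtime-a", "runtime-a-worker"))
}
```

Because both labels are also the StatefulSet selector labels, a string of this shape should match exactly the worker pods of one runtime and nothing else.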

CacheEngine never set Status.Selector on the CacheRuntime resource,
unlike every other runtime (JindoCache, JuiceFS, Vineyard, EFC) which
all call getWorkerSelectors() during master setup. Without it, HPA and
any tooling that reads CacheRuntime.Status.Selector to discover worker
pods sees an empty string.

- Add getWorkerSelectors() to pkg/ddc/cache/engine/util.go, building
  the selector from LabelCacheRuntimeName and LabelCacheRuntimeComponentName
  labels — matching the convention used by other runtimes
- Call it in setupMasterInternal() (master.go) to set Status.Selector
  when the master is first initialized, removing the TODO comment
- Also set it in CheckAndUpdateRuntimeStatus() (status.go) on every
  reconcile so it stays consistent, removing the second TODO comment
- Add tests for getWorkerSelectors() covering non-empty output,
  correct label content, and selector uniqueness per runtime name

Fixes fluid-cloudnative#6063

Signed-off-by: Aditya Upasani <adityaupasani29@gmail.com>
@fluid-e2e-bot

fluid-e2e-bot Bot commented Jun 25, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yangyuliufeng for approval by writing /assign @yangyuliufeng in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fluid-e2e-bot

fluid-e2e-bot Bot commented Jun 25, 2026

Hi @adityaupasani2. Thanks for your PR.

I'm waiting for a fluid-cloudnative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request implements the getWorkerSelectors helper method to populate the Status.Selector field for CacheRuntime worker pods, enabling discovery by HPA and other tooling. Unit tests have also been added to verify this functionality. The review feedback suggests simplifying the implementation of getWorkerSelectors by using labels.SelectorFromSet from k8s.io/apimachinery/pkg/labels instead of metav1.LabelSelectorAsSelector, which eliminates unnecessary error handling and simplifies the imports.


"github.com/fluid-cloudnative/fluid/pkg/common"
"github.com/fluid-cloudnative/fluid/pkg/utils"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

Contributor

medium

Since we can construct the label selector directly using labels.SelectorFromSet, we don't need to import metav1 here. We can import k8s.io/apimachinery/pkg/labels instead.

Suggested change
- metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+ "k8s.io/apimachinery/pkg/labels"

Comment on lines +140 to +154
labels := map[string]string{
    common.LabelCacheRuntimeName:          e.name,
    common.LabelCacheRuntimeComponentName: workerName,
}
labelSelector := &metav1.LabelSelector{
    MatchLabels: labels,
}
selectorValue := ""
selector, err := metav1.LabelSelectorAsSelector(labelSelector)
if err != nil {
    e.Log.Error(err, "Failed to parse the labelSelector of the runtime", "labels", labels)
} else {
    selectorValue = selector.String()
}
return selectorValue

Contributor

medium

Instead of manually constructing a metav1.LabelSelector and converting it via metav1.LabelSelectorAsSelector (which requires error handling), you can directly use labels.SelectorFromSet to build the selector string. This is cleaner, more idiomatic, and avoids unnecessary error handling since SelectorFromSet is guaranteed to succeed for a static map of valid labels.

Suggested change
- labels := map[string]string{
-     common.LabelCacheRuntimeName:          e.name,
-     common.LabelCacheRuntimeComponentName: workerName,
- }
- labelSelector := &metav1.LabelSelector{
-     MatchLabels: labels,
- }
- selectorValue := ""
- selector, err := metav1.LabelSelectorAsSelector(labelSelector)
- if err != nil {
-     e.Log.Error(err, "Failed to parse the labelSelector of the runtime", "labels", labels)
- } else {
-     selectorValue = selector.String()
- }
- return selectorValue
+ return labels.SelectorFromSet(labels.Set{
+     common.LabelCacheRuntimeName:          e.name,
+     common.LabelCacheRuntimeComponentName: workerName,
+ }).String()

@codecov

codecov Bot commented Jun 25, 2026

Codecov Report

❌ Patch coverage is 84.21053% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.78%. Comparing base (36f0467) to head (547bdfd).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/ddc/cache/engine/util.go | 82.35% | 1 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6064   +/-   ##
=======================================
  Coverage   64.77%   64.78%           
=======================================
  Files         484      484           
  Lines       33892    33909   +17     
=======================================
+ Hits        21954    21968   +14     
- Misses      10215    10216    +1     
- Partials     1723     1725    +2     


- // TODO(cache runtime): figure out how to use this selector
- // runtimeToUpdate.Status.Selector = e.getWorkerSelectors()
+ runtimeToUpdate.Status.Selector = e.getWorkerSelectors()

Collaborator

Heads-up, out of scope for this PR but worth flagging in #6063: populating Status.Selector here is correct and useful for any tooling that reads it directly, but HPA targeting CacheRuntime/scale will still not work until the scale subresource is declared. Unlike AlluxioRuntime, JuiceFSRuntime, and VineyardRuntime, the CacheRuntime type currently only declares +kubebuilder:subresource:status — there is no +kubebuilder:subresource:scale:...selectorpath=.status.selector annotation, and the generated CRD (config/crd/bases/data.fluid.io_cacheruntimes.yaml) reflects that (only subresources: status: {}). A follow-up PR adding the scale subresource will likely be needed to fully close the linked issue.
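For readers unfamiliar with the marker mentioned above, here is a minimal sketch of how a scale subresource is typically declared with kubebuilder markers. ExampleRuntime, its fields, and the specpath and statuspath values are hypothetical and chosen only for illustration; the thread does not say which paths CacheRuntime would use. Only selectorpath=.status.selector is taken from the comment.

```go
package main

import "fmt"

// ExampleRuntimeSpec is a hypothetical spec (assumption, not a Fluid type).
type ExampleRuntimeSpec struct {
	// Replicas is the desired worker count that a scale request would write.
	Replicas int32 `json:"replicas,omitempty"`
}

// ExampleRuntimeStatus is a hypothetical status (assumption, not a Fluid type).
type ExampleRuntimeStatus struct {
	// Replicas is the observed worker count reported through /scale.
	Replicas int32 `json:"replicas,omitempty"`
	// Selector is the serialized label selector an autoscaler uses to find
	// the pods whose metrics it should aggregate.
	Selector string `json:"selector,omitempty"`
}

// ExampleRuntime shows the markers controller-gen reads when generating a CRD.
// The specpath and statuspath values below are illustrative assumptions.
//
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:subresource:scale:specpath=.spec.replicas,statuspath=.status.replicas,selectorpath=.status.selector
type ExampleRuntime struct {
	Spec   ExampleRuntimeSpec   `json:"spec,omitempty"`
	Status ExampleRuntimeStatus `json:"status,omitempty"`
}

func main() {
	rt := ExampleRuntime{
		Spec:   ExampleRuntimeSpec{Replicas: 3},
		Status: ExampleRuntimeStatus{Replicas: 3, Selector: "app=example"},
	}
	// With the scale marker in place, these are the three fields the
	// generated /scale endpoint would expose.
	fmt.Println(rt.Spec.Replicas, rt.Status.Replicas, rt.Status.Selector)
}
```

Without the scale marker, the generated CRD exposes only the status subresource, so populating the selector field alone gives an autoscaler nothing to target.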

    e.Log.Error(err, "Failed to parse the labelSelector of the runtime", "labels", labels)
} else {
    selectorValue = selector.String()
}

Collaborator

Minor: the gemini-code-assist bot suggested using labels.SelectorFromSet(labels) from k8s.io/apimachinery/pkg/labels here, which would drop the metav1.LabelSelectorAsSelector error branch entirely (SelectorFromSet cannot fail for plain MatchLabels). Not a blocker — the current form is consistent with the existing pattern in pkg/ddc/jindocache/worker.go, so keeping symmetry across runtimes is also a valid choice. Up to you.

engine1 := &CacheEngine{name: "runtime-a", namespace: "default", Log: fake.NullLogger()}
engine2 := &CacheEngine{name: "runtime-b", namespace: "default", Log: fake.NullLogger()}
Expect(engine1.getWorkerSelectors()).NotTo(Equal(engine2.getWorkerSelectors()))
})

Collaborator

Consider adding one assertion that pins the exact selector string (e.g. Expect(selector).To(Equal(...)) with the expected sorted form) so any future change to selector composition — label keys, ordering, escaping — is caught by tests. Right now the assertions only check substrings, so an unrelated label silently added in the future would not fail this test.
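The difference between the two styles of assertion can be sketched with the standard library alone. The label keys and values below are hypothetical placeholders, and the plain comparisons stand in for the Gomega matchers the real test uses. The point is only that a substring check still passes after an extra label appears, while an exact comparison does not.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical selector strings (assumption): pinnedSelector is the form a
// test would pin; driftedSelector simulates an extra label being added later.
const (
	pinnedSelector  = "example.io/component-name=runtime-a-worker,example.io/runtime-name=runtime-a"
	driftedSelector = pinnedSelector + ",example.io/extra=unexpected"
)

// containsAll reports whether selector contains every fragment, which is the
// kind of substring-only check the current tests perform.
func containsAll(selector string, fragments ...string) bool {
	for _, f := range fragments {
		if !strings.Contains(selector, f) {
			return false
		}
	}
	return true
}

func main() {
	fragments := []string{
		"example.io/runtime-name=runtime-a",
		"example.io/component-name=runtime-a-worker",
	}

	// The substring check cannot tell the two selectors apart.
	fmt.Println("substring check on drifted:", containsAll(driftedSelector, fragments...))

	// The exact comparison catches the drift.
	fmt.Println("exact match on drifted:", driftedSelector == pinnedSelector)
}
```

In the real test this would likely amount to one additional Expect(...).To(Equal(...)) against the full expected string, alongside the existing substring assertions.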

Development

Successfully merging this pull request may close these issues.

[BUG] CacheEngine does not populate Status.Selector, breaking HPA and worker pod discovery

2 participants