fix: proxy-webhook selector matches operator pods by bowling233 · Pull Request #3228 · tektoncd/operator

bowling233 · 2026-02-19T10:19:56Z

Changes

Both the tekton-operator and tekton-operator-proxy-webhook Deployments
label their Pods with name: tekton-operator. The
tekton-operator-proxy-webhook Service uses this same label as its only
selector, so it inadvertently load-balances traffic across both Deployments.
Because tekton-operator pods do not serve on port 8443, ~50% of admission
webhook requests fail with connection refused. Since the
MutatingWebhookConfiguration has failurePolicy: Fail, each failure
immediately rejects TaskRun Pod creation.
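
The overlap described above can be reproduced with a minimal sketch of equality-based label matching. This is an illustration under stated assumptions, not the operator's or Kubernetes' actual code: the pod names (`operator-pod`, `webhook-pod`) and the helper names `selects` and `endpoints` are hypothetical, and only the label keys and values come from this PR description.

```python
def selects(selector: dict, labels: dict) -> bool:
    """Equality-based match: every selector key/value must appear in the pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints(selector: dict, pods: dict) -> list:
    """Return the sorted names of the pods a selector would pick up."""
    return sorted(name for name, labels in pods.items() if selects(selector, labels))


# Hypothetical pods. Before the fix, both Deployments stamp the same `name` label.
pods_before = {
    "operator-pod": {"app": "tekton-operator", "name": "tekton-operator"},
    "webhook-pod": {"app": "tekton-operator", "name": "tekton-operator"},
}
# After the fix, only the proxy-webhook pod template label is renamed.
pods_after = {
    "operator-pod": {"app": "tekton-operator", "name": "tekton-operator"},
    "webhook-pod": {"app": "tekton-operator", "name": "tekton-operator-proxy-webhook"},
}

before = endpoints({"name": "tekton-operator"}, pods_before)
after = endpoints({"name": "tekton-operator-proxy-webhook"}, pods_after)

print(before)  # ['operator-pod', 'webhook-pod']
print(after)   # ['webhook-pod']
```

With one replica of each Deployment, the pre-fix selector yields two backends, only one of which listens on the webhook port, which is consistent with the ~50% failure rate reported here.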

Changes:

- cmd/kubernetes/operator/kodata/webhook/webhook.yaml: rename the
  proxy-webhook Deployment's matchLabels selector and pod template label
  from name: tekton-operator to name: tekton-operator-proxy-webhook;
  update the Service selector to match.
- cmd/openshift/operator/kodata/webhook/webhook.yaml: same change for the
  OpenShift manifest.

The existing app: tekton-operator label is preserved on both Deployments.
No other resources are affected.

Alternative considered: adding a set-based (NotIn) expression to the
Service selector to exclude tekton-operator pods. This was not viable
because a Kubernetes Service selector is a plain key/value label map that
supports only equality-based matching, not set-based expressions.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

Run make test lint before submitting a PR
Includes tests (if functionality changed/added)
Includes docs (if user facing)
Commit messages follow commit message best practices

See the contribution guide for more details.

Note on tests: This bug only manifests at the Service routing layer
(i.e., ~50% of requests land on a pod with no server). There is no
in-tree unit or integration test that exercises which pods a Service
selects. A targeted e2e test verifying that the proxy-webhook Service
endpoints do not include tekton-operator pods would be a good addition,
but is left for a follow-up.
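
As a rough illustration of what such a follow-up check might assert, the sketch below flags pods that a selector matches but that do not expose the Service's target port. It is a pure-Python model, not a test against a real cluster: the function names, the pod records, and the operator pod's port `9090` are hypothetical, while port 8443 and the label values are taken from this PR.

```python
def selects(selector: dict, labels: dict) -> bool:
    """Equality-based match: every selector key/value must appear in the pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def misrouted_pods(selector: dict, target_port: int, pods: list) -> list:
    """Names of pods the selector matches that do not expose target_port."""
    return sorted(
        pod["name"]
        for pod in pods
        if selects(selector, pod["labels"]) and target_port not in pod["ports"]
    )


# Hypothetical pod records; 9090 is an assumed port, chosen only to differ from 8443.
pods_before = [
    {"name": "operator-pod", "labels": {"name": "tekton-operator"}, "ports": [9090]},
    {"name": "webhook-pod", "labels": {"name": "tekton-operator"}, "ports": [8443]},
]
pods_after = [
    {"name": "operator-pod", "labels": {"name": "tekton-operator"}, "ports": [9090]},
    {"name": "webhook-pod", "labels": {"name": "tekton-operator-proxy-webhook"}, "ports": [8443]},
]

bad_before = misrouted_pods({"name": "tekton-operator"}, 8443, pods_before)
bad_after = misrouted_pods({"name": "tekton-operator-proxy-webhook"}, 8443, pods_after)

print(bad_before)  # ['operator-pod']
print(bad_after)   # []
```

A real e2e test would read the Service's endpoints and the pods from the cluster instead of the in-memory records used here.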

Release Notes

Fix: the tekton-operator-proxy-webhook Service selector incorrectly matched
tekton-operator pods in addition to proxy-webhook pods, causing ~50% of
admission webhook requests to fail with "connection refused" and TaskRun Pod
creation to be rejected. Users on v0.78.1 can work around this until upgrading
by adding `pod-template-hash: <webhook-pod-hash>` to the Service selector.

Both the `tekton-operator` and `tekton-operator-proxy-webhook` Deployments label their Pods with `name: tekton-operator`. The `tekton-operator-proxy-webhook` Service uses this same label as its only selector, so it inadvertently load-balances traffic across both Deployments. Because `tekton-operator` pods do not serve on port 8443, ~50% of admission webhook requests fail:

failed calling webhook "proxy.operator.tekton.dev": Post ".../tekton-operator-proxy-webhook.../defaulting": dial tcp <ClusterIP>:443: connect: connection refused

Because MutatingWebhookConfiguration has `failurePolicy: Fail`, each such failure immediately rejects TaskRun Pod creation.

Rename the proxy-webhook Deployment's selector matchLabels and pod template label from `name: tekton-operator` to `name: tekton-operator-proxy-webhook`, and update the Service selector to match. The `app: tekton-operator` label is left unchanged. Applies to both Kubernetes and OpenShift manifests.

Adding a set-based (NotIn) expression to the Service selector instead was not viable as Kubernetes Services only support equality-based (matchLabels) selectors.

linux-foundation-easycla · 2026-02-19T10:20:01Z

The committers listed above are authorized under a signed CLA.

✅ login: bowling233 / name: Baolin Zhu (49cacf1)

tekton-robot · 2026-02-19T10:20:03Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign anithapriyanatarajan after the PR has been reviewed.
You can assign the PR to them by writing /assign @anithapriyanatarajan in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot

Pull request overview

Fixes a production routing bug where the tekton-operator-proxy-webhook Service selector unintentionally matched both proxy-webhook and main operator pods, causing intermittent admission webhook failures and rejected TaskRun Pod creation.

Changes:

Update the proxy-webhook Deployment selector + pod template label to use name: tekton-operator-proxy-webhook.
Update the proxy-webhook Service selector to match the new pod label (Kubernetes + OpenShift manifests).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
cmd/kubernetes/operator/kodata/webhook/webhook.yaml	Aligns proxy-webhook Deployment/Service selectors to target only proxy-webhook pods on Kubernetes.
cmd/openshift/operator/kodata/webhook/webhook.yaml	Same selector/label fix for the OpenShift manifest.


jkhelil · 2026-02-22T15:28:11Z

@bowling233, thanks for your PR.

Can you check what happens to existing clusters during an upgrade (install 0.78.1, then apply your change)?
Please describe the result and post proof that the upgrade works and is not broken.

anithapriyanatarajan · 2026-03-09T07:36:09Z

@bowling233 - Please address the review comment if you are still pursuing this PR. Thank you. 🙇‍♀️

bowling233 · 2026-03-09T14:01:05Z

Hi @anithapriyanatarajan,

So sorry for the late response! I won't have the bandwidth to properly validate these changes until next month.

I can confirm this approach has side effects. Specifically, the HorizontalPodAutoscaler is hitting FailedGetResourceMetric errors because it's incorrectly picking up the main operator pods, which lack the expected CPU requests in the tekton-operator-lifecycle container.

This PR definitely needs more refinement to handle the selector immutability and the HPA configuration. Should I move this to a Draft for now, or would you prefer I close this and resubmit once I've validated a full fix?

Copilot AI review requested due to automatic review settings February 19, 2026 10:19

tekton-robot added the release-note label (Denotes a PR that will be considered when it comes time to generate release notes.) Feb 19, 2026

tekton-robot requested review from mbpavan and pratap0007 February 19, 2026 10:20

tekton-robot added the size/S label (Denotes a PR that changes 10-29 lines, ignoring generated files.) Feb 19, 2026

Copilot started reviewing on behalf of bowling233 February 19, 2026 10:20

Copilot AI reviewed Feb 19, 2026

cmd/kubernetes/operator/kodata/webhook/webhook.yaml (resolved)

cmd/openshift/operator/kodata/webhook/webhook.yaml (resolved)

anithapriyanatarajan closed this Mar 5, 2026

anithapriyanatarajan reopened this Mar 5, 2026

bowling233 marked this pull request as draft March 10, 2026 14:34

tekton-robot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Mar 10, 2026

bowling233 wants to merge 1 commit into tektoncd:main from ZJUSCT:main
