Conversation

@omerap12
Member

@omerap12 omerap12 commented Nov 15, 2025

What type of PR is this?

/kind documentation

What this PR does / why we need it:

AEP for #8720

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

NONE

Signed-off-by: Omer Aplatony <[email protected]>
@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 15, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: omerap12
Once this PR has been reviewed and has the lgtm label, please assign gjtempleton for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. area/vertical-pod-autoscaler and removed do-not-merge/needs-area labels Nov 15, 2025
Signed-off-by: Omer Aplatony <[email protected]>
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-none Denotes a PR that doesn't merit a release note. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 15, 2025
@omerap12 omerap12 changed the title [WIP] In Place Only VPA AEP-8720: InPlace Update Mode Nov 15, 2025
@omerap12 omerap12 changed the title AEP-8720: InPlace Update Mode AEP-8818: InPlace Update Mode Nov 15, 2025
Signed-off-by: Omer Aplatony <[email protected]>
Signed-off-by: Omer Aplatony <[email protected]>
Signed-off-by: Omer Aplatony <[email protected]>
@omerap12
Member Author

/cc @adrianmoisey @maxcao13

@omerap12 omerap12 marked this pull request as ready for review November 16, 2025 11:55
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 16, 2025
Signed-off-by: Omer Aplatony <[email protected]>
@omerap12
Member Author

/kind api-review

@k8s-ci-robot
Contributor

@omerap12: The label(s) kind/api-review cannot be applied, because the repository doesn't have them.

In response to this:

/kind api-review

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@omerap12
Member Author

/label kind/api-review

@maxcao13
Member

maxcao13 commented Dec 5, 2025

I'll take a look at this next week @omerap12 sorry for the delay 🥲

Signed-off-by: Omer Aplatony <[email protected]>
Comment on lines 126 to 127
klog.V(4).InfoS("Can't in-place update pod, waiting for next loop", "pod", klog.KObj(pod))
return utils.InPlaceDeferred
Member

Minor nit, are these supposed to be indented the same level?

Member Author

Not sure, I'll try to run fmt soon.


### Behavior when Feature Gate is Disabled

- When `InPlace` feature gate is disabled and a VPA is configured with `UpdateMode: InPlace`, the updater will skip processing that VPA entirely (not fall back to eviction).
Member

Just want to check: it won't evict and it won't in-place update?

Also, what does the admission-controller do when the feature gate is disabled but a pod is set to InPlace?

Member Author

The admission controller will deny the request (ref).

Member Author

> Just want to check: it won't evict and it won't in-place update?

That’s what I assumed, because if someone wants to use in-place mode only, it likely means the workload can’t be evicted. In that case, I think the correct action is to do nothing.

Member

Well, what if someone does this:

  1. Upgrades to this version of VPA and enables the feature gate
  2. Uses the InPlace mode on a VPA
  3. Disables the feature gate
  4. Deletes a Pod from the VPA pointing at InPlace

Does the admission-controller:

  1. Set the resources as per the recommendation (as if the VPA was in "Initial" mode)
  2. Ignore the pod (as if the VPA was in "Off" mode)
  3. Something else..

Member Author

TBH I didn't test it, but it should be 1.

Member Author

Just checked, we set the resources as per the recommendation.

Member

Cool, that's worth clarifying here

Member Author

Done in: 13f1fa7
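
The outcome agreed in this thread can be summarized with a minimal Go sketch. It is an illustration only, not the VPA implementation: the `Behavior` type and `inPlaceModeBehavior` function are hypothetical names, and the values simply encode what was stated above (with the gate disabled the updater neither resizes nor evicts, while the admission-controller still sets resources from the recommendation).

```go
package main

import "fmt"

// Behavior is a hypothetical summary of what each VPA component does
// for a VPA configured with the InPlace update mode.
type Behavior struct {
	UpdaterResizes                 bool // updater attempts in-place resizes
	UpdaterEvicts                  bool // updater evicts pods
	AdmissionAppliesRecommendation bool // admission-controller sets resources on new pods
}

// inPlaceModeBehavior encodes the behavior described in the thread.
// Assumption: gateEnabled reflects the state of the `InPlace` feature gate.
func inPlaceModeBehavior(gateEnabled bool) Behavior {
	return Behavior{
		// With the gate disabled the updater skips the VPA entirely.
		UpdaterResizes: gateEnabled,
		// InPlace mode never falls back to eviction, gate or no gate.
		UpdaterEvicts: false,
		// Recommendations are applied at admission in both cases.
		AdmissionAppliesRecommendation: true,
	}
}

func main() {
	fmt.Printf("gate enabled:  %+v\n", inPlaceModeBehavior(true))
	fmt.Printf("gate disabled: %+v\n", inPlaceModeBehavior(false))
}
```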

- Apply recommendations during pod admission (like all other modes)
- Attempt in-place updates for running pods under the same conditions as `InPlaceOrRecreate`
- Never add pods to `podsForEviction` if in-place updates fail
- Continuously retry failed in-place update
Member

Should we have a backoff policy for retrying, or do we think linear retry is sufficient if we keep failing?

Member Author

I have another idea - let me know what you think.
Since the kubelet automatically retries deferred pods when resources become available, we could use that behavior to our advantage. Let's say we send an in-place update that sets a pod to x CPU and y memory.
In the next update loop, if the recommended values are still x CPU and y memory, we can skip sending a new update. We already know the kubelet got the first request and will retry it when it can.
So the updater only needs to check whether the requested resources have changed since the previous cycle. If they haven’t changed, we just move on.

The main drawback is that the updater now has to remember which recommendation was last applied for each pod, which means some extra memory use and more bookkeeping in the code.

Member

@maxcao13 maxcao13 Dec 10, 2025

I think what you are proposing makes sense, but I think the existing resize conditions will give us this information without the extra bookkeeping: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources#resize-status

I think this already exists in the VPA code: with Deferred we just keep waiting until the deferred timeout, at which point InPlaceOrRecreate falls back to eviction. Is our intention with InPlace to just let it sit in Deferred indefinitely (same with InProgress)? I think that's what we want; I just wanted to clarify here, and maybe clarify that in the AEP.

With Infeasible I'm not so sure. The KEP says the kubelet itself will never retry, which is where the VPA comes in to retry manually, and where I am asking whether we should back off these requests or not. But as an alpha implementation, I think it's fine to just retry indefinitely if we see Infeasible, until pre-production testing tells us it's better not to do that.

Member Author

> I think this already exists in the VPA code: with Deferred we just keep waiting until the deferred timeout, at which point InPlaceOrRecreate falls back to eviction. Is our intention with InPlace to just let it sit in Deferred indefinitely (same with InProgress)? I think that's what we want; I just wanted to clarify here, and maybe clarify that in the AEP.

Exactly, with Deferred we will just skip that pod and do nothing (the kubelet will do the hard work for us).

> With Infeasible I'm not so sure. The KEP says the kubelet itself will never retry, which is where the VPA comes in to retry manually, and where I am asking whether we should back off these requests or not. But as an alpha implementation, I think it's fine to just retry indefinitely if we see Infeasible, until pre-production testing tells us it's better not to do that.

Agree.

So to sum up:

  • Deferred - do nothing.
  • Infeasible - we retry with no backoff for alpha.

Member

Regarding both of these cases, what happens if the recommendation changes? (Omer already mentioned this earlier in the thread).

Should the updater check if recommendations != spec.resources, and if they aren't the same, resize again?

It's possible that the new recommendation could be smaller, allowing for the pod to be resized.

Member Author

I think we should, since we don’t know how long the pod will remain in deferred mode and we don’t want to miss recommendations.

Member

That makes sense +1
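
The per-pod decision this thread converged on could be sketched in Go roughly as follows. This is a simplified illustration under stated assumptions, not the updater's actual code: `ResizeStatus`, `Resources`, `Action`, and `decide` are hypothetical names, and resources are reduced to two integers so they can be compared directly.

```go
package main

import "fmt"

// ResizeStatus is a hypothetical stand-in for the pod resize states
// named in the thread (Deferred, InProgress, Infeasible).
type ResizeStatus string

const (
	ResizeNone       ResizeStatus = ""
	ResizeInProgress ResizeStatus = "InProgress"
	ResizeDeferred   ResizeStatus = "Deferred"
	ResizeInfeasible ResizeStatus = "Infeasible"
)

// Resources is a simplified, hypothetical view of a pod's requested resources.
type Resources struct {
	CPUMilli int64
	MemBytes int64
}

// Action is what the updater does for a pod in this loop. There is
// deliberately no eviction action: InPlace mode never evicts.
type Action string

const (
	ActionSkip   Action = "skip"   // do nothing this loop
	ActionResize Action = "resize" // send (or re-send) an in-place resize
)

// decide applies the rules summarized in the thread.
func decide(status ResizeStatus, spec, recommended Resources) Action {
	// If the recommendation no longer matches spec.resources, resize again,
	// even while a previous resize is Deferred or Infeasible.
	if spec != recommended {
		return ActionResize
	}
	// Infeasible: the kubelet will not retry by itself, so retry
	// with no backoff (the alpha behavior agreed above).
	if status == ResizeInfeasible {
		return ActionResize
	}
	// Deferred or InProgress: the kubelet handles it; nothing to do.
	return ActionSkip
}

func main() {
	same := Resources{CPUMilli: 500, MemBytes: 1 << 30}
	smaller := Resources{CPUMilli: 250, MemBytes: 1 << 29}
	fmt.Println(decide(ResizeDeferred, same, same))
	fmt.Println(decide(ResizeDeferred, same, smaller))
	fmt.Println(decide(ResizeInfeasible, same, same))
}
```

Comparing against `spec.resources` on every loop avoids the extra bookkeeping of remembering the last applied recommendation per pod.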

- Allow VPA to eventually apply updates when cluster conditions improve
- Respect the existing in-place update infrastructure from AEP-4016

## Non-Goals
Member

Should we add a small note that this update mode is subject to the behavior of the `InPlacePodVerticalScaling` feature gate, such that it's possible (but improbable) that a resize causes an OOMkill during a memory limit downsize?

I don't actually know the probability of this happening when a limit gets resized close to the usage, but I think it is useful to call out since we emphasize that brief disruptions are unacceptable.

I think to mitigate risk here we may want to recommend that if you absolutely cannot tolerate disruption (i.e. unintended OOMkill), then you can either:

  1. disallow memory limits for your no disruption container
  2. if you must allow VPA to set memory limits, then you should configure the VPA to generate more generous/conservative memory limit recommendations as a safety buffer.

^ Though this may or may not be better for our docs, instead of getting into it in the AEP here.

Thoughts? cc @adrianmoisey

Member

I think you're right.
I had a similar thought about the "Provide a truly non-disruptive VPA update mode that never evicts pods" goal.

I think it may be worth softening the language in the AEP (since we can't make guarantees that resizes are non-disruptive)

I also agree that most of what you suggested may be good for the docs

Member

Related: #8805

Member Author

Yeah, that sounds very reasonable. I think we can have this in both our docs and the AEP.
lemme know what you think of this: ba9514a

Member

Looks good to me, thanks for that 👍

Comment on lines +243 to +245
## Implementation History

- 2025-15-11: initial version
Member

I know we didn't write anything down for AEP-4016 in terms of graduation criteria, but since we went through the process of graduating that one from alpha to beta, I'm wondering if we should have some sort of idea for this one?

I don't know if we should have some formal process, but judging from the last graduation, I think it makes sense to say we would keep it in alpha for one release cycle to allow early adoption, and if no graduation bugs/blockers come up in the issues, then we are okay to graduate to beta.

Member Author

Added Graduation Criteria section.

)
```

Modify the `CanInPlaceUpdate` to accomdate the new update mode:
Member

nit:
verify check is complaining about this:

vertical-pod-autoscaler/enhancements/8818-in-place-only/README.md:80:33: "accomdate" is a misspelling of "accommodate"

Member Author

Thanks! fixed :)

Signed-off-by: Omer Aplatony <[email protected]>
Signed-off-by: Omer Aplatony <[email protected]>
@maxcao13
Member

maxcao13 commented Dec 11, 2025

/lgtm

Thanks for writing this up, this is great :-)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 11, 2025
Signed-off-by: Omer Aplatony <[email protected]>
@k8s-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 11, 2025
@omerap12
Member Author

@maxcao13 , @adrianmoisey , @iamzili
Updated the AEP based on our talk. If it looks good to you, I want to ping the sig-node folks for review as well.

@omerap12
Member Author

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Dec 11, 2025

Disabling of feature gate `InPlace` will cause the following to happen:
- admission-controller will:
- Reject new VPA objects being created with `InPlace` configured
Member

Just want to clarify, it will reject new VPAs with InPlace, existing VPAs with InPlace can still be modified, right?
(that's how k/k handles this)

Member Author

Yes - I need to double check on that but yes, just like InPlaceOrRecreate
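
A rough Go sketch of the create-versus-update distinction raised here is shown below. It is an assumption-laden illustration, since the author still needs to double check the real behavior: `validateUpdateMode` and `errGateDisabled` are hypothetical names, and the rule that an existing `InPlace` VPA stays modifiable follows the k/k convention mentioned above rather than confirmed VPA code.

```go
package main

import (
	"errors"
	"fmt"
)

// modeInPlace is the update mode value under discussion.
const modeInPlace = "InPlace"

// errGateDisabled is a hypothetical validation error.
var errGateDisabled = errors.New("update mode InPlace requires the InPlace feature gate")

// validateUpdateMode sketches admission validation for a VPA object.
// oldMode is nil on create and points to the stored mode on update.
func validateUpdateMode(oldMode *string, newMode string, gateEnabled bool) error {
	// Nothing to check when the gate is on or the mode is not InPlace.
	if gateEnabled || newMode != modeInPlace {
		return nil
	}
	// Assumption: a VPA that already uses InPlace can still be modified
	// after the gate is disabled, so existing objects are not stranded.
	if oldMode != nil && *oldMode == modeInPlace {
		return nil
	}
	// New use of InPlace while the gate is disabled is rejected.
	return errGateDisabled
}

func main() {
	existing := modeInPlace
	fmt.Println("create, gate off:", validateUpdateMode(nil, modeInPlace, false))
	fmt.Println("update, gate off:", validateUpdateMode(&existing, modeInPlace, false))
	fmt.Println("create, gate on: ", validateUpdateMode(nil, modeInPlace, true))
}
```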

@adrianmoisey
Member

Generally speaking I think this is good.
I think it's safe to ping sig-node and api-review on this if you want

@omerap12
Member Author

/label api-review

Labels

api-review Categorizes an issue or PR as actively needing an API review. area/vertical-pod-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

5 participants