fix(runtimes): set defaultMode on MPI SSH Secret volume#3649
Conversation
|
🎉 Welcome to the Kubeflow Trainer! 🎉 Thanks for opening your first PR! We're happy to have you as part of our community 🚀 Here's what happens next:
Join the community:
Feel free to ask questions in the comments if you need any help or clarification! |
There was a problem hiding this comment.
Pull request overview
This PR hardens the MPI plugin’s SSH Secret volume permissions by explicitly setting defaultMode and per-item mode values so the mounted SSH private key is no longer world-readable, aligning the runtime behavior with least-privilege expectations for in-pod SSH auth material.
Changes:
- Set
defaultModeon the MPI SSH auth Secret volume and explicitmodeon each mounted Secret item (private key, public key, authorized_keys). - Centralize the file mode values as constants for reuse and consistency.
- Update unit, framework, runtime, and integration tests to assert the new volume permissions.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
pkg/runtime/framework/plugins/mpi/mpi.go |
Sets Secret volume DefaultMode and per-item Mode for SSH key material in the MPI plugin. |
pkg/constants/constants.go |
Introduces constants for the SSH Secret volume default mode and per-file modes. |
pkg/runtime/framework/plugins/mpi/mpi_test.go |
Updates MPI plugin unit tests to expect the new defaultMode and item mode fields. |
pkg/runtime/framework/core/framework_test.go |
Updates framework component-builder expectations for the SSH Secret volume permissions. |
pkg/runtime/core/trainingruntime_test.go |
Updates runtime construction tests to include the new Secret volume modes. |
test/integration/controller/trainjob_controller_test.go |
Updates integration assertions for rendered volumes to include DefaultMode and per-item Mode. |
3a49f53 to
cf032c7
Compare
Set per-item file modes on the MPI SSH auth Secret volume to prevent the private key from being world-readable inside training pods. Private key uses 0o640 (not 0o600) because the builtin deepspeed-distributed and mlx-distributed runtimes run as runAsUser: 1000 while Secret volumes mount as root:root. A volume-level DefaultMode of 0o400 is added as defense-in-depth for any future items added without an explicit per-item mode. Fixes kubeflow#3262 Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Dmytro Hryshchenko <dhryshch@redhat.com>
cf032c7 to
be0c6d9
Compare
|
/ok-to-test |
andreyvelich
left a comment
There was a problem hiding this comment.
Thanks for picking this up and addressing the prior feedback, @h0pers! This review was done using AI tooling. One small non-blocking nit inline.
|
/assign @robert-bell @astefanutti @abhijeet-dhumal @tenzen-y |
|
@andreyvelich: GitHub didn't allow me to assign the following users: robert-bell. Note that only kubeflow members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
robert-bell
left a comment
There was a problem hiding this comment.
Thanks @h0pers this looks great!
/lgtm
|
@robert-bell: changing LGTM is restricted to collaborators DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> Signed-off-by: Dmytro Hryshchenko <dmytrohryshchenkowork@gmail.com>
|
Thanks @h0pers ! 🙌 |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andreyvelich The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
Sets
defaultModeand per-itemmodeon the MPI SSH Secret volume to restrict private key permissions. Without this fix, Kubernetes defaults all mounted Secret files to0644(world-readable), making the SSH private key accessible to any user in the container.Supersedes #3368 and addresses its unresolved review feedback.
File modes:
id_rsa):0o640(owner rw, group r) — needs group-read because builtin runtimes (deepspeed, mlx) run asrunAsUser: 1000while Secret volumes mount asroot:rootauthorized_keys:0o644(owner rw, world r) — public data, unchanged from defaultdefaultMode:0o400(owner read-only) — defense-in-depth fallback for any future items added without an explicit per-item modeWhich issue(s) this PR fixes:
Fixes #3262
Checklist: