Skip to content

fix(runtimes): set defaultMode on MPI SSH Secret volume#3649

Merged
google-oss-prow[bot] merged 2 commits into
kubeflow:masterfrom
h0pers:feat/mpi-ssh-secret-default-mode
Jun 24, 2026
Merged

fix(runtimes): set defaultMode on MPI SSH Secret volume#3649
google-oss-prow[bot] merged 2 commits into
kubeflow:masterfrom
h0pers:feat/mpi-ssh-secret-default-mode

Conversation

@h0pers

@h0pers h0pers commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

What this PR does / why we need it:

Sets defaultMode and per-item mode on the MPI SSH Secret volume to restrict private key permissions. Without this fix, Kubernetes defaults all mounted Secret files to 0644 (world-readable), making the SSH private key accessible to any user in the container.

Supersedes #3368 and addresses its unresolved review feedback.

File modes:

  • Private key (id_rsa): 0o640 (owner rw, group r) — needs group-read because builtin runtimes (deepspeed, mlx) run as runAsUser: 1000 while Secret volumes mount as root:root
  • Public key and authorized_keys: 0o644 (owner rw, world r) — public data, unchanged from default
  • Volume defaultMode: 0o400 (owner read-only) — defense-in-depth fallback for any future items added without an explicit per-item mode

Which issue(s) this PR fixes:
Fixes #3262

Checklist:

  • Docs included if any changes are user facing

@github-actions

Copy link
Copy Markdown

🎉 Welcome to the Kubeflow Trainer! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

  • If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards.
  • Our team will review your PR soon! cc @kubeflow/kubeflow-trainer-team

Join the community:

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

@h0pers h0pers marked this pull request as ready for review June 23, 2026 15:55
Copilot AI review requested due to automatic review settings June 23, 2026 15:55
@google-oss-prow google-oss-prow Bot requested a review from kuizhiqing June 23, 2026 15:55

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the MPI plugin’s SSH Secret volume permissions by explicitly setting defaultMode and per-item mode values so the mounted SSH private key is no longer world-readable, aligning the runtime behavior with least-privilege expectations for in-pod SSH auth material.

Changes:

  • Set defaultMode on the MPI SSH auth Secret volume and explicit mode on each mounted Secret item (private key, public key, authorized_keys).
  • Centralize the file mode values as constants for reuse and consistency.
  • Update unit, framework, runtime, and integration tests to assert the new volume permissions.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
pkg/runtime/framework/plugins/mpi/mpi.go Sets Secret volume DefaultMode and per-item Mode for SSH key material in the MPI plugin.
pkg/constants/constants.go Introduces constants for the SSH Secret volume default mode and per-file modes.
pkg/runtime/framework/plugins/mpi/mpi_test.go Updates MPI plugin unit tests to expect the new defaultMode and item mode fields.
pkg/runtime/framework/core/framework_test.go Updates framework component-builder expectations for the SSH Secret volume permissions.
pkg/runtime/core/trainingruntime_test.go Updates runtime construction tests to include the new Secret volume modes.
test/integration/controller/trainjob_controller_test.go Updates integration assertions for rendered volumes to include DefaultMode and per-item Mode.

@h0pers h0pers force-pushed the feat/mpi-ssh-secret-default-mode branch from 3a49f53 to cf032c7 Compare June 23, 2026 16:01
Set per-item file modes on the MPI SSH auth Secret volume to prevent
the private key from being world-readable inside training pods.

Private key uses 0o640 (not 0o600) because the builtin
deepspeed-distributed and mlx-distributed runtimes run as
runAsUser: 1000 while Secret volumes mount as root:root.

A volume-level DefaultMode of 0o400 is added as defense-in-depth
for any future items added without an explicit per-item mode.

Fixes kubeflow#3262

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Dmytro Hryshchenko <dhryshch@redhat.com>
@h0pers h0pers force-pushed the feat/mpi-ssh-secret-default-mode branch from cf032c7 to be0c6d9 Compare June 23, 2026 16:07
@andreyvelich

Copy link
Copy Markdown
Member

/ok-to-test
/retest

@andreyvelich andreyvelich left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up and addressing the prior feedback, @h0pers! This review was done using AI tooling. One small non-blocking nit inline.

Comment thread pkg/constants/constants.go Outdated
@andreyvelich

Copy link
Copy Markdown
Member

/assign @robert-bell @astefanutti @abhijeet-dhumal @tenzen-y
/lgtm

@google-oss-prow

Copy link
Copy Markdown

@andreyvelich: GitHub didn't allow me to assign the following users: robert-bell.

Note that only kubeflow members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

Details

In response to this:

/assign @robert-bell @astefanutti @abhijeet-dhumal @tenzen-y
/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@robert-bell robert-bell left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @h0pers this looks great!

/lgtm

@google-oss-prow

Copy link
Copy Markdown

@robert-bell: changing LGTM is restricted to collaborators

Details

In response to this:

Thanks @h0pers this looks great!

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Signed-off-by: Dmytro Hryshchenko <dmytrohryshchenkowork@gmail.com>
@google-oss-prow google-oss-prow Bot removed the lgtm label Jun 24, 2026
@abhijeet-dhumal

Copy link
Copy Markdown
Member

Thanks @h0pers ! 🙌
/lgtm

@google-oss-prow google-oss-prow Bot added the lgtm label Jun 24, 2026

@andreyvelich andreyvelich left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@google-oss-prow

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow Bot merged commit 129b3fa into kubeflow:master Jun 24, 2026
28 checks passed
@google-oss-prow google-oss-prow Bot added this to the v2.3 milestone Jun 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Set defaultMode on MPI SSH Secret volume to restrict private key permissions

7 participants