feat(pod): unprivileged allow_fuse() via FUSE device plugin + volume examples#1205

Merged: EngHabu merged 5 commits into main from haytham/allow-fuse-device-plugin on Jun 13, 2026

@EngHabu EngHabu commented Jun 13, 2026

Contributor

What

Two things, together:

  1. PodTemplate.allow_fuse() now grants unprivileged FUSE via a device plugin by default: a smarter-devices/fuse resource request + CAP_SYS_ADMIN, no privileged container, no /dev/fuse hostPath. allow_fuse(privileged=True) is kept as a legacy escape hatch (hostPath + privileged) for clusters without a device plugin. The two paths are cleanly separated (_apply_fuse_device_plugin / _apply_fuse_privileged).
  2. JuiceFS Volume examples (examples/volumes/) updated to use pod_template=flyte.PodTemplate().allow_fuse() instead of the old privileged/enable_fuse_mount approach.

Why

The old allow_fuse(privileged=False) path used a /dev/fuse hostPath, which doesn't actually work: a hostPath surfaces the device node but the container's devices cgroup still denies open() with EPERM. Only privileged (or a device plugin) grants access. The device-plugin path is the real unprivileged solution — requesting smarter-devices/fuse makes kubelet inject /dev/fuse into the devices-cgroup allowlist; CAP_SYS_ADMIN covers mount(2). Composes with allow_nested_sandboxing().
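One way to observe the difference described above from inside a running container is to try opening the device node and inspect the resulting errno. The following is a minimal sketch, assuming a Linux container with Python available; `probe_device` is a hypothetical helper for illustration and is not part of flyte.

```python
import errno
import os


def probe_device(path: str = "/dev/fuse") -> str:
    """Try to open a device node read/write and classify the outcome."""
    try:
        fd = os.open(path, os.O_RDWR)
    except FileNotFoundError:
        # The node is not present in the container at all.
        return "missing"
    except PermissionError as exc:
        # EPERM/EACCES: the node is visible but open() is refused, which is
        # the symptom described for the hostPath-only setup.
        return f"denied ({errno.errorcode.get(exc.errno, exc.errno)})"
    except OSError as exc:
        return f"error ({errno.errorcode.get(exc.errno, exc.errno)})"
    os.close(fd)
    return "ok"
```

Calling `probe_device()` in a task pod should report `ok` only when the device is actually usable; a visible node alone is not enough.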

The cluster must run a FUSE device plugin advertising smarter-devices/fuse — the Union dataplane chart ships an opt-in fuseDevicePlugin DaemonSet (unionai/helm-charts#443, unionai/cloud#16489).
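For orientation, the container-level settings that the device-plugin path amounts to can be sketched as a plain dictionary in Kubernetes container-spec shape. This is an illustration built only from the description above, not the actual output of allow_fuse(); the helper name, the container name `primary`, and the quantity of `1` are assumptions.

```python
def fuse_device_plugin_container(name: str = "primary") -> dict:
    """Hypothetical sketch of an unprivileged FUSE container spec."""
    return {
        "name": name,
        "resources": {
            # Extended resources are requested through limits; kubelet then
            # asks the device plugin to allocate the device for the container.
            "limits": {"smarter-devices/fuse": "1"},
        },
        "securityContext": {
            # mount(2) needs CAP_SYS_ADMIN. Note there is no "privileged"
            # key and no /dev/fuse hostPath volume anywhere in this sketch.
            "capabilities": {"add": ["SYS_ADMIN"]},
        },
    }
```

A pod using this shape stays unschedulable until some node advertises `smarter-devices/fuse`, which is why the device plugin DaemonSet has to be deployed first.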

Validated end-to-end

On a real Union dataplane (dogfood, device plugin deployed): an unprivileged task pod built from PodTemplate().allow_fuse() (CAP_SYS_ADMIN + smarter-devices/fuse, not privileged, no hostPath) opened /dev/fuse and mounted an S3-backed flyteplugins-union Volume with a read/write round-trip. CapEff=a82425fb (default capability set plus SYS_ADMIN only).
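The CapEff value can be checked by decoding the bitmask against the capability numbering in `<linux/capability.h>`. The sketch below is illustrative; the comparison mask `a80425fb` is an assumption (the capability set commonly seen for default Docker/containerd containers), and `decode_caps` is a hypothetical helper.

```python
# Capability names indexed by bit position, per <linux/capability.h>
# (only bits 0-31 are listed, which is enough for this mask).
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE",
    "CAP_NET_BROADCAST", "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK",
    "CAP_IPC_OWNER", "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT",
    "CAP_SYS_PTRACE", "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT",
    "CAP_SYS_NICE", "CAP_SYS_RESOURCE", "CAP_SYS_TIME",
    "CAP_SYS_TTY_CONFIG", "CAP_MKNOD", "CAP_LEASE", "CAP_AUDIT_WRITE",
    "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
]


def decode_caps(mask_hex: str) -> set:
    """Return the capability names whose bits are set in a hex mask."""
    mask = int(mask_hex, 16)
    return {name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)}


# Assumed baseline: a typical unprivileged container's default mask.
DEFAULT_MASK = "a80425fb"
extra = decode_caps("a82425fb") - decode_caps(DEFAULT_MASK)
print(sorted(extra))  # ['CAP_SYS_ADMIN']
```

A fully privileged container would instead show a mask with nearly every bit set, so a 15-capability result is consistent with the "not privileged" claim.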

Tests

TestAllowFuse (device-plugin default) + TestAllowFusePrivileged (legacy hostPath escape hatch) + updated sandboxing-composition tests — 54 passed.

…efault

allow_fuse() previously defaulted to privileged=True (a /dev/fuse hostPath +
privileged container), and its privileged=False path used a hostPath +
AppArmor-unconfined — which still fails at runtime: a hostPath surfaces the
device node but the devices cgroup denies open() with EPERM, so an unprivileged
container cannot use it.

Default allow_fuse() now uses the FUSE device-plugin path instead: it requests
the smarter-devices/fuse extended resource (so kubelet injects /dev/fuse into
the container's devices-cgroup allowlist) plus CAP_SYS_ADMIN for mount(2) — no
privileged, no hostPath. This composes with allow_nested_sandboxing() and is
what actually works on a real cluster running a FUSE device plugin (the Union
dataplane chart ships an opt-in fuseDevicePlugin DaemonSet).

allow_fuse(privileged=True) is retained as a legacy escape hatch (hostPath +
privileged) for clusters without a device plugin.

Validated end-to-end on a real Union dataplane (dogfood): an unprivileged pod
with exactly this template mounted a flyteplugins-union JuiceFS Volume with IO.

Signed-off-by: Haytham Abuelfutuh <haytham@union.ai>
Comment thread src/flyte/_pod.py Outdated
_stamp_capability(pt, "fuse")
return pt

# Legacy escape hatch (privileged=True): /dev/fuse hostPath + privileged, for

Contributor

Isn't it weird that this falls back seemingly at random? I think we should cleanly separate the two methods:
    if privileged:
        _fallback_to_legacy
    else:
        enable_fuse_device

Contributor Author

Good call — split into two clearly separated helpers in de82835c: _apply_fuse_device_plugin(primary) (default, unprivileged) and _apply_fuse_privileged(pt, primary) (legacy hostPath + privileged escape hatch). _apply_fuse just dispatches on the flag; the shared CAP_SYS_ADMIN grant + capability annotation stamp run once after. No inline fallback.

…ged helpers

Per review: dispatch cleanly to _apply_fuse_device_plugin (default) vs
_apply_fuse_privileged (legacy escape hatch) instead of an inline fallback;
CAP_SYS_ADMIN + capability stamp are shared across both.

Signed-off-by: Haytham Abuelfutuh <haytham@union.ai>
…fuse()

Adds the flyteplugins-union Volume examples (volume_example, volume_cold_fork,
volume_recover, bench) and aligns gcsfuse_example. They request unprivileged
FUSE via pod_template=flyte.PodTemplate().allow_fuse() — the device-plugin grant
this PR makes the default — instead of a privileged container.

Signed-off-by: Haytham Abuelfutuh <haytham@union.ai>
@EngHabu EngHabu changed the title feat(pod): make allow_fuse() unprivileged via FUSE device plugin by d… feat(pod): unprivileged allow_fuse() via FUSE device plugin + volume examples Jun 13, 2026
EngHabu added 2 commits June 13, 2026 01:47
_apply_fuse_privileged mutates in place and is annotated -> None; remove the
leftover 'return pt' that mypy flagged (the dispatcher _apply_fuse returns pt).

Signed-off-by: Haytham Abuelfutuh <haytham@union.ai>
plugins/redis/uv.lock had drifted from its pyproject resolution on main
(`uv lock --check` failure, unrelated to this PR's changes).

Signed-off-by: Haytham Abuelfutuh <haytham@union.ai>
@EngHabu EngHabu enabled auto-merge (squash) June 13, 2026 13:41
@EngHabu EngHabu merged commit 6cd1b09 into main Jun 13, 2026
41 checks passed
@EngHabu EngHabu deleted the haytham/allow-fuse-device-plugin branch June 13, 2026 14:20