cgroup v2: unsafe cgroupfs mount when cgroup namespace is disabled #5258

@xujihui1985

Description

In a cgroup v2 environment, runc currently allows mounting cgroupfs inside a container even when the container is not running with a private cgroup namespace (cgroupns disabled).

This can lead to unintended side effects, as global mount options (e.g., nsdelegate) on the host’s cgroup filesystem may be modified or overridden by the container’s mount operation.

Problem

When cgroupns is not enabled, the container shares the host’s cgroup namespace. In this scenario:
• Mounting cgroupfs inside the container directly operates on the host’s cgroup hierarchy
• Mount options applied during the mount (e.g., nsdelegate) may:
  • Override existing global mount options
  • Introduce inconsistent behavior across the system
  • Break assumptions about cgroup isolation

This effectively allows a container to mutate global kernel state without proper isolation, which is unsafe and unexpected.

Expected Behavior

runc should ensure safe behavior when handling cgroupfs mounts under cgroup v2. Specifically, when:
• The system is using cgroup v2, and
• The container does not have a private cgroup namespace (cgroupns disabled)

Then one of the following should be enforced:
1. Reject the mount entirely, or
2. Ensure the mount options are consistent with the host’s existing cgroupfs mount (i.e., do not override global mount options)

Proposed Solution

Adopt one of the following strategies when mounting cgroupfs under cgroup v2:

Option 1: Strict validation (preferred for safety)
• Add a validation check in runc
• If cgroupns is disabled:
  • Disallow mounting cgroupfs inside the container
  • Return a clear error message indicating that cgroup namespace isolation is required
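
A minimal sketch of what such a check could look like, written as a standalone Go program. The `Namespace` and `Mount` types, the `validateCgroupMounts` function, and the error text are hypothetical stand-ins for illustration, not runc's actual types or API. It also assumes that a cgroupfs mount is identified by mount type `cgroup` or `cgroup2`, and that any `cgroup` entry in the namespace list means the container does not share the host's cgroup namespace.

```go
package main

import (
	"errors"
	"fmt"
)

// Namespace and Mount are hypothetical, trimmed-down stand-ins for the
// corresponding container config entries; they are not runc's real types.
type Namespace struct {
	Type string // e.g. "pid", "mount", "cgroup"
}

type Mount struct {
	Destination string
	Type        string // e.g. "proc", "cgroup"
}

// ErrCgroupNSRequired is a hypothetical error for the rejected case.
var ErrCgroupNSRequired = errors.New(
	"mounting cgroupfs on cgroup v2 requires a private cgroup namespace")

// hasCgroupNS reports whether the config lists a cgroup namespace.
func hasCgroupNS(namespaces []Namespace) bool {
	for _, ns := range namespaces {
		if ns.Type == "cgroup" {
			return true
		}
	}
	return false
}

// validateCgroupMounts rejects cgroupfs mounts when the host uses cgroup v2
// and the container has no cgroup namespace of its own.
func validateCgroupMounts(cgroupV2 bool, namespaces []Namespace, mounts []Mount) error {
	if !cgroupV2 || hasCgroupNS(namespaces) {
		return nil
	}
	for _, m := range mounts {
		// Assumption: cgroupfs mounts carry one of these two types.
		if m.Type == "cgroup" || m.Type == "cgroup2" {
			return fmt.Errorf("mount %q: %w", m.Destination, ErrCgroupNSRequired)
		}
	}
	return nil
}

func main() {
	mounts := []Mount{{Destination: "/sys/fs/cgroup", Type: "cgroup"}}
	// No cgroup namespace configured: the mount is rejected.
	fmt.Println(validateCgroupMounts(true, []Namespace{{Type: "pid"}}, mounts))
	// With a cgroup namespace, the same mount passes validation.
	fmt.Println(validateCgroupMounts(true, []Namespace{{Type: "cgroup"}}, mounts))
}
```

Failing at validation time, before any mount is attempted, keeps the host's cgroupfs untouched and gives the user an actionable message.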

Option 2: Inherit host mount configuration
• Ensure that any cgroupfs mount inside the container:
  • Reuses the host’s existing mount options
  • Does not override global flags such as nsdelegate
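
One way to discover the host's options, sketched under assumptions: parse mountinfo-formatted text (the format of `/proc/self/mountinfo`) and collect the cgroup2-specific super options of the existing mount. The function name `hostCgroup2Options`, the sample line, and the short allow-list of flags are illustrative only; the list is not exhaustive, and this is not how runc currently behaves.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// cgroup2Flags lists cgroup v2 mount options to carry over.
// Illustrative only; the kernel may support further options.
var cgroup2Flags = map[string]bool{
	"nsdelegate":           true,
	"favordynmods":         true,
	"memory_localevents":   true,
	"memory_recursiveprot": true,
}

// hostCgroup2Options scans mountinfo-formatted text and returns the
// cgroup2-specific super options of the first cgroup2 mount it finds,
// joined with commas. The boolean is false if no cgroup2 mount exists.
func hostCgroup2Options(mountinfo string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(mountinfo))
	for sc.Scan() {
		// The optional fields end at a lone "-"; after it come the
		// filesystem type, the mount source, and the super options.
		_, tail, ok := strings.Cut(sc.Text(), " - ")
		if !ok {
			continue
		}
		fields := strings.Fields(tail)
		if len(fields) < 3 || fields[0] != "cgroup2" {
			continue
		}
		var keep []string
		for _, opt := range strings.Split(fields[2], ",") {
			if cgroup2Flags[opt] {
				keep = append(keep, opt)
			}
		}
		return strings.Join(keep, ","), true
	}
	return "", false
}

func main() {
	// Hypothetical sample line; a real caller would read /proc/self/mountinfo.
	const sample = "30 24 0:26 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime shared:4 - cgroup2 cgroup2 rw,nsdelegate,memory_recursiveprot\n"
	opts, ok := hostCgroup2Options(sample)
	fmt.Println(opts, ok) // prints "nsdelegate,memory_recursiveprot true"
}
```

The resulting string could then be used as the mount data for the container's cgroupfs mount, so that the request matches what the host already has instead of silently changing it.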

Related Work

A similar issue has been identified and addressed in LXCFS:
• Linux Containers (LXC) project PR: lxc/lxcfs#703

Impact

• Prevents containers from modifying global cgroup mount behavior
• Improves isolation guarantees under cgroup v2
