feat: enable auth in Flame. by k82cn · Pull Request #390 · xflops/flame

k82cn · 2026-03-29T14:01:05Z

No description provided.

Signed-off-by: Klaus Ma <klausm@nvidia.com>

gemini-code-assist

Code Review

This pull request introduces a comprehensive design for JWT-based authentication in Flame, aimed at providing workspace isolation and secure access control across the cluster. The proposal includes specifications for user and session tokens, local validation logic, and updates to the CLI and SDKs. The review feedback identifies several critical improvements: implementing support for multiple public keys to allow zero-downtime key rotation, resolving contradictory pseudocode in the cache authorization section, and clarifying the configuration for symmetric encryption. Other suggestions focus on reducing redundancy in token claims, handling tilde expansion in file paths, and defining allowed character sets for workspaces to prevent parsing errors.

gemini-code-assist · 2026-03-29T14:03:13Z

+**Limitations:**
+- Tokens must be manually distributed to users (no built-in user management)
+- No automatic token rotation; users must regenerate expired tokens
+- Single signing key per cluster (key rotation requires cluster restart)


The limitation "Single signing key per cluster (key rotation requires cluster restart)" presents a significant operational challenge, as it would cause downtime or service disruption during key rotation.

To support graceful key rotation, I suggest modifying the design to allow for multiple public keys for verification. The configuration could accept a list of public keys, for example:

auth: config: # ... public_key_files: - "/etc/flame/jwt-public-v1.pem" - "/etc/flame/jwt-public-v2.pem"

During verification, the server would try each key until one succeeds. This allows an administrator to add a new key, perform a rolling restart of components, start signing new tokens with the new private key, and eventually remove the old public key after all old tokens have expired, all without service interruption.

gemini-code-assist · 2026-03-29T14:03:13Z

+        Scope::Cluster => Ok(()),
+
+        // Workspace scope can only access objects in their workspaces
+        Scope::Workspace => {
+            if auth.workspaces.is_empty() {
+                return Err(AuthError::InvalidToken("workspaces required"));
+            }
+
+            if auth.workspaces.contains(&object_key.workspace) {
+                Ok(())
+            } else {
+                Err(AuthError::PermissionDenied(format!(
+                    "Access denied to workspace {}", object_key.workspace
+                )))
+            }
+        }
+
+        // System scope (executors) can access workspaces they're bound to
+        Scope::System => {
+            // Executors have bound_workspaces in their context
+            // (set during executor registration)
+            if auth.can_access_bound_workspace(&object_key.workspace) {
+                Ok(())
+            } else {
+                Err(AuthError::PermissionDenied(format!(
+                    "System scope not bound to workspace {}", object_key.workspace
+                )))
+            }
+        }
+    }
+}
+```


This block of pseudocode appears to be a copy-paste error and contradicts the section's goal of only allowing session tokens for cache access. The preceding block (lines 988-1017) correctly implements the desired logic. This block introduces confusion by handling user-scoped tokens (workspace, cluster, system), which should be rejected. Please remove this block to maintain design consistency.

gemini-code-assist · 2026-03-29T14:03:13Z

+    config:
+      algorithm: "EdDSA"                   # Default: EdDSA. Options: EdDSA, RS256, HS256


The design lists HS256 as a supported algorithm, but the configuration (private_key_file, public_key_file) and the flmadm token init command are described in terms of asymmetric key pairs. HS256 is a symmetric algorithm and uses a single shared secret.

The design should be updated to clarify how HS256 is handled:

How is the shared secret configured? A secret_key_file option in auth.config would be appropriate.

How does flmadm token init generate a secret for HS256?

How does flmadm token sign use the secret?

gemini-code-assist · 2026-03-29T14:03:13Z

+    provider: "jwt"                        # Default: "jwt". Future: "oidc", "custom"
+    config:
+      algorithm: "EdDSA"                   # Default: EdDSA. Options: EdDSA, RS256, HS256
+      private_key_file: "/etc/flame/jwt-private.pem"   # Private key for signing tokens (flmadm only)


The comment for private_key_file states it's for flmadm only. However, the "Session Token Flow" section (lines 145-173) and the implementation detail for create_session (lines 895-935) correctly state that the Session Manager also needs the private key to sign session-scoped tokens. The comment should be updated to reflect this.

Suggested change

private_key_file: "/etc/flame/jwt-private.pem" # Private key for signing tokens (flmadm only)

private_key_file: "/etc/flame/jwt-private.pem" # Private key for signing tokens (used by flmadm and session-manager)

gemini-code-assist · 2026-03-29T14:03:14Z

+  "iat": 1711670400,                    // Issued at timestamp
+  "scope": "session",                   // Scope: always "session"
+  "workspace": "team-ml",               // Workspace this session belongs to
+  "session_id": "session-abc123"        // Session identifier


In the Session Token claims structure, both sub (subject) and session_id are present and appear to hold the same session identifier. The sub claim is the standard way to identify the principal that is the subject of the JWT. Having a duplicate session_id claim is redundant. To simplify the token structure and adhere to JWT best practices, I recommend removing the session_id claim and using sub exclusively to carry the session ID. This would also require updating the SessionClaims struct (lines 621-638) and its usage accordingly.

gemini-code-assist · 2026-03-29T14:03:14Z

+  endpoint: "https://flame.example.com:8080"
+  token: "eyJhbGciOiJFZERTQSI..."
+  # Or reference a file
+  token_file: "~/.flame/token"


The example configuration for flmctl uses token_file: "~/.flame/token". The tilde (~) is a shell construct and is not automatically expanded to the user's home directory by most application code, including Rust's standard file I/O.

The design should acknowledge this and specify that the implementation must explicitly handle tilde expansion (e.g., by using a library like shellexpand in Rust) to provide a good user experience. Otherwise, users might be confused when the path doesn't work as expected.

gemini-code-assist · 2026-03-29T14:03:14Z

+    pub fn parse(key: &str) -> Result<Self, FlameError> {
+        let parts: Vec<&str> = key.split('/').collect();
+        if parts.len() != 3 {
+            return Err(FlameError::InvalidConfig(format!(
+                "Invalid key format '{}': expected workspace/session_id/object_id",
+                key
+            )));
+        }
+        Ok(Self {
+            workspace: parts[0].to_string(),
+            session_id: parts[1].to_string(),
+            object_id: parts[2].to_string(),
+        })
+    }
+
+    pub fn to_string(&self) -> String {
+        format!("{}/{}/{}", self.workspace, self.session_id, self.object_id)
+    }
+}


The ObjectKey::parse implementation (lines 790-791) splits the key string by / to extract workspace, session_id, and object_id. This assumes these components do not contain /. While the design includes validation functions for session_id and object_id that prevent this, it's not explicitly mentioned for workspace.

To prevent ambiguity and potential parsing errors, the design document should explicitly define the allowed character set for workspace names and confirm that / is disallowed. This makes the parsing logic more robust and secure.

Signed-off-by: Klaus Ma <klausm@nvidia.com>

feat: enable auth in Flame.

ff9b16e

Signed-off-by: Klaus Ma <klausm@nvidia.com>

gemini-code-assist Bot reviewed Mar 29, 2026

View reviewed changes

update flamepy.runner impl detail.

f16d1a0

Signed-off-by: Klaus Ma <klausm@nvidia.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: enable auth in Flame.#390

feat: enable auth in Flame.#390
k82cn wants to merge 2 commits into
xflops:mainfrom
k82cn:flm_389

k82cn commented Mar 29, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		config:
		algorithm: "EdDSA" # Default: EdDSA. Options: EdDSA, RS256, HS256

	private_key_file: "/etc/flame/jwt-private.pem" # Private key for signing tokens (flmadm only)
	private_key_file: "/etc/flame/jwt-private.pem" # Private key for signing tokens (used by flmadm and session-manager)

Conversation

k82cn commented Mar 29, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant