
feat: provision dual RO/RW service accounts for MCP services#304

Merged
rshoemaker merged 2 commits into main from feat/PLAT-461/ro_rw_svc_accts
Mar 20, 2026

Conversation

@rshoemaker
Contributor

Summary

  • Provisions two service accounts per MCP service (svc_ro and svc_rw) using built-in group roles (pgedge_application_read_only and pgedge_application) instead of custom grants
  • MCP config.yaml credential selection is driven by allow_writes: when true, the RW account is used; otherwise, the RO account
  • Both credentials are stored in MCPConfigResource state and selected at config generation time via activeCredentials()
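The selection logic described above can be sketched in Go. The struct and method names below are modeled on the PR description (ROUsername/ROPassword, RWUsername/RWPassword, activeCredentials()); the real MCPConfigResource has additional fields not shown here:

```go
package main

import "fmt"

// MCPConfigResource is a minimal stand-in for the real resource; only the
// dual credential fields and the AllowWrites flag from the PR are modeled.
type MCPConfigResource struct {
	AllowWrites bool
	ROUsername  string
	ROPassword  string
	RWUsername  string
	RWPassword  string
}

// activeCredentials picks the RW pair when AllowWrites is true,
// otherwise the RO pair.
func (r *MCPConfigResource) activeCredentials() (username, password string) {
	if r.AllowWrites {
		return r.RWUsername, r.RWPassword
	}
	return r.ROUsername, r.ROPassword
}

func main() {
	r := &MCPConfigResource{
		ROUsername: "mydb_mcp_ro", ROPassword: "ro-secret",
		RWUsername: "mydb_mcp_rw", RWPassword: "rw-secret",
	}
	u, _ := r.activeCredentials()
	fmt.Println(u) // AllowWrites is false, so the RO username is chosen
}
```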

Changes

  • ServiceUserRole: Added Mode field with ServiceUserRoleRO/ServiceUserRoleRW constants. Role creation uses group role membership instead of direct grants. ResourceVersion bumped to "4".
  • MCPConfigResource: Replaced single Username/Password with dual RO/RW credential fields. Added activeCredentials() for AllowWrites-based selection. populateCredentials() fetches from both ServiceUserRole resources.
  • orchestrator.go: Creates two ServiceUserRole resources per service. Removed credential seeding into MCPConfigResource.
  • GenerateServiceUsername: Now accepts a mode parameter appended to the username.
  • Dependencies: ServiceInstance, ServiceInstanceSpec, and MCPConfigResource all depend on both RO and RW ServiceUserRole resources.
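The mode-suffixed username generation might look like the sketch below. The 63-byte cap (PostgreSQL's default identifier limit) and the hash-truncation scheme are assumptions; the PR only says the "length/hash logic" was adjusted:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// maxIdentLen is PostgreSQL's default identifier limit (assumed here).
const maxIdentLen = 63

// GenerateServiceUsername appends the mode ("ro" or "rw") to the base name.
// If the result would exceed the identifier limit, the base is truncated and
// a short hash of it is inserted so distinct long names remain distinct.
func GenerateServiceUsername(base, mode string) string {
	name := base + "_" + mode
	if len(name) <= maxIdentLen {
		return name
	}
	h := sha256.Sum256([]byte(base))
	hash := hex.EncodeToString(h[:4])               // 8 hex chars
	keep := maxIdentLen - len(mode) - len(hash) - 2 // two underscores
	return fmt.Sprintf("%s_%s_%s", base[:keep], hash, mode)
}

func main() {
	fmt.Println(GenerateServiceUsername("analytics", "ro")) // analytics_ro
}
```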

Known issue

Service accounts are currently only created on the primary node and not propagated to other nodes. This will be addressed in a follow-up ticket by creating accounts on each node independently rather than relying on Spock DDL
replication.

Test plan

  • Unit tests pass (make test — 729 tests)
  • Manual testing: allow_writes: true uses RW account, allow_writes: false uses RO account
  • Manual testing: all E2E tests passing
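The manual checks above toggle the flag in the MCP service's config.yaml; a minimal fragment (field layout assumed from the PR description) looks like:

```yaml
# MCP service configuration (illustrative fragment)
allow_writes: false   # false selects the svc_ro credentials; true selects svc_rw
```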

PLAT-461

Service users now use built-in group roles (pgedge_application,
pgedge_application_read_only) instead of custom grants. MCP
config.yaml credential selection is driven by allow_writes.
@coderabbitai

coderabbitai bot commented Mar 20, 2026

📝 Walkthrough


Mode-aware database service users introduced: usernames now include a mode suffix and generation accepts a mode parameter. Orchestration and MCP config now track separate RO/RW user-role resources and credentials; user-role creation uses group-role membership instead of explicit GRANT statements.

Changes

Changes by cohort:

  • Username generation & tests (server/internal/database/service_instance.go, server/internal/database/service_instance_test.go): GenerateServiceUsername signature changed to accept mode and append _{mode} (with length/hash logic adjusted). Tests updated to pass mode and validate RO/RW outputs.
  • MCP config resource (server/internal/orchestrator/swarm/mcp_config_resource.go): Replaced single Username/Password with ROUsername/ROPassword and RWUsername/RWPassword; added activeCredentials(); ResourceVersion bumped to "2"; diff paths updated to ignore new fields.
  • Orchestration wiring (server/internal/orchestrator/swarm/orchestrator.go, server/internal/orchestrator/swarm/service_instance.go, server/internal/orchestrator/swarm/service_instance_spec.go): Resource chains and Dependencies now include distinct RO and RW ServiceUserRole resources; MCP credential propagation adjusted to rely on populated role resources.
  • Service user role resource (server/internal/orchestrator/swarm/service_user_role.go): Added Mode field and ServiceUserRoleRO/ServiceUserRoleRW constants; identifiers include mode; ResourceVersion bumped to "4"; username generation updated to pass mode; creation uses group-role membership (pgedge_application_read_only / pgedge_application) instead of explicit GRANT SQL.
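The switch from explicit grants to group-role membership boils down to SQL along these lines. This sketch only assembles the statements; the role name and literal password are illustrative, and a real implementation would parameterize or escape the password rather than interpolating it:

```go
package main

import "fmt"

// roleStatements returns the SQL to create a login role and attach it to the
// matching pgEdge built-in group role, replacing per-object GRANT statements.
func roleStatements(username, password string, readOnly bool) []string {
	group := "pgedge_application" // read-write group role
	if readOnly {
		group = "pgedge_application_read_only"
	}
	return []string{
		fmt.Sprintf(`CREATE ROLE %q LOGIN PASSWORD '%s'`, username, password),
		fmt.Sprintf(`GRANT %q TO %q`, group, username),
	}
}

func main() {
	for _, stmt := range roleStatements("mydb_mcp_ro", "secret", true) {
		fmt.Println(stmt)
	}
}
```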

Poem

🐇 I hopped to code beneath the moonlit node,

Two names now dance where once one strode.
RO and RW tuck tails in stride,
Mode in the name, credentials divide.
Group roles hum — no grants to bemoan,
A rabbit's nibble, a tidy code home.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 40.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: The title "feat: provision dual RO/RW service accounts for MCP services" directly and clearly summarizes the main change: provisioning dual read-only and read-write service accounts for MCP services.
  • Description check ✅ Passed: The PR description includes all major required sections: a concise summary, comprehensive changes list, clear test plan, known issues, and linked issue (PLAT-461), though the Changelog entry checklist item is not explicitly marked.




@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/internal/orchestrator/swarm/mcp_config_resource.go (1)

95-98: ⚠️ Potential issue | 🟠 Major

Keep Refresh() limited to observing config.yaml.

Hydrating RO/RW credentials here makes the refreshed state reflect ServiceUserRole, not the file on disk. If a role gets recreated with a new password, this can make the resource look current without proving config.yaml was rewritten.

Suggested change
-	// Populate credentials from ServiceUserRole
-	if err := r.populateCredentials(rc); err != nil {
-		return err
-	}
-
 	// Check if config.yaml exists
 	_, err = readResourceFile(fs, filepath.Join(dirPath, "config.yaml"))

Based on learnings: Refresh should only verify the existence of config.yaml.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/mcp_config_resource.go` around lines 95 -
98, The Refresh() implementation is hydrating credentials by calling
r.populateCredentials(rc), causing the refreshed state to reflect
ServiceUserRole instead of the on-disk config.yaml; remove that call so
Refresh() only verifies the presence and timestamp/contents of config.yaml
(e.g., check file existence/read and update observed state accordingly) rather
than populating RO/RW credentials from ServiceUserRole. Locate the Refresh
method and remove or disable the r.populateCredentials(rc) invocation, ensuring
any credential population remains only in reconcile/create/update paths, and
keep Refresh limited to observing config.yaml only.
server/internal/orchestrator/swarm/service_user_role.go (1)

136-143: ⚠️ Potential issue | 🟠 Major

Provision the role on every node, not only on the current primary.

This still routes both RO and RW account creation through connectToPrimary(). In a multi-node database, failover or host selection can land MCP on a node that never received the CREATE ROLE, which turns authentication failures into a topology event instead of a config bug.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/service_user_role.go` around lines 136 -
143, createUserRole currently opens a single connection via connectToPrimary
which causes the CREATE ROLE to only run on the primary; instead enumerate all
cluster nodes and run the role creation on each node so the role exists
everywhere. Replace the single connectToPrimary(...) call in
ServiceUserRole.createUserRole with a loop over the cluster nodes (e.g., from rc
or a node list on ServiceUserRole), use the per-node connector (e.g.,
r.connectToNode or an equivalent helper) to connect to each node with
r.DatabaseName, execute the same role creation logic for each connection, Close
each connection, and aggregate/return errors (return the first error or a
combined error) so failures creating the role on any node are surfaced. Ensure
you keep the original behavior when there is only one node.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c30c0de2-1912-4775-b043-bec2faa2ffae

📥 Commits

Reviewing files that changed from the base of the PR and between 7b76ca1 and 157d478.

📒 Files selected for processing (7)
  • server/internal/database/service_instance.go
  • server/internal/database/service_instance_test.go
  • server/internal/orchestrator/swarm/mcp_config_resource.go
  • server/internal/orchestrator/swarm/orchestrator.go
  • server/internal/orchestrator/swarm/service_instance.go
  • server/internal/orchestrator/swarm/service_instance_spec.go
  • server/internal/orchestrator/swarm/service_user_role.go

@rshoemaker
Contributor Author

@coderabbitai I fixed the first issue. The second issue will be fixed in a different PR.

@coderabbitai

coderabbitai bot commented Mar 20, 2026


@rshoemaker: Thanks for the update! Glad the first issue is resolved. For the second issue being deferred to a follow-up PR — would you like me to open a GitHub issue to track it so it doesn't get lost?


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
server/internal/orchestrator/swarm/mcp_config_resource.go (1)

54-60: Update the DiffIgnore comment to match the new lifecycle.

The code now populates RO/RW credentials in Create/Update, not Refresh, so this comment is misleading.

✏️ Suggested diff
-		// Credentials are populated from ServiceUserRole resources during refresh.
+		// Credentials are populated from ServiceUserRole resources during Create/Update.

Based on learnings: "Refresh should only verify the existence of config.yaml. Tokens.yaml and users.yaml are owned and managed by the MCP server at runtime and should not be rewritten on Update."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/mcp_config_resource.go` around lines 54 -
60, Update the misleading comment in the MCPConfigResource.DiffIgnore function:
change the line that currently reads "Credentials are populated from
ServiceUserRole resources during refresh." to accurately state that RO/RW
credentials are populated during Create/Update (not Refresh), and note that
tokens.yaml and users.yaml are owned/managed by the MCP server at runtime and
should not be rewritten on Update; modify the comment inside the DiffIgnore
method (MCPConfigResource.DiffIgnore) accordingly so it reflects the new
lifecycle and ownership.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9d16c217-2e68-483f-81fa-eed10b0edc95

📥 Commits

Reviewing files that changed from the base of the PR and between 157d478 and e84cd8a.

📒 Files selected for processing (1)
  • server/internal/orchestrator/swarm/mcp_config_resource.go

Member

@mmols mmols left a comment


Worked well for me with a single node database. I was able to switch back and forth successfully by restarting the MCP container.

@rshoemaker rshoemaker merged commit bc20c0b into main Mar 20, 2026
3 checks passed