Skip to content

Extend unmanaged OAuth flow to drive code exchange in-process#2896

Merged
dgageot merged 6 commits into
docker:mainfrom
trungutt:extend-unmanaged-oauth-flow
May 28, 2026
Merged

Extend unmanaged OAuth flow to drive code exchange in-process#2896
dgageot merged 6 commits into
docker:mainfrom
trungutt:extend-unmanaged-oauth-flow

Conversation

@trungutt
Copy link
Copy Markdown
Contributor

@trungutt trungutt commented May 26, 2026

Why

The current unmanaged OAuth flow (WithManagedOAuth(false), added in #830) emits an elicitation carrying only authorization-server metadata and expects the connected client to:

  1. Generate PKCE + state
  2. Run Dynamic Client Registration if needed
  3. Open the browser
  4. Run a callback server (or use a deeplink)
  5. Exchange the code for a token at the auth server's token endpoint
  6. Return the resulting {access_token, refresh_token, …} via ResumeElicitation

That contract works fine when the client is a Go process (the CLI mirror at pkg/runtime/remote_runtime.go:handleOAuthElicitation reuses the same helpers). It is a poor fit for embedders that want to be thin couriers — for example a host that registers a custom URI scheme and forwards the deeplink payload to docker-agent — because every such host ends up re-implementing PKCE / DCR / token exchange in the host language.

This branch extends the unmanaged flow with an opt-in sub-behavior where the runtime drives everything itself and the embedder just relays the deeplink payload back.

How the modes compare

Three OAuth flows are now available. The mode is picked per-toolset at flow-entry time:

flowchart TD
  start([handleOAuthFlow])
  managed{WithManagedOAuth true?}
  redir{mcp-oauth-redirect-uri set?}
  M[Managed flow]
  UClient[Unmanaged - client-driven]
  UDrive[Unmanaged - drive-flow NEW]

  start --> managed
  managed -- yes --> M
  managed -- no --> redir
  redir -- no --> UClient
  redir -- yes --> UDrive
Loading

Managed flow (unchanged)

The runtime owns the entire OAuth dance. A local callback server bound to 127.0.0.1:N receives the redirect.

sequenceDiagram
  participant agent as docker-agent
  participant client as MCP client
  participant browser as Browser
  participant authServer as Authorization Server

  agent->>agent: generate state and PKCE, run DCR if needed
  agent->>agent: start local callback server on 127.0.0.1
  agent->>client: elicitation asking to proceed with OAuth
  client-->>agent: accept
  agent->>browser: open authorize URL with localhost redirect_uri
  browser->>authServer: GET authorize
  authServer-->>browser: redirect to local callback with code and state
  browser->>agent: GET callback (hits local server)
  agent->>authServer: POST token with code, code_verifier, redirect_uri
  authServer-->>agent: access_token and refresh_token
  agent->>agent: store token in keychain
Loading

Unmanaged - client-driven (unchanged)

The runtime hands the client raw auth-server metadata and expects a finished token back.

sequenceDiagram
  participant agent as docker-agent
  participant client as MCP client
  participant browser as Browser
  participant authServer as Authorization Server

  agent->>client: elicitation with auth-server metadata
  client->>client: generate state and PKCE, run DCR, pick own redirect_uri
  client->>browser: open authorize URL
  browser->>authServer: GET authorize
  authServer-->>browser: redirect to client's redirect_uri
  browser->>client: deliver code and state
  client->>authServer: POST token with code, code_verifier, redirect_uri
  authServer-->>client: access_token and refresh_token
  client->>agent: ResumeElicitation with access_token and refresh_token
  agent->>agent: store token in keychain
Loading

Unmanaged - drive-flow (this PR)

The runtime owns PKCE, DCR, and token exchange and advertises an opaque redirect_uri. The client (or any process that received the deeplink) delivers { code, state } back. Two delivery paths are accepted; either or both work in a given deployment.

sequenceDiagram
  participant agent as docker-agent
  participant client as MCP client
  participant browser as Browser
  participant authServer as Authorization Server
  participant relay as Deeplink relay

  agent->>agent: generate state and PKCE, run DCR if needed
  agent->>agent: build authorize URL with configured redirect_uri
  agent->>agent: register state-to-waiter in registry
  agent->>client: elicitation with authorize_url and state
  client->>browser: open authorize_url
  browser->>authServer: GET authorize
  authServer-->>browser: redirect to configured redirect_uri with code and state

  alt Path A - out-of-band callback (new)
    browser->>relay: deeplink or URI-scheme handler
    relay->>agent: POST /api/mcp-oauth/callback with state and code
    agent->>agent: lookup state in registry, hand payload to waiter
  else Path B - elicitation reply
    browser->>client: deliver code and state
    client->>agent: ResumeElicitation with code and state
  end

  agent->>agent: verify state in constant time
  agent->>authServer: POST token with code, code_verifier, redirect_uri
  authServer-->>agent: access_token and refresh_token
  agent->>agent: store token plus client_id and secret for silent refresh
Loading

What changes

1. Runtime-driven OAuth flow (server mode)

A new server-level flag — --mcp-oauth-redirect-uri=<URL> (also MCPOAuthRedirectURI on RuntimeConfig and WithUnmanagedOAuthRedirectURI on the runtime). When set, handleUnmanagedOAuthFlow:

  • Generates state + PKCE in-process
  • Runs DCR if the auth server advertises a registration endpoint (or uses the per-toolset clientId/clientSecret from RemoteOAuthConfig)
  • Builds the full authorize URL with the configured redirect_uri
  • Emits an elicitation whose Meta includes:
    • docker-agent/authorize_url — URL the client should open in the browser
    • docker-agent/state — state value the client must echo back
    • plus the existing docker-agent/server_url, auth_server, auth_server_metadata, resource_metadata
  • Accepts {code, state} as a ResumeElicitation payload, verifies state in constant time, exchanges the code at the token endpoint (same redirect_uri per RFC 6749 §4.1.3), and stores the resulting token in the keychain with client credentials stamped on for silent refresh

When the flag is not set, the elicitation shape stays identical to today and the runtime expects the client-driven {access_token, …} reply. CLI-mirror clients continue to work unchanged.

The new path also accepts the client-driven {access_token, …} reply, so a client that prefers to do the exchange itself is still free to.

A per-toolset RemoteOAuthConfig.CallbackRedirectURL overrides the runtime-wide flag for that toolset.

2. Out-of-band callback route — POST /api/mcp-oauth/callback

Adds a second, session-less way to deliver the deeplink payload to the runtime:

POST /api/mcp-oauth/callback?state=<opaque>
                            [&code=<opaque>]
                            [&error=<rfc6749 code>]
                            [&error_description=<text>]

The parameters mirror what an authorization server sends to a redirect URI per RFC 6749 §4.1.2. The body is empty; everything travels in the query string.

A process-wide pending-OAuth registry maps state → waiting unmanaged flow. handleUnmanagedOAuthFlow registers its waiter on entry (drive-flow only) and unregisters on exit. The new route looks up the waiter by state and hands it the payload through a single-shot buffered channel. The waiting goroutine then performs the token exchange and persists the token exactly as before.

The lookup-by-state IS the authentication: only states the runtime has just generated are present in the registry. Unknown or replayed states return 404. State values continue to come from GenerateState (≥128 bits, opaque, base64).

This route is strictly additive. The existing session-keyed POST /api/sessions/:id/elicitation path still works; embedders may use the elicitation path, the new callback path, or both.

3. Propagate user-initiated cancellation across the WithoutCancel boundary

clientConnector.Connect detaches the caller's ctx with context.WithoutCancel so that the MCP toolset's session can outlive any single request (otherwise the SSE/stdio connection tears down when the request that triggered toolset.Start returns, and a successful OAuth flow gets killed at request end).

The detachment also severs user-initiated cancellation, which the OAuth select inside Connect does need to observe — without that signal, embedders that abort the in-progress run while the user is in the OAuth dialog would leave the OAuth select blocked indefinitely, the per-session streaming lock held, and the next message returning 409 Conflict / ErrSessionBusy.

The connection-longevity invariant is preserved by stashing the caller's original ctx as a value on the detached one. The detached ctx remains the primary driver of the connection's lifetime; operations that want to observe user-initiated cancellation opt in by reading the parent back out and selecting on its Done channel.

Carrying ctx-as-value is unusual but appropriate here: the parent is metadata about how the detached ctx was constructed, not data the request is operating on. The helpers (withCancellableParent / cancellableParentFromContext) live next to the connector that creates the relationship so the rationale stays close to the code.

@aheritier aheritier added area/security Authentication, authorization, secrets, vulnerabilities kind/feat PR adds a new feature (maps to feat: commit prefix) labels May 26, 2026
@trungutt trungutt closed this May 26, 2026
@trungutt trungutt reopened this May 26, 2026
@trungutt trungutt force-pushed the extend-unmanaged-oauth-flow branch from 1612def to 7bd8d5f Compare May 26, 2026 15:42
Adds a new sub-behavior to the unmanaged OAuth flow that lets the
runtime own PKCE / state / DCR / token exchange while the connected
client acts as a thin courier (open browser, receive deeplink,
forward {code, state}).

Activated by a new server-level flag --mcp-oauth-redirect-uri (also
exposed on the runtime as WithUnmanagedOAuthRedirectURI and on the
RuntimeConfig as MCPOAuthRedirectURI). When set, the runtime:

  - generates state + PKCE in-process
  - runs DCR if needed (or uses per-toolset explicit credentials)
  - builds the full authorize URL with the configured redirect_uri
  - emits an elicitation whose Meta carries cagent/authorize_url +
    cagent/state alongside the existing auth_server_metadata
  - accepts {code, state} as a ResumeElicitation payload, verifies
    state in constant time, and exchanges the code at the token
    endpoint using the same redirect_uri (RFC 6749 §4.1.3 binding)
  - stores the resulting token in the keychain with client_id /
    client_secret stamped on for later silent refresh

When the flag is empty, the existing client-driven contract is
preserved verbatim: the elicitation carries only metadata and the
client is expected to reply with {access_token, refresh_token, …}.
The {access_token, …} reply shape is also accepted on the new path
so a client that prefers to do the exchange itself is still free to.

A per-toolset RemoteOAuthConfig.CallbackRedirectURL overrides the
runtime-wide flag for that toolset.

Docker-agent never learns anything about the URL it advertises: it
is opaque, and what happens at the URL (HTML bouncer, OS deeplink,
universal-link claim, …) is the host's concern.

Plumbing:

  - cmd/root/flags.go: add --mcp-oauth-redirect-uri
  - pkg/config/runtime.go: MCPOAuthRedirectURI field
  - pkg/runtime/runtime.go: unmanagedOAuthRedirectURI field +
    WithUnmanagedOAuthRedirectURI Opt
  - pkg/runtime/loop.go: pass through to ConfigureHandlers
  - pkg/server/session_manager.go: forward from runtime config
  - pkg/tools/capabilities.go: extend OAuthCapable interface +
    ConfigureHandlers
  - pkg/tools/mcp/mcp.go, remote.go, session_client.go,
    builtin/mcpcatalog: implement the new SetUnmanagedOAuthRedirectURI
  - pkg/tools/mcp/oauth.go: rework handleUnmanagedOAuthFlow,
    factor out resolveClientCredentials helper and
    consumeUnmanagedElicitationReply
  - docs/features/remote-mcp/index.md: new 'Unmanaged OAuth flow'
    section documenting the meta keys and behavior

Tests: six new oauth_test.go tests covering the drive-flow happy
path (incl. asserting RFC 6749 §4.1.3 redirect_uri parity at /token),
state-mismatch rejection, legacy access-token reply on the new path,
legacy mode shape (no authorize_url emitted), legacy mode rejection
of {code, state}, and the per-toolset > runtime-wide precedence.

Signed-off-by: Trung Nguyen <trung.nguyen@docker.com>
@trungutt trungutt force-pushed the extend-unmanaged-oauth-flow branch from 7bd8d5f to c97124d Compare May 26, 2026 15:51
@aheritier aheritier added the area/mcp MCP protocol, MCP tool servers, integration label May 27, 2026
trungutt added 4 commits May 28, 2026 09:16
Builds on the unmanaged-OAuth drive-flow added in this branch by
giving embedders a second, session-less way to deliver the deeplink
payload to the runtime.

Adds a new HTTP route:

  POST /api/mcp-oauth/callback?state=<opaque>
                              [&code=<opaque>]
                              [&error=<rfc6749 code>]
                              [&error_description=<text>]

The parameters mirror what an authorization server sends to a
redirect URI per RFC 6749 §4.1.2. The body is empty; everything
travels in the query string.

A process-wide pending-OAuth registry maps state -> waiting
unmanaged flow. handleUnmanagedOAuthFlow registers its waiter on
entry (drive-flow only) and unregisters on exit. The new route
looks up the waiter by state and hands it the payload through a
single-shot buffered channel. The waiting goroutine then performs
the token exchange and persists the token exactly as before.

The lookup-by-state IS the authentication: only states the runtime
has just generated are present in the registry. Unknown or replayed
states return 404. State values continue to come from GenerateState
(>=128 bits, opaque, base64).

Inside the flow, the elicitation result and the direct callback
race. Whichever arrives first wins; the other is cancelled cleanly
with no further side effects on the connected client.

Strictly additive to the existing session-keyed
POST /api/sessions/:id/elicitation path: both reply shapes
({access_token, ...} legacy and {code, state} drive-flow) still
work, and an embedder may use the elicitation path, the new
callback path, or both. The --mcp-oauth-redirect-uri flag, the
cagent/authorize_url and cagent/state elicitation meta keys, and
the runtime's redirect-URI parity check at /token are unchanged.

Why it matters for embedders:

  - No session context required. The embedder forwards an opaque
    {state, code} (or {state, error}) and is done. It never needs
    to know which session is waiting, or even that the runtime has
    sessions at all.
  - No state tracking required. All state -> waiter accounting
    lives where it belongs: inside the runtime.
  - No UI-lifecycle dependency. As long as the runtime is alive
    when the deeplink fires, the token is exchanged and stored. An
    embedder UI that died between authorize and callback is no
    longer fatal.
  - Cross-process deeplink handlers can be thin couriers: a
    system-wide URL-scheme handler or an OS launcher does not need
    a connection to the runtime's elicitation stream at all.

Plumbing:

  - pkg/tools/mcp/pending_oauth.go: new file, process-wide
    state -> chan registry, exported DeliverPendingOAuthCallback
    + ErrPendingOAuthNoWaiter for the server handler
  - pkg/tools/mcp/oauth.go: handleUnmanagedOAuthFlow registers a
    waiter, runs the elicitation in a goroutine scoped to a
    cancellable child context, and selects between elicitation
    result, direct callback, and the outer ctx
  - pkg/server/server.go: new mcpOAuthCallback handler at
    POST /api/mcp-oauth/callback

Signed-off-by: Trung Nguyen <trung.nguyen@docker.com>
The OAuth elicitation Meta map keys still carry the legacy product
name (cagent/server_url, cagent/authorize_url, cagent/state,
cagent/type). The runtime was renamed to docker-agent some time ago;
align the meta keys with it so the contract clients see matches the
project name they're integrating against.

Renamed:
  cagent/type            -> docker-agent/type
  cagent/server_url      -> docker-agent/server_url
  cagent/authorize_url   -> docker-agent/authorize_url       (docker#2896)
  cagent/state           -> docker-agent/state               (docker#2896)

Both writer sites (handleManagedOAuthFlow, handleUnmanagedOAuthFlow)
and every in-tree reader (CLI runner, RemoteRuntime OAuth mirror,
TUI elicitation handling) are updated together. Out-of-tree
consumers will need the same one-line rename.

Other cagent/* identifiers that are NOT in the elicitation Meta map
(user-config paths under ~/.cagent/, the User-Agent header, the
cagent/title meta key used by the user_prompt builtin) are
intentionally left alone -- the rename is scoped to OAuth meta.

Signed-off-by: Trung Nguyen <trung.nguyen@docker.com>
clientConnector.Connect detaches the caller's ctx with
context.WithoutCancel before sending the MCP initialize handshake.
The rationale is correct: an MCP toolset's session must outlive any
single request -- otherwise the underlying SSE/stdio connection
tears down when the request that triggered toolset.Start returns,
and a successful OAuth flow gets killed the moment its triggering
request completes.

But that detachment also severs user-initiated cancellation, which
the OAuth flow inside Connect DOES need to observe. Concretely:
handleUnmanagedOAuthFlow blocks on a select waiting for an
elicitation reply or an out-of-band callback. If the embedder
aborts the in-progress agent run while the user is in the OAuth
dialog, the streamCtx the SessionManager derived from the request
ctx gets cancelled -- but every ctx in the chain BELOW Connect's
WithoutCancel boundary stays alive. The OAuth select then blocks
forever, the per-session streaming lock is never released, and the
next user message returns 409 Conflict from RunSession.ErrSessionBusy.

This commit preserves the connection-longevity invariant and
restores user-initiated cancellation by stashing the caller's
original ctx as a value on the detached ctx. The detached ctx
remains the primary driver of the connection's lifetime, and
operations that want to observe user-initiated cancellation can
opt in by reading the parent back out and selecting on its Done
channel alongside the local ctx.

Carrying ctx-as-value is unusual but appropriate here: the parent
is metadata about how the detached ctx was constructed, not data
the request is operating on. The helpers (withCancellableParent /
cancellableParentFromContext) live next to the conector that
creates the relationship so the rationale stays close to the code.

handleUnmanagedOAuthFlow now extends its select with a userCancelCh
case keyed on the parent's Done channel. When the parent is nil
(e.g. unit tests, CLI invocations that don't go through
clientConnector.Connect), userCancelCh stays nil and the select
silently never picks it -- preserving back-compat.

Plumbing:

  - pkg/tools/mcp/cancellable_parent.go: new helper file with
    withCancellableParent + cancellableParentFromContext,
    extensively commented because ctx-as-value is unusual.
  - pkg/tools/mcp/mcp.go: clientConnector.Connect captures the
    caller's ctx before WithoutCancel and attaches it via
    withCancellableParent.
  - pkg/tools/mcp/oauth.go: handleUnmanagedOAuthFlow extracts the
    parent and adds a fourth select case watching its Done channel;
    returns parent.Err() so the agent loop sees a cancellable
    error and the SessionManager goroutine's deferred Unlock fires.

Tests:

  - cancellable_parent_test.go: round-trip, absent-returns-nil,
    nil-parent-no-op, and the key invariant that the local ctx
    stays independent of the parent's cancellation (only opt-in
    callers see it).
  - oauth_test.go: TestUnmanagedOAuthFlow_DriveFlow_AbortsOnParent-
    CtxCancellation asserts the new branch fires under the exact
    boundary clientConnector.Connect sets up, returns the parent's
    ctx error, and skips the token exchange.

Signed-off-by: Trung Nguyen <trung.nguyen@docker.com>
- bodyclose: close response bodies in RoundTrip test goroutines
- forbidigo: use t.Context() instead of context.Background() in tests
- gci/gofmt: reflow var block in pending_oauth.go
- testifylint require-error: switch terminal ErrorIs to require
- testifylint error-is-as: use assert.ErrorIs instead of
  assert.True(errors.Is(...))
- unconvert: drop redundant tools.ElicitationAction(...) conversion
- unparam: drop unused []byte return from httpDoStatus helper
trungutt added a commit to trungutt/docker-agent that referenced this pull request May 28, 2026
CI flags pre-existing lint issues in code introduced by c97124d
and a93ae4d (the drive-flow and callback-route commits this PR
is stacked on). Fix them here so this PR's CI is green; these
fixes are also present on extend-unmanaged-oauth-flow (docker#2896).
When that PR merges, this commit will be dropped on rebase.

- bodyclose: close response bodies in RoundTrip test goroutines
- gci/gofmt: reflow var block in pending_oauth.go
- testifylint error-is-as: use assert.ErrorIs instead of
  assert.True(errors.Is(...))
- unconvert: drop redundant tools.ElicitationAction(...) conversion
- unparam: drop unused []byte return from httpDoStatus helper

Signed-off-by: Trung Nguyen <trung.nguyen@docker.com>
@trungutt trungutt requested a review from docker-agent May 28, 2026 13:57
…uth-flow

# Conflicts:
#	pkg/tools/mcp/oauth.go
@docker-agent
Copy link
Copy Markdown

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

@trungutt trungutt marked this pull request as ready for review May 28, 2026 14:41
@trungutt trungutt requested a review from a team as a code owner May 28, 2026 14:41
Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

This PR correctly implements the new drive-flow OAuth mode with PKCE, DCR, and token exchange in-process, and the pending-OAuth registry with its channel-based waiter is race-safe. Three medium-severity issues were identified in the newly added code that are worth addressing before merge.

Comment thread pkg/tools/mcp/oauth.go
slog.DebugContext(ctx, "OAuth callback received via deeplink, exchanging code for token")
// Synthesize the {code, state} payload that
// consumeUnmanagedElicitationReply expects.
content = map[string]any{"code": cb.Code, "state": expectedState}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] State validation tautology on the deeplink callback path

When the out-of-band deeplink wins the select race, the code synthesises the payload that consumeUnmanagedElicitationReply expects by pinning state to the local expectedState:

content = map[string]any{"code": cb.Code, "state": expectedState}

consumeUnmanagedElicitationReply then extracts state from that map and does:

if subtle.ConstantTimeCompare([]byte(state), []byte(expectedState)) != 1 {
    return nil, errors.New("state mismatch in elicitation reply")
}

Because state == expectedState by construction, this comparison is always true — the ConstantTimeCompare guard is a no-op for the deeplink path. The actual state that arrived from the HTTP callback is never independently validated inside the flow.

The registry provides the primary authentication (only a POST carrying the exact state the runtime registered can deliver to the channel), so this is not directly exploitable as-is. However:

  • The per-flow check gives a false sense of security — reviewers and future maintainers may think state is re-validated from the callback, when it is not.
  • If PendingOAuthCallback ever grows a State field (e.g. for logging), the same pattern would silently skip validating it.
  • Defence-in-depth is weakened: any future change that loosens the registry's state-key strictness would remove the only real protection.

Suggestion: Either remove the ConstantTimeCompare from the deeplink branch entirely (making the code honest), or add a State field to PendingOAuthCallback, populate it from the HTTP query param in mcpOAuthCallback, and validate it here instead of synthesising the expected value.

Comment thread pkg/server/server.go

group.GET("/agents/:id/:agent_name/tools/count", s.getAgentToolCount)

group.POST("/mcp-oauth/callback", s.mcpOAuthCallback)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] /api/mcp-oauth/callback is unauthenticated when no bearer token is configured

The endpoint is registered under the group that only has BearerTokenMiddleware when authToken != "":

if authToken != "" {
    e.Use(BearerTokenMiddleware(authToken))
}

When the server runs without --auth-token (a documented/valid configuration), any network-reachable client can POST to /api/mcp-oauth/callback?state=<s>&code=<c> with no credentials.

The pending-OAuth registry provides meaningful defence: only states the runtime has generated and is currently awaiting are accepted (unknown states → 404). However, if an attacker can observe a live state value (it appears in the elicitation meta sent to the client over the session stream, at debug-level logs, and in the authorize URL the client is asked to open), they can inject a malicious authorization code. The token exchange would then proceed with an attacker-controlled code, potentially yielding the attacker's own tokens bound to the victim's session.

Suggestions:

  • This route is not a "deeplink receiver" in the same sense as the OAuth redirect URI on the authorization server — it is a programmatic API call made by the embedder. Requiring an auth token for it is reasonable even in deployments where other API calls are also authenticated.
  • At minimum, add a comment explaining the threat model (what the registry guards against, what it does not) so operators know the risk of running without an auth token.
  • Consider rate-limiting or adding an explicit warning in the docs that running without --auth-token weakens the callback endpoint's security posture.

Comment thread pkg/tools/mcp/oauth.go
userCancelCh = parentCtx.Done()
}

var content map[string]any
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Indefinite block in handleUnmanagedOAuthFlow if the elicitation goroutine hangs and the parent context never cancels

ctx is the WithoutCancel-detached context set up in clientConnector.Connect — its Done() channel never fires on its own. The select therefore has only three live cancellation signals:

Case When it fires
ctx.Done() Never (WithoutCancel)
userCancelCh Only if the embedder actively cancels the parent ctx
elicCh Only when requestElicitation returns

If the MCP client disconnects silently (TCP RST, idle timeout, process kill) and requestElicitation has no internal deadline, the goroutine and the entire select will block indefinitely. The per-session streaming lock at the SessionManager level is then held forever: all subsequent messages from that session return 409 Conflict / ErrSessionBusy until a server restart.

callbackCh (the deeplink path) also provides no relief — it will never fire if the client that was supposed to open the browser is already gone.

Suggestion: Add a context with a reasonable deadline around the elicitation wait, e.g.:

elicCtx, elicCancel := context.WithTimeout(ctx, 10*time.Minute)
defer elicCancel()

Or, if the elicitation itself must be open-ended, wrap the outer select with the same timeout so the streaming lock is released even if no reply arrives. The choice of timeout can be made configurable via RuntimeConfig.

@dgageot
Copy link
Copy Markdown
Member

dgageot commented May 28, 2026

Docs reference outdated metadata keys: cagent/type, cagent/server_url, cagent/authorize_url, cagent/state, but the code emits docker-agent/.... Embedders following the docs would look up the wrong keys — please update the docs to match the implementation before merging.

@dgageot
Copy link
Copy Markdown
Member

dgageot commented May 28, 2026

@trungutt Let's merge this one and iterate

@dgageot dgageot merged commit 4e3856b into docker:main May 28, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/mcp MCP protocol, MCP tool servers, integration area/security Authentication, authorization, secrets, vulnerabilities kind/feat PR adds a new feature (maps to feat: commit prefix)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants