Revert "fix: spool large mcp media to disk"#2893
Conversation
This reverts commit 6dd0262.
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
This PR reverts the MCP media spool-to-disk feature. The revert is clean and intentional, but re-introduces two behavioural regressions worth noting.
| return ".bin" | ||
| // encodeMedia re-encodes raw bytes (as decoded by the MCP SDK) back to base64 | ||
| // for our internal MediaContent representation. | ||
| func encodeMedia(data []byte, mimeType string) tools.MediaContent { |
There was a problem hiding this comment.
[MEDIUM] Memory pressure regression: large MCP media always held in-memory as base64
The reverted encodeMedia unconditionally base64-encodes all MCP media payloads — no matter how large — directly into a string field held in memory for the lifetime of the session. The spool-to-disk fix (commit 6dd0262) was introduced specifically to avoid this: it kept payloads ≤ 256 KB inline and wrote larger ones to a temp file.
With this revert, a single MCP tool response containing, e.g., a 10 MB screenshot will allocate ~13 MB of base64 in RAM, duplicated across the session history and the TUI. Multiple such responses compound quickly.
If the spool approach is being dropped intentionally, consider at minimum documenting the decision or adding a size cap/warning log for unexpectedly large payloads.
| }, | ||
| }) | ||
| } | ||
| parts = append(parts, chat.MessagePart{ |
There was a problem hiding this comment.
[MEDIUM] Missing empty-data guard may produce malformed data URIs
The previous buildMultiContent had a case img.Data != "": guard that silently skipped any MediaContent with an empty Data field. The restored code unconditionally builds a MessagePart for every image:
URL: "data:" + img.MimeType + ";base64," + img.Data,If img.Data is empty (e.g., an MCP server returns an ImageContent with a zero-length Data slice — base64.StdEncoding.EncodeToString(nil) returns ""), the resulting URL is "data:image/png;base64," — an invalid data URI that will be sent verbatim to the LLM API. Most LLM clients will return an error or silently fail on such a URL.
Since encodeMedia always calls base64.StdEncoding.EncodeToString, this can only trigger when an MCP server sends an empty ImageContent.Data, but the old guard handled that defensively. Consider adding back a check:
if img.Data != "" {
parts = append(parts, ...)
}| // FilePath is an optional local file containing the decoded media payload. | ||
| FilePath string `json:"filePath,omitempty"` | ||
| // Data is the base64-encoded payload. | ||
| Data string `json:"data"` |
There was a problem hiding this comment.
[LOW] Removing omitempty from MediaContent.Data changes JSON serialisation
The field tag changed from json:"data,omitempty" to json:"data". When Data is an empty string, the old tag would omit the field entirely from JSON output; the new tag always emits "data":"".
In the current code path encodeMedia always populates Data, so this is unlikely to matter in practice. However, any code that constructs a MediaContent with an empty Data (e.g. tests, future zero-value defaults) will now emit the extra field, which may surprise consumers or break strict-equality JSON assertions.
This reverts commit 6dd0262.