Remove default pruneMessages call from Think to preserve client-side tool results #1456
mattzcarey wants to merge 2 commits into main from
Conversation
The hardcoded pruneMessages({ toolCalls: "before-last-2-messages" }) call
silently strips client-side tool results from a 5-message conversation,
breaking multi-turn flows where the user's choices live in those tool
results (issue #1455). Add an overridable getPruneOptions() method that
returns the prune options, or null to skip pruning entirely. Default
behavior is unchanged.
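The commit message above describes the hook only in prose. As a rough sketch of the proposed shape (the class wiring, method signature, and `runStep` stand-in below are assumptions inferred from that description, not the real `Think` internals):

```typescript
// Sketch of the overridable hook proposed above. The names mirror the
// commit message; the inference-loop step is a simplified stand-in.
type PruneOptions = { toolCalls: "before-last-2-messages" };

class ThinkSketch {
  // Default behavior unchanged: prune tool calls before the last 2 messages.
  protected getPruneOptions(): PruneOptions | null {
    return { toolCalls: "before-last-2-messages" };
  }

  // Simplified inference-loop step: prune only when options are non-null.
  runStep(messages: string[]): string[] {
    const opts = this.getPruneOptions();
    if (opts === null) return messages; // skip pruning entirely
    return messages.slice(-2); // stand-in for pruneMessages(messages, opts)
  }
}

class NoPruneAgent extends ThinkSketch {
  // Override to preserve client-side tool results across turns.
  protected getPruneOptions(): null {
    return null;
  }
}
```

Returning `null` from a subclass skips pruning, so client-side tool results from earlier turns stay in the conversation.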
🦋 Changeset detected. Latest commit: 2aaf22b. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Feels like we need a better fix than making this configurable now when we'll likely remove it later?
I'd prefer to remove it because it bit me before. @threepointone I can't see it being a good default.
Per #1455 and review feedback: the hardcoded pruneMessages({ toolCalls: "before-last-2-messages" }) silently dropped client-side tool results across turns. Rather than add a configurable escape hatch we'll likely remove later, just stop pruning by default. truncateOlderMessages still runs, so context cost stays bounded. Agents that want aggressive pruning can opt in from beforeTurn — documented with a code snippet alongside the existing examples.
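The cost-bounding behavior mentioned above can be pictured with a minimal sketch, assuming the thresholds described in this PR (tool outputs over 500 chars and text over 10k chars, in messages older than the last 4). The `Message` shape and helper name are illustrative, not the library's real types:

```typescript
// Illustrative sketch in the spirit of truncateOlderMessages: messages
// older than the last 4 keep their role and kind, but oversized bodies
// are cut down to a cap. Shapes and names here are assumptions.
type Message = { role: string; kind: "text" | "tool-output"; body: string };

const TOOL_OUTPUT_CAP = 500; // chars, per the PR description
const TEXT_CAP = 10_000; // chars, per the PR description
const KEEP_RECENT = 4; // the last N messages are never trimmed

function truncateOlderSketch(messages: Message[]): Message[] {
  const cutoff = messages.length - KEEP_RECENT;
  return messages.map((m, i) => {
    if (i >= cutoff) return m; // recent messages pass through untouched
    const cap = m.kind === "tool-output" ? TOOL_OUTPUT_CAP : TEXT_CAP;
    return m.body.length <= cap ? m : { ...m, body: m.body.slice(0, cap) };
  });
}
```

The key property for #1455 is that no message is dropped: only body size is bounded, so client-side tool results remain visible to the model.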
Pivoted per your feedback — dropped the default
Summary
Fixes #1455.
Think hardcoded `pruneMessages({ toolCalls: "before-last-2-messages" })` in the inference loop, which silently stripped client-side tool results (no `execute`, output supplied via `addToolOutput`) from any turn beyond the second. The user's choices live in those tool results, so the conversation effectively lost its memory.

This PR drops the `pruneMessages` call from `Think._runInferenceLoop` entirely. `truncateOlderMessages` still runs (size-trimming tool outputs >500 chars and text >10k chars in messages older than the last 4), so context cost stays bounded.

What if you want the old behavior?
Apply `pruneMessages` yourself in `beforeTurn`:

Or scope it per-tool with the array form so client-side tool results survive across turns:
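A sketch of what those two snippets amount to, assuming the option shapes described in this PR; `pruneSketch` below is a simplified stand-in for `pruneMessages`, the message shape is illustrative, and `web_search` is a hypothetical server-side tool name:

```typescript
// Stand-in mimicking the described pruneMessages semantics: tool results
// in messages before the last 2 are dropped; the array form limits
// pruning to the named tools. Shapes and names are assumptions.
type Msg = { role: string; toolName?: string; isToolResult?: boolean };
type PruneOpts = { toolCalls: "before-last-2-messages" | string[] };

function pruneSketch(messages: Msg[], opts: PruneOpts): Msg[] {
  const lastTwo = messages.length - 2;
  return messages.filter((m, i) => {
    if (i >= lastTwo || !m.isToolResult) return true; // keep recent + non-tool
    if (opts.toolCalls === "before-last-2-messages") return false; // prune all
    // Array form: prune only the listed tools, so other (client-side)
    // tool results survive across turns.
    return !opts.toolCalls.includes(m.toolName ?? "");
  });
}

// What an agent opting back into aggressive pruning would do in beforeTurn:
function beforeTurnAggressive(messages: Msg[]): Msg[] {
  return pruneSketch(messages, { toolCalls: "before-last-2-messages" });
}

// Or scope pruning to specific server-side tools only:
function beforeTurnScoped(messages: Msg[]): Msg[] {
  return pruneSketch(messages, { toolCalls: ["web_search"] });
}
```

With the scoped form, a client-side tool result (e.g. a user choice) from an earlier turn is kept while bulky server-side tool output is still pruned.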
Both snippets are added to `docs/think/lifecycle-hooks.md` next to the existing `beforeTurn` examples.

Behavior change
This is a behavior change — the changeset is marked `minor` and explains the diff. Existing agents that benefited from the previous aggressive pruning will see longer model contexts; the size cap from `truncateOlderMessages` keeps the increase bounded.

Test plan
- `npm run typecheck` — all 80 projects pass
- `npm run check:exports` — all 8 packages valid
- `npx sherif` — no issues
- `oxfmt --check .` — clean
- `oxlint examples/ packages/ guides/ openai-sdk/ site/` — 0 warnings, 0 errors
- `packages/think/src/tests/think-session.test.ts` confirms client-side tool outputs (user-choice-0/1/2) all reach the model after follow-up turns