Skip to content

fix(agent-core): compaction edge cases, error handling, and truncated output#120

Merged
kermanx merged 19 commits into
mainfrom
xtr/fix-compaction
May 28, 2026
Merged

fix(agent-core): compaction edge cases, error handling, and truncated output#120
kermanx merged 19 commits into
mainfrom
xtr/fix-compaction

Conversation

@kermanx
Copy link
Copy Markdown
Collaborator

@kermanx kermanx commented May 27, 2026

Summary

Reworks the compaction subsystem to fix several edge cases surfaced while testing on long sessions.

Strategy / algorithm

  • Fix computeCompactCount: removed a stray inner recentMessages++ that made the loop step by 2 and undercount recentSize. Behavior now matches the rules documented above the function (rule 6: smallest valid N within recent limits).
  • Returns 0 when no prefix can be compacted instead of throwing, with a fallback that allows compacting all messages when only the trailing message is splittable (tool / asst-no-tool-call).
  • Default config field renamed maxRecentStepsmaxRecentMessages and the algorithm now respects all three at-most limits.

Worker

  • Restored project() on the messages passed to the compaction LLM call so hook records, blocked prompts, and empty assistant placeholders no longer leak in.
  • Moved extractCompactionSummary inside the retry loop so empty summaries (APIEmptyResponseError) actually retry.
  • retryCount now only counts real retries (not the first non-retryable failure).
  • New: when the compaction response is truncated (finishReason === 'truncated'), shrink compactedCount via reduceCompactOnOverflow and retry, sharing the existing overflow handling path. Subject to MAX_COMPACTION_RETRY_ATTEMPTS.
  • tokensBefore / tokensAfter are now both estimated from the raw history (no projection, no system prompt) so they are comparable.

Lifecycle

  • New COMPACTION_UNABLE error code. begin() throws it when there is nothing to compact, so both manual (RPC rejects) and auto (turn fails) paths produce a clear signal instead of silently erroring or spinning.
  • TUI's /compact handler catches the rejection and surfaces it via showError.

Exports

  • compaction/index.ts re-exports from ./strategy (./config was deleted) so CompactionStrategy and friends remain on the public surface.

Test plan

  • pnpm -w typecheck
  • pnpm -w test (4464 passed | 25 skipped | 2 todo)
  • New unit tests for computeCompactCount edge cases (0 / 1 message, all-user, last-splittable fallback)
  • New integration tests for compaction.unable rejection on both manual and auto paths (agent-core + node-sdk)
  • New tests for deferred system reminder behavior across compaction (verifies the existing defer queue is preserved and flushes correctly once tool calls close)
  • Manual smoke: run /compact on an empty session, verify [compaction.unable] shown in TUI
  • Manual smoke: long session that triggers auto compaction, verify it completes

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 27, 2026

🦋 Changeset detected

Latest commit: 87b503f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@moonshot-ai/agent-core Patch
@moonshot-ai/kimi-code Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 27, 2026

pnpm dlx https://pkg.pr.new/@moonshot-ai/kimi-code@87b503f
npx https://pkg.pr.new/@moonshot-ai/kimi-code@87b503f

commit: 87b503f

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b112bebb31

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +211 to +212
this.begin({ source: 'auto', instruction: undefined });
return this.compacting !== null;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip auto compaction when the suffix cannot fit

When the retained suffix is already larger than the model window (for example an old exchange followed by one huge pending user prompt), starting auto compaction here cannot make the next LLM request fit because the trailing user message is preserved. The turn will compact the prefix, retry with the same oversized suffix, then compact the summary again until maxCompactionPerTurn is hit, leaving the session history rewritten to summaries and failing with a compaction-limit error instead of the original overflow. The previous beginAutoCompaction guarded against history.slice(compactedCount) exceeding max_context_tokens; that check needs to be kept before calling begin.

Useful? React with 👍 / 👎.

Comment on lines +91 to +93
if (!cannotSplitAfter(m1)) {
bestN = messages.length - recentMessages;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve whole parallel tool exchanges during compaction

When history contains a parallel tool call with multiple tool results, this can choose a split after only the first tool message because it only checks the message immediately before the split. For example, assistant(toolCalls=[a,b]), tool(a), tool(b), user(...) can compact the assistant call and tool(a) while leaving tool(b) in the retained suffix, so the next provider request contains an orphan tool result without its preceding assistant tool call. The old sliceCompleteMessages normalization prevented this; the strategy needs to reject splits inside a tool exchange before applying compaction.

Useful? React with 👍 / 👎.

Comment on lines +87 to +90
const compactedCount = this.strategy.computeCompactCount(this.agent.context.history);
if (compactedCount === 0) {
throw new KimiError(ErrorCodes.COMPACTION_UNABLE, 'No prefix that can be compacted in current history.');
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid validating historical begin records while restoring

For sessions recorded by earlier builds, a manual /compact on an empty or otherwise non-compactable history could still write full_compaction.begin before the worker cancelled. During replay AgentRecords.restore calls begin() with records.restoring set, but this new validation runs before the restoring check and throws compaction.unable, so those existing sessions can no longer resume. Restoration should replay old records without starting work or recomputing compactability, or a migration should remove/normalize those begin/cancel pairs.

Useful? React with 👍 / 👎.

Comment on lines +267 to +273
const response = await this.agent.generate(
this.agent.config.provider,
this.agent.config.systemPrompt,
[...this.agent.tools.loopTools],
messages,
undefined,
{ signal },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply completion budgets to compaction requests

Compaction still issues an LLM request, but this path now calls the raw provider directly, bypassing the resolveCompletionBudget/applyCompletionBudget wrapper that ordinary turn requests use and that compaction used before. When operators set KIMI_MODEL_MAX_COMPLETION_TOKENS or rely on loop_control.reservedContextSize for Kimi models, manual/auto compaction ignores that cap, so a large compaction prompt can consume an unbounded completion budget or hit provider truncation despite the configured limit. Build an effective provider for this call the same way the turn LLM path does.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5f6e8b7f31

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/agent-core/src/agent/compaction/strategy.ts
Comment thread packages/agent-core/src/agent/compaction/full.ts
Comment thread packages/agent-core/src/agent/compaction/full.ts
Comment thread .changeset/fix-compaction-edge-cases.md
@kermanx
Copy link
Copy Markdown
Collaborator Author

kermanx commented May 27, 2026

@codex

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74eaf747af

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

let recentSize = 0;
let bestN: number | undefined;

for (; recentMessages < messages.length; recentMessages++) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Allow compacting a single completed exchange

When the history contains exactly one completed user/assistant exchange and the user runs manual compaction, this loop never considers N === messages.length because it requires recentMessages < messages.length, and the only candidate split is after the initial user message, which is unsafe. That makes computeCompactCount([user, assistant]) return 0, so /compact/Session.compact() rejects with compaction.unable even though the whole completed exchange can be summarized into a compaction summary with an empty retained suffix. This is a common post-turn compaction case and differs from truly uncompactable histories like only pending user messages.

Useful? React with 👍 / 👎.

}
this.agent.fullCompaction.begin({ source: 'auto', instruction: undefined });
return true;
this.begin({ source: 'auto', instruction: undefined });
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep the overflow suffix check before auto-compacting

When the provider reports context overflow but the retained suffix is already too large by itself (for example, a huge pending user prompt after older history), this now starts auto-compaction unconditionally. The old guard avoided compacting in that case; without it, the worker summarizes older messages, retries with the still-oversized pending prompt, and can repeatedly compact the summary until the per-turn compaction limit is hit, wasting calls and replacing useful history even though compaction cannot make the next request fit.

Useful? React with 👍 / 👎.

Comment on lines +91 to +92
if (compactedCount === 0) {
throw new KimiError(ErrorCodes.COMPACTION_UNABLE, 'No prefix that can be compacted in current history.');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Do not throw compaction.unable for auto-compaction

When automatic compaction is triggered but the current history has no safe prefix (for example, the first user prompt alone crosses the 85% trigger threshold but still fits the model window), beginAutoCompaction() reaches this throw through begin({ source: 'auto' }). Because beforeStep() does not catch it, the turn fails before the provider is called, so large single prompts or consecutive pending user messages that could have been answered now immediately end with compaction.unable; auto-compaction should return false in this case and let the normal request/overflow path proceed.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 72758c30e2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +73 to +74
if (canSplitAfter(messages, i)) {
return i;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Return the split count for manual compaction

When /compact finds a safe split at message index i, the count passed to messages.slice(0, N) must be i + 1, not i. With a normal completed exchange like [user, assistant], this returns 1, so the compaction LLM summarizes only the user prompt and applyCompaction preserves the assistant reply outside the summary; longer histories similarly leave the message at the chosen safe split in the retained suffix instead of compacting it.

Useful? React with 👍 / 👎.

Comment on lines +134 to 137
markCompleted() {
this.agent.records.logRecord({
type: 'full_compaction.complete',
...result,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep complete records compatible with vis renderers

New full_compaction.complete records now contain only type/time, but the visual debugger still renders that record by reading compactedCount, tokensBefore, tokensAfter, and summary.length in apps/vis/web/src/components/wire/WireHeadline.tsx and WireRowDetail.tsx. Opening a newly compacted session in vis will therefore render undefined values or throw on record.summary.length; either keep the result payload on this record or update the vis renderers before emitting empty complete records.

Useful? React with 👍 / 👎.

Comment on lines +262 to +263
if (error instanceof APIContextOverflowError || error instanceof TruncatedError) {
compactedCount = this.strategy.reduceCompactOnOverflow(messagesToCompact);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Stop retrying when overflow cannot reduce the prefix

When the compaction request overflows for a prefix that has no smaller safe split, such as a single very large completed exchange, reduceCompactOnOverflow(messagesToCompact) returns the same messagesToCompact.length. This assignment then retries the exact same oversized compaction prompt until MAX_COMPACTION_RETRY_ATTEMPTS is exhausted, wasting provider calls and reporting a generic compaction failure instead of recognizing that the prefix cannot be reduced; compare the new count to the old one and stop when it does not shrink.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6429171bfd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/agent-core/src/agent/compaction/strategy.ts
Comment thread packages/agent-core/src/agent/compaction/full.ts
Comment thread packages/agent-core/src/agent/compaction/full.ts
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 87b503f2d2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/agent-core/src/agent/compaction/full.ts
@kermanx kermanx merged commit 8515472 into main May 28, 2026
6 checks passed
@kermanx kermanx deleted the xtr/fix-compaction branch May 28, 2026 08:47
@github-actions github-actions Bot mentioned this pull request May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant