fix: expose terminal task_updated events as typed TaskUpdatedMessage#1016
fix: expose terminal task_updated events as typed TaskUpdatedMessage#1016maxim092001 wants to merge 1 commit into
Conversation
Background tasks sometimes finish by emitting only a generic system/task_updated message whose patch.status is terminal, with no typed TaskNotificationMessage. Consumers that track active task IDs from TaskStartedMessage and clear them only on TaskNotificationMessage then believe a finished task is still active and can hang draining the persistent stream until an outer timeout. Expose system/task_updated as a typed TaskUpdatedMessage(SystemMessage) with task_id, patch, status, session_id, uuid — mirroring the TypeScript SDK's SDKTaskUpdatedMessage. Parsing is defensive (all .get(), non-dict patch guard, status derived from patch.status) so a lifecycle event can never raise. Add TaskUpdatedStatus and a shared TERMINAL_TASK_STATUSES frozenset so consumers can clear active task IDs on a terminal status from either TaskNotificationMessage or TaskUpdatedMessage, and document the lifecycle contract on both message types. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Hey @qing-ant, could you please check it out? Thanks! |
|
Or maybe @ashwin-ant you can take a look? |
ashwin-ant
left a comment
There was a problem hiding this comment.
Thanks for digging into this, the root cause analysis is right. Terminal state for a background shell can show up as only a task_updated patch when no follow-up turn drains the notification, so typing this message is the correct fix.
One blocking issue though: the status enum doesn't match what the CLI actually sends. The set of values that can appear in patch.status is
pending | running | completed | failed | killed | paused
This PR declares TaskUpdatedStatus = Literal["running", "completed", "failed", "stopped", "cancelled"], which has a few problems:
stoppedandcancellednever appear in atask_updatedpatch.stoppedonly exists ontask_notification, where the CLI mapskilledtostoppedbefore emitting.cancelleddoesn't exist at all.killedis missing, and it's terminal. A task stopped via TaskStop emitstask_updatedwithstatus: "killed". With the proposedTERMINAL_TASK_STATUSESthat event isn't recognized as terminal, so the stale-active-task hang this PR fixes comes right back for stopped tasks. In some kill paths the notification is suppressed entirely, sotask_updatedis the only terminal signal. Easy to reproduce: start a background bash task, stop it with TaskStop, and watch the patch.pendingandpausedare missing, so a paused task's patch produces a value that violates the declared Literal.
Suggested fix:
TaskUpdatedStatus = Literal["pending", "running", "completed", "failed", "killed", "paused"]- Terminal set for
task_updatedis{completed, failed, killed}. If you want one constant shared withTaskNotificationMessage, it needs both vocabularies:{completed, failed, stopped, killed}. Either way, dropcancelled.
Two smaller things:
- I'd soften the lifecycle contract docstring. "Every TaskStartedMessage gets a typed terminal event" is a stronger guarantee than the CLI makes today.
- Current CLI builds stamp
uuidandsession_idon every drained event, so the defensive handling is fine to keep but probably reflects an older build.
The shape, tests, and backward-compat approach all look good. Happy to approve once the enum is fixed.
Fixes #1019
Summary
Background tasks can leave consumers with stale active-task state. A background Bash task sometimes finishes by emitting only a generic
system/task_updatedmessage whosepatch.statusis terminal (completed/failed/stopped), with no typedTaskNotificationMessage:Consumers that track active task IDs from
TaskStartedMessageand clear them only onTaskNotificationMessagethen believe a finished task is still active. If they use that active set to boundreceive_messages()(which is open-ended for the persistent client), they keep draining the stream until an outer timeout.The parser previously mapped
system/task_updatedto a genericSystemMessage, so terminal task state — semantically lifecycle data — was never represented by a typed lifecycle message. This breaks the contract: a typed start event (TaskStartedMessage) was not guaranteed a typed terminal event.Fix
Expose
system/task_updatedas a typedTaskUpdatedMessage(SystemMessage), mirroring the TypeScript SDK'sSDKTaskUpdatedMessage:task_iddata.task_idpatchdata.patch(full dict preserved)statuspatch.statussession_iddata.session_iduuiddata.uuidPlus:
TaskUpdatedStatus—Literal["running","completed","failed","stopped","cancelled"].TERMINAL_TASK_STATUSES— sharedfrozensetof finished statuses, so consumers can clear active task IDs on a terminal status from eitherTaskNotificationMessageorTaskUpdatedMessage.TaskUpdatedMessageandTaskNotificationMessage.After this change, a bounded drain works without hanging:
Defensive parsing
task_updatedparsing uses.get()throughout, guards a non-dictpatch, and derivesstatusfrompatch.status— it can never raise on a lifecycle event (the observed CLI payload omitsuuid/session_id). A patch carrying onlyend_time/result/erroris left non-terminal (status=None) with the fullpatchpreserved for callers that need more.Backward compatibility
TaskUpdatedMessagesubclassesSystemMessage, so existingisinstance(msg, SystemMessage)andcase SystemMessage()checks still match;subtypeanddataremain populated with the raw payload.SystemMessagesubclasses are covered structurally by theMessageunion rather than listed individually.Tests
Added parser tests for: terminal
completed; the minimal observed shape (nouuid/session_id);running/non-terminal; missingpatch;patchpresent withoutstatus(preserved verbatim); non-dict/Nonepatch(parametrized — never raises); all terminal statusescompleted/failed/stopped/cancelled(parametrized); andSystemMessagebackward-compat.ruff check,ruff format,mypy src/, and the full suite (775 passed, 5 skipped) all green.🤖 Generated with Claude Code