feat(core): implement Runner.runLive and LlmAgent live flow #445
Implements the bidirectional (live) orchestration that was left as a TODO after google#409 (which added only the GeminiLlmConnection layer): runner.ts carried `// TODO b/425992518: Implement runLive` and LlmAgent.runLiveFlow threw `not implemented`.

- Runner.runLive: opens the invocation, runs the before/on-event/after-run plugin hooks, drives agent.runLive, and persists events while skipping raw inline-audio blobs (shouldAppendLiveEvent).
- LlmAgent.runLiveFlow: preprocess (same processors as runAsync) -> connect -> parallel send loop + receive loop -> tool execution -> sub-agent transfer. Reconnects transparently on goAway / recoverable drops when a session resumption handle is available, skipping history replay on resume.
- InvocationContext.liveRequestQueue / liveSessionResumptionHandle.
- LiveRequestQueue.get(abortSignal) to release a parked read on teardown.
- RunConfig.contextWindowCompression, forwarded to the live connect config.
- Gemini.connect: model-version-aware live API version routing and stripping of the Vertex-only sessionResumption.transparent flag on the AI Studio backend.

Adds core/test/runner/run_live_test.ts (16 tests covering realtime/audio, transcription persistence, tool calls, resumption capture, goAway reconnect, externally supplied handles, activity signals, sub-agent transfer) and liveApiVersion / transparent-strip coverage in google_llm_test.ts.
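To illustrate the `LiveRequestQueue.get(abortSignal)` item above, here is a minimal sketch of how a parked read could be released on teardown. `AbortableQueue` is a hypothetical stand-in written for this example, not the ADK class, and resolving with `undefined` on abort is an assumption; the actual implementation may signal cancellation differently.

```typescript
// Hypothetical stand-in for LiveRequestQueue; not the ADK implementation.
class AbortableQueue<T> {
  private items: T[] = [];
  private waiters: Array<(value: T | undefined) => void> = [];

  // Hand the item to a parked reader if there is one, otherwise buffer it.
  send(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter(item);
    } else {
      this.items.push(item);
    }
  }

  // Resolves with the next item, or with undefined if the signal aborts
  // before an item arrives (assumed convention for this sketch).
  get(signal?: AbortSignal): Promise<T | undefined> {
    if (this.items.length > 0) {
      return Promise.resolve(this.items.shift());
    }
    if (signal?.aborted) {
      return Promise.resolve(undefined);
    }
    return new Promise<T | undefined>((resolve) => {
      const waiter = (value: T | undefined): void => {
        signal?.removeEventListener('abort', onAbort);
        resolve(value);
      };
      const onAbort = (): void => {
        // Drop the parked reader so a later send() is not lost on it.
        const index = this.waiters.indexOf(waiter);
        if (index >= 0) {
          this.waiters.splice(index, 1);
        }
        resolve(undefined);
      };
      signal?.addEventListener('abort', onAbort, { once: true });
      this.waiters.push(waiter);
    });
  }
}
```

With this shape, a send loop can `await queue.get(controller.signal)` and exit cleanly once teardown calls `controller.abort()`, instead of staying parked on an empty queue.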
kalenkevich
left a comment
Thanks for the contribution!
I did a partial review and will continue tomorrow.
    return true;
  }
  const inlineData = parts[0].inlineData;
  if (!inlineData?.mimeType?.startsWith('audio/')) {
Why only audio/?
What about video/ and image/?
Added audio/video/image and renamed the function.
function shouldAppendLiveEvent(event: Event): boolean {
  const parts = event.content?.parts;
  if (!parts?.length) {
    return true;
  }
  const inlineData = parts[0].inlineData;
Why are we checking only the first part? We need to iterate over the parts list, and if at least one part satisfies the criteria, we should return true.
Added .some((p) => {...}) for all parts
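For reference, the all-parts check discussed in this thread could look roughly like the sketch below. The `Part` and `LiveEvent` interfaces are simplified, hypothetical stand-ins for the real ADK types, and `hasInlineMedia` is an illustrative name rather than the function in the PR.

```typescript
// Simplified, hypothetical stand-ins for the real ADK/GenAI types.
interface InlineData {
  mimeType?: string;
  data?: string;
}
interface Part {
  text?: string;
  inlineData?: InlineData;
}
interface LiveEvent {
  content?: { parts?: Part[] };
}

// Media families whose raw inline payloads should not be persisted.
const MEDIA_PREFIXES = ['audio/', 'video/', 'image/'];

// True if ANY part carries inline audio, video, or image data.
function hasInlineMedia(event: LiveEvent): boolean {
  const parts = event.content?.parts ?? [];
  return parts.some((part) => {
    const mimeType = part.inlineData?.mimeType;
    return (
      mimeType !== undefined &&
      MEDIA_PREFIXES.some((prefix) => mimeType.startsWith(prefix))
    );
  });
}
```

The caller would then negate the result, appending an event to the session only when `hasInlineMedia(event)` is false.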
    runConfig.inputAudioTranscription ??= {};
  }

  const span = tracer.startSpan('invocation');
It should be runLive, I assume.
I kept this as 'invocation' intentionally. runAsync in this same class opens its top-level span with tracer.startSpan('invocation'), and Python ADK's run_async seems to do the same (correct me if I am wrong).
An 'invocation' seems to be the ADK convention for the one span that wraps a whole run, regardless of mode, correct? Naming only runLive's span differently would make the two entry points inconsistent in traces and break any span-name-based filtering that already assumes an 'invocation'. The agent/flow-level spans underneath still distinguish live from non-live work. Happy to rename if you'd prefer runLive here, just clarifying my thought process.
  ctx,
  this,
  async function* () {
    const session = await this.sessionService.getSession({
Applied & removed the if(!session) check.
} else if (extractModelName(this.model).startsWith('gemini-2.5')) {
  this._liveApiVersion = 'v1beta';
This is not how it is done in the Python ADK: https://github.com/google/adk-python/blob/main/src/google/adk/models/google_llm.py#L379-L388
- runner.runLive: use getOrCreateSession instead of getSession + throw, matching Python ADK run_live.
- shouldAppendLiveEvent -> isLiveModelMediaEventWithInlineData: iterate all parts and skip audio/video/image inline media (mirrors Python _is_live_model_media_event_with_inline_data); negate at the call site.
- google_llm.liveApiVersion: drop the gemini-2.5 -> v1beta special case; Vertex -> v1beta1, AI Studio -> v1alpha, matching Python ADK.
- Tests updated for the new behavior plus video/image and non-first-part media coverage.
Thanks, I updated the code and re-tested successfully with our own agent.
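As a rough illustration of the backend-dependent behavior described in this PR (Vertex -> v1beta1, AI Studio -> v1alpha, and dropping the Vertex-only `transparent` flag on AI Studio), consider the sketch below. The `Backend` type, the `SessionResumption` interface, and both function names are hypothetical and exist only for this example; the actual `Gemini.connect` code is structured differently.

```typescript
// Hypothetical names for illustration; not the ADK API surface.
type Backend = 'vertex' | 'ai-studio';

interface SessionResumption {
  handle?: string;
  transparent?: boolean; // Vertex-only flag, per the PR description.
}

// Version mapping as stated in the follow-up commit message.
function liveApiVersionFor(backend: Backend): string {
  return backend === 'vertex' ? 'v1beta1' : 'v1alpha';
}

// Returns a config that is safe to send to the given backend.
function sessionResumptionFor(
  backend: Backend,
  config: SessionResumption | undefined,
): SessionResumption | undefined {
  if (config === undefined || backend === 'vertex') {
    return config;
  }
  // Copy first so the caller's RunConfig is not mutated.
  const copy: SessionResumption = { ...config };
  delete copy.transparent;
  return copy;
}
```

Copying before deleting keeps the caller's config intact, which matters if the same config object is reused for a reconnect against a different backend.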