fix: preserve Ollama streaming tool calls#1948
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes a bug in the Ollama streaming path where tool calls emitted in non-final chunks were dropped because only the final (done) chunk's tool_calls were inspected. Now tool calls are accumulated across all streamed chunks before being added to the final response.
Changes:
- Extract the tool-call →
types.Partconversion into a helper_convert_tool_call_to_part. - Accumulate
tool_callsacross all streamed chunks and append them on the done chunk. - Add a unit test covering tool-call accumulation when the tool call arrives before the done chunk.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| python/packages/kagent-adk/src/kagent/adk/models/_ollama.py | Refactors tool-call conversion into a helper and fixes streaming to aggregate tool calls across chunks. |
| python/packages/kagent-adk/tests/unittests/models/test_ollama.py | Adds a streaming test verifying tool calls from earlier chunks are included in the final response. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Thanks so much for the contribution, you will need to sign your commits to pass DCO before we can merge |
jmhbh
left a comment
There was a problem hiding this comment.
lgtm after commits are signed.
Signed-off-by: andreivince <andreivince21@gmail.com>
708d9fc to
740d320
Compare
|
Looks like the remaining This PR only touches the Ollama adapter and its unit test, and the Python lint/tests are green. Could someone rerun the failed build jobs? |
Closes #1922.
Ollama can emit
tool_callsbefore the finaldone=truechunk when streaming. The adapter was only readingchunk.message.tool_callsfrom the final chunk, so a real streamed tool call could turn into an empty final ADK response.I changed the streaming path to accumulate tool calls across chunks, then include them when building the final
LlmResponse. I also moved the Ollama tool-call to ADK part conversion into one helper so streaming and non-streaming stay consistent.I verified this two ways:
qwen3:0.6b. Raw Ollama emittedget_temperature({"city": "Tokyo"})ondone=false, then a finaldone=truechunk with no tool calls. Before the fix, kagent returnedfunction_calls=0; after the fix, kagent returned the expectedget_temperaturefunction call.Checks run:
uv run pytest packages/kagent-adk/tests/unittests/models/test_ollama.py -quv run pytest packages/kagent-adk/tests/unittests/models --ignore=packages/kagent-adk/tests/unittests/models/test_tls_e2e.py -quv run ruff check packages/kagent-adk/src/kagent/adk/models/_ollama.py packages/kagent-adk/tests/unittests/models/test_ollama.pyOne note: the full model test directory has unrelated local TLS E2E failures because the cert fixture files are missing in my checkout. The non-TLS model tests passed.