Add DeepSeek agent tool cassette regressions #2009
Merged
Summary
- `tests/providers/deepseek/agent_tool_sessions.rs` with 10 live-recorded fixtures.
- `tool_choice: "required"` wire value when thinking is explicitly disabled.
Bugs found and fixed
- `null` for DeepSeek. DeepSeek chat completions expects the OpenAI-compatible chat wire format, such as `"required"`, `"none"`, or `{ "type": "function", "function": { "name": "..." } }`.
- `required`/specific tool choice (`Thinking mode does not support this tool_choice`), so Rig only sends required/specific choices when callers explicitly disable thinking via `{"thinking":{"type":"disabled"}}`; otherwise it preserves the previous safe `null` behavior used by extractor cassettes.
- `id`, `model`, `object`, and `system_fingerprint`, and public usage now includes `completion_tokens_details.reasoning_tokens`.
New scenarios
- `sequential_complex_tool_calls_nonstreaming`: four ordered tools with empty args, nested JSON, arrays, and escaping/backslashes/newlines/unicode.
- `sequential_complex_tool_calls_streaming`: streaming parity for the same four-tool loop.
- `parallel_tool_calls_single_turn_nonstreaming`: two zero-arg tools in a single assistant turn.
- `parallel_tool_calls_single_turn_streaming`: streaming parity for parallel tool calls.
- `raw_stream_complex_tool_call_deltas_have_object_arguments`: streamed fragmented tool args are reassembled as JSON objects.
- `long_history_replay_with_tool_result_continuation`: replays user/assistant/tool-call/tool-result history and forces no new tools.
- `tool_choice_required_specific_and_none`: validates required, specific-tool, and no-tools behavior.
- `reasoning_enabled_preserves_reasoning_content_deltas_and_usage`: validates `reasoning_content`, streaming reasoning deltas before text, and reasoning-token accounting.
- `chat_alias_vs_reasoner_alias_behavior`: locks down `deepseek-chat` non-reasoning vs `deepseek-reasoner` reasoning behavior.
- `json_object_response_format_roundtrip`: validates DeepSeek JSON object mode through `response_format`.
Models used
- `deepseek-v4-flash`: primary stable/inexpensive model for tool calling, streaming, JSON mode, and thinking-enabled reasoning.
- `deepseek-chat`: isolated alias behavior coverage for non-thinking chat mode.
- `deepseek-reasoner`: isolated alias behavior coverage for reasoning mode.
Inspiration references used
- `inspirations/pydantic-ai/tests/providers/test_deepseek.py`
- `inspirations/pydantic-ai/pydantic_ai_slim/pydantic_ai/providers/deepseek.py`
- `inspirations/pydantic-ai/pydantic_ai_slim/pydantic_ai/profiles/deepseek.py`
- `inspirations/vercel-ai-sdk/packages/deepseek/src/chat/deepseek-chat-language-model.ts`
- `inspirations/vercel-ai-sdk/packages/deepseek/src/chat/deepseek-chat-language-model.test.ts`
- `inspirations/vercel-ai-sdk/packages/deepseek/src/chat/deepseek-prepare-tools.ts`
- `inspirations/vercel-ai-sdk/packages/deepseek/src/chat/convert-to-deepseek-chat-messages.ts`
- `inspirations/vercel-ai-sdk/packages/deepseek/src/chat/convert-to-deepseek-usage.ts`
- `inspirations/langchain/libs/partners/deepseek/langchain_deepseek/chat_models.py`
- `inspirations/langchain/libs/partners/deepseek/tests/unit_tests/test_chat_models.py`
- `inspirations/langchain/libs/partners/deepseek/tests/integration_tests/test_chat_models.py`
Recording / replay commands run
Record:
Replay without credentials:
Other validation:
Manual cassette inspection notes
- `tests/cassettes/deepseek/agent_tool_sessions/` and the updated `streaming_tools/raw_stream_emits_required_zero_arg_tool_call.yaml`.
- `/chat/completions` request bodies for `thinking`, `tool_choice`, `parallel_tool_calls`, `response_format`, tool calls/results, stream options, and alias models.
- `finish_reason`, `model`, `id`, `system_fingerprint`, `reasoning_content`, `completion_tokens_details.reasoning_tokens`, prompt cache details, and streamed tool-call deltas.
Known gaps / non-goals
- `cargo test` gets through the new DeepSeek coverage but still fails on an unrelated existing OpenAI cassette replay mismatch in `openai::cassette::permission_control::permission_control_streaming_example`.
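Illustration of the tool-choice gating described under "Bugs found and fixed": the rule there (send a required or specific `tool_choice` only when thinking is explicitly disabled, otherwise keep `null`) can be sketched roughly as below. This is a minimal standalone sketch, not Rig's actual code: the `ToolChoice` enum, the `deepseek_tool_choice` function, and the `thinking_disabled` flag are hypothetical names, and the sketch builds the JSON text by hand to stay dependency-free. How other choices such as `"none"` are serialized is not stated above, so they are left out.

```rust
/// Hypothetical caller-facing tool choice; not Rig's real type.
/// Only the two variants discussed in the PR description are modeled.
#[derive(Debug, Clone, PartialEq)]
enum ToolChoice {
    /// The model must call some tool.
    Required,
    /// The model must call the named tool.
    Specific(String),
}

/// Returns the JSON text for the `tool_choice` request field, or `None`
/// when the field should fall back to `null`.
///
/// Assumption: `thinking_disabled` is true only when the caller explicitly
/// passed `{"thinking":{"type":"disabled"}}`.
/// Assumption: tool names are plain identifiers that need no JSON escaping.
fn deepseek_tool_choice(choice: &ToolChoice, thinking_disabled: bool) -> Option<String> {
    if !thinking_disabled {
        // Thinking mode rejects required/specific tool choices,
        // so keep the previous `null` behavior.
        return None;
    }
    match choice {
        ToolChoice::Required => Some(r#""required""#.to_string()),
        ToolChoice::Specific(name) => Some(format!(
            r#"{{"type":"function","function":{{"name":"{}"}}}}"#,
            name
        )),
    }
}
```

For example, `deepseek_tool_choice(&ToolChoice::Required, false)` yields `None` (serialized as `null`), while the same call with `true` yields the `"required"` wire value.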