Add Mistral agent tool cassette regressions #2010
Merged
Summary
- `tests/providers/mistral/agent_tool_sessions.rs` with 9 live-recorded fixtures.
Bugs found and fixed
- `CompletionModel::stream` called the normal completion path and converted the final response into stream items. This PR implements real Mistral SSE chat-completions streaming using the shared OpenAI chat-completions-compatible stream state machine and records `stream: true` fixtures.
- … `required`, while Mistral uses `any`; Rig also rejected specific tool choice even though Mistral accepts the OpenAI-style function object. This PR serializes `Required` as `"any"`, supports `Auto`/`None`, and supports single specific tool choice as `{ "type": "function", "function": { "name": "..." } }`.
- `output_schema` only logged a warning. This PR maps Rig output schemas to Mistral `response_format: { type: "json_schema", json_schema: ... }` and adds a live-recorded JSON-schema cassette.
- `max_tokens` was not sent: Mistral completion requests now include `max_tokens` when configured.
- … `completion_tokens` directly instead of `total_tokens - prompt_tokens`.
- … `message_id` from the raw Mistral response id and tests assert raw `id`, `model`, `finish_reason`, and usage survive.
- `content: null` and array content (e.g. text plus thinking parts) now deserialize without failing; text parts are preserved and unsupported thinking is ignored consistently with existing behavior.
- … `tool_call_id`, so the provider now omits the optional `name` field for tool-result messages.
New cassette scenarios
- `sequential_complex_tool_calls_nonstreaming`: five ordered tools covering empty args, nested JSON, arrays, escaped strings, optional/nullable fields, and usage/tool-history checks.
- `sequential_complex_tool_calls_streaming`: real SSE streaming parity for the same long multi-turn tool loop.
- `parallel_tool_calls_single_turn_nonstreaming`: two zero-arg tools in one assistant turn.
- `parallel_tool_calls_single_turn_streaming`: real SSE streaming parity for parallel tool calls.
- `raw_stream_complex_tool_call_deltas_have_object_arguments`: raw streaming tool-call output reassembles into JSON object arguments.
- `long_history_replay_with_tool_result_continuation`: replays system/user/assistant/tool-call/tool-result history and forces no new tools.
- `tool_choice_auto_any_specific_and_none`: validates `auto`, Mistral `any`, specific function-object choice, and `none`.
- `json_object_response_format_roundtrip`: validates `response_format: { type: "json_object" }`.
- `json_schema_structured_output_roundtrip`: validates native Mistral JSON-schema response format from Rig `output_schema`.
Models used
- `mistral-small-latest` for all scenarios: current stable Mistral model with tool calling, streaming, JSON object mode, JSON-schema structured output, and low enough cost for cassette recording.
Inspiration references used
- `inspirations/vercel-ai-sdk/packages/mistral/src/mistral-prepare-tools.ts`
- `inspirations/vercel-ai-sdk/packages/mistral/src/mistral-chat-language-model.ts`
- `inspirations/vercel-ai-sdk/packages/mistral/src/mistral-chat-language-model.test.ts`
- `inspirations/vercel-ai-sdk/packages/mistral/src/convert-to-mistral-chat-messages.test.ts`
- `inspirations/pydantic-ai/pydantic_ai_slim/pydantic_ai/models/mistral.py`
- `inspirations/pydantic-ai/tests/models/test_mistral.py`
- `inspirations/pydantic-ai/tests/providers/test_mistral.py`
- `inspirations/langchain/libs/partners/mistralai/langchain_mistralai/chat_models.py`
- `inspirations/langchain/libs/partners/mistralai/tests/unit_tests/test_chat_models.py`
- `inspirations/langchain/libs/partners/mistralai/tests/integration_tests/test_chat_models.py`
- `inspirations/semantic-kernel/python/semantic_kernel/connectors/ai/mistral_ai/services/mistral_ai_chat_completion.py`
- `inspirations/semantic-kernel/python/tests/unit/connectors/ai/mistral_ai/services/test_mistralai_chat_completion.py`
Recording / replay commands run
Record:
Replay without credentials:
Other validation:
Manual cassette inspection notes
- … `tests/cassettes/mistral/agent_tool_sessions/`.
- … `/v1/chat/completions` request bodies for `tools`, `tool_choice`, `parallel_tool_calls`, `stream: true`, JSON object mode, JSON schema, and long history with assistant tool calls/tool results.
- … `finish_reason`, `model`, response ids, usage (`prompt_tokens`, `completion_tokens`, `total_tokens`, cache details), and streamed tool-call/text deltas.
- … `name` values while preserving `tool_call_id`.
Known gaps / non-goals
- `cargo test` was run and reaches the existing OpenAI cassette failure in `openai::cassette::permission_control::permission_control_streaming_example` (request body mismatch against `/v1/responses`); Mistral coverage passes before that unrelated failure.
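Illustrative sketch
For readers who want to see the tool-choice mapping from "Bugs found and fixed" in concrete form, here is a minimal, std-only sketch. It is not the code in this PR: the `ToolChoice` enum, `json_escape`, and `mistral_tool_choice_json` are hypothetical names invented for illustration, and the real provider presumably serializes through its own request types rather than hand-built strings. The only facts taken from the description are the wire shapes: `Required` becomes `"any"`, `Auto`/`None` pass through, and a single specific tool is sent as the OpenAI-style function object.

```rust
/// Hypothetical stand-in for Rig's tool-choice type; the real enum may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolChoice {
    Auto,
    None,
    Required,
    /// A single named function the model must call.
    Specific(String),
}

/// Escape the characters that must be escaped inside a JSON string.
/// (Hypothetical helper; a real implementation would use a JSON library.)
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render the `tool_choice` JSON value in the shapes the PR description lists.
pub fn mistral_tool_choice_json(choice: &ToolChoice) -> String {
    match choice {
        ToolChoice::Auto => "\"auto\"".to_string(),
        ToolChoice::None => "\"none\"".to_string(),
        // Mistral spells "must call some tool" as "any", not "required".
        ToolChoice::Required => "\"any\"".to_string(),
        // A specific tool uses the OpenAI-style function object.
        ToolChoice::Specific(name) => format!(
            "{{\"type\":\"function\",\"function\":{{\"name\":\"{}\"}}}}",
            json_escape(name)
        ),
    }
}
```

Under these assumptions, `mistral_tool_choice_json(&ToolChoice::Required)` yields the JSON string `"any"`, and `ToolChoice::Specific("get_weather".to_string())` (a made-up tool name) yields `{"type":"function","function":{"name":"get_weather"}}`.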