[BUG] LiteLLM auto_router does not work #11132

### Problem (one or two sentences)

When I use LiteLLM's auto_router (automatic routing) model from Roo Code, requests fail, but calling auto_router directly with curl or an OpenAI client script works.

### Context (who is affected and when)

Anyone who selects a LiteLLM auto_router model in Roo Code.

### Reproduction steps

1. M1 MacBook Pro
2. litellm[proxy] 1.81.5

litellm/config.yaml:
```yaml
model_list:
  - model_name: ollama-stable-code
    litellm_params:
      model: ollama/stable-code:3b-code-q4_K_M
      api_base: http://localhost:11434
    model_info:
      mode: chat
      input_cost_per_token: 0.0  # input token cost (per million)
      output_cost_per_token: 0.0  # output token cost (per million)
      tags: ["backend", "frontend", "local", "default"]
      id: local
  - model_name: ollama-bge-zh-embedding
    litellm_params:
      model: ollama/quentinz/bge-large-zh-v1.5:q4_0
      api_base: http://localhost:11434
      encoding_format: float
    model_info:
      mode: embedding
      input_cost_per_token: 0.0  # input token cost (per million)
      output_cost_per_token: 0.0  # output token cost (per million)
      tags: ["local-embedding"]
      id: local-embedding
  - model_name: "auto_router1"
    litellm_params:
      model: "auto_router/auto_router1"
      auto_router_config_path: "./router.json"
      auto_router_default_model: "ollama-stable-code"
      auto_router_embedding_model: "ollama-bge-zh-embedding"
    model_info:
      mode: chat
      input_cost_per_token: 0.0
      output_cost_per_token: 0.0
      tags: ["auto", "system"]
      id: auto

```
litellm/router.json:
```json
{
  "encoder_type": "ollama",
  "encoder_name": "quentinz/bge-large-zh-v1.5:q4_0",
  "routes": [
    {
      "name": "gpt-3.5-turbo",
      "utterances": ["ESLint配置", "Prettier配置", "Git Hooks", "前端CI/CD", "Monorepo配置", "TypeScript配置"],
      "description": "前端工具链",
      "score_threshold": 0.6,
      "metadata": {
        "group": "frontend",
        "priority": 80
      }
    }
  ]
}
```
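For readers unfamiliar with this kind of config, here is a rough sketch of how utterance-based routing with a `score_threshold` can work in principle. This is not LiteLLM's or semantic-router's actual implementation: `cosine`, `pick_route`, and the two-dimensional toy vectors are hypothetical stand-ins for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_route(query_vec, routes, default_model):
    """Return the best route name whose score reaches its threshold, else the default.

    Each route is a dict with "name", "score_threshold" and "utterance_vecs"
    (hypothetical pre-computed embeddings of the route's utterances).
    """
    best_name, best_score = None, -1.0
    for route in routes:
        score = max(cosine(query_vec, v) for v in route["utterance_vecs"])
        if score >= route["score_threshold"] and score > best_score:
            best_name, best_score = route["name"], score
    return best_name or default_model


# Toy vectors only; real embeddings would come from the embedding model.
routes = [
    {
        "name": "gpt-3.5-turbo",
        "score_threshold": 0.6,
        "utterance_vecs": [[1.0, 0.0], [0.9, 0.1]],
    }
]

near = pick_route([0.9, 0.1], routes, "ollama-stable-code")
far = pick_route([0.0, 1.0], routes, "ollama-stable-code")
print(near)
print(far)
```

In this sketch a query close to a route's utterances resolves to the route name, and anything else falls back to the default model, which matches the two curl results shown in this issue.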
```shell
$ curl -X POST http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer litellm" \
-d '{
    "model": "auto_router1",
    "messages": [{"role": "user", "content": "ESLint配置查看"}]
}'
{"id":"chatcmpl-D3wi1BwQsPZ6Y2h18UWTrvERNJdRU","created":1769833033,"model":"gpt-4.1-mini-2025-04-14","object":"chat.completion","system_fingerprint":"fp_3dcd5944f5","choices":[{"finish_reason":"stop","index":0,"message":{"content":"ESLint 配置是用于定义代码风格和代码质量检查规则的文件。你可以通过多种方式查看和管理 ESLint 的配置，下面是几种常用的方法：\n\n---\n\n### 1. 查看项目中的 ESLint 配置文件\n\nESLint 的配置文件通常放在项目根目录，常见的配置文件名有：\n\n- `.eslintrc.js`\n- `.eslintrc.json`\n- `.eslintrc.yaml` 或 `.eslintrc.yml`\n- `eslint.config.js` （ESLint 8+ 支持的配置文件）\n\n你可以直接打开这些文件查看配置内容。\n\n示例 `.eslintrc.js` 文件：\n\n```js\nmodule.exports = {\n  env: {\n    browser: true,\n    es2021: true,\n  },\n  extends: [\n    'eslint:recommended',\n    'plugin:react/recommended',\n  ],\n  parserOptions: {\n    ecmaFeatures: {\n      jsx: true,\n    },\n    ecmaVersion: 12,\n    sourceType: 'module',\n  },\n  plugins: [\n    'react',\n  ],\n  rules: {\n    'no-unused-vars': 'warn',\n    'react/prop-types': 'off',\n  },\n};\n```\n\n---\n\n### 2. 使用命令行查看 ESLint 配置\n\n如果你安装了 ESLint，可以使用以下命令打印某个文件最终生效的配置规则：\n\n```bash\neslint --print-config path/to/file.js\n```\n\n这条命令会输出 ESLint 对 `path/to/file.js` 应用的完整配置，包含继承的配置和插件等，方便排查配置问题。\n\n---\n\n### 3. 在编辑器中查看配置（VSCode）\n\n如果你使用 Visual Studio Code，并安装了 ESLint 插件，可以：\n\n- 在 `.eslintrc.*` 文件中直接查看和编辑 ESLint 配置\n- 在报错的代码处查看规则说明\n- 插件会自动读取项目 ESLint 配置\n\n---\n\n### 4. 在项目 package.json 中查看\n\n有时项目会把 ESLint 配置写到 `package.json` 里：\n\n```json\n{\n  \"eslintConfig\": {\n    \"env\": {\n      \"browser\": true,\n      \"node\": true\n    },\n    \"extends\": \"eslint:recommended\",\n    \"rules\": {\n      \"semi\": [\"error\", \"always\"]\n    }\n  }\n}\n```\n\n---\n\n### 5. 
配置层次和规则继承说明\n\n- ESLint 会合并继承的配置，比如 `\"extends\": [\"eslint:recommended\", \"plugin:react/recommended\"]`\n- 优先级：本地规则会覆盖继承规则\n- 可以通过 `overrides` 针对不同文件设置不同规则\n\n---\n\n如果你有更具体的需求，比如想查看某个规则的含义，或者如何配置 ESLint，可以告诉我，我帮你详细解答！","role":"assistant"}}],"usage":{"completion_tokens":619,"prompt_tokens":11,"total_tokens":630,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0},"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0}}}%

$ curl -X POST http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer litellm" \
-d '{
    "model": "auto_router1",
    "messages": [{"role": "user", "content": "how to code a program in python"}]
}'
{"id":"chatcmpl-84430275-6549-4e65-b548-799aa5de97cb","created":1769832935,"model":"ollama/stable-code:3b-code-q4_K_M","object":"chat.completion","choices":[{"finish_reason":"stop","index":0,"message":{"content":"","role":"assistant"}}],"usage":{"completion_tokens":1,"prompt_tokens":12,"total_tokens":13}}%
```
When I use Roo Code in Ask mode and enter "ESLint配置查看", LiteLLM returns an error. According to the logs, the error is raised while LiteLLM calls the embedding model.
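The log in this issue shows the embedding call receiving a list of `{'type': 'text', ...}` parts rather than a plain string, while the working curl request sends `content` as a string. A minimal sketch of the kind of normalization that would make both shapes equivalent before embedding, assuming that difference is the trigger; `flatten_content` is a hypothetical helper, not a LiteLLM or Roo Code function:

```python
def flatten_content(content):
    """Return plain text for a chat message `content` field.

    `content` may be a string or a list of content parts such as
    {"type": "text", "text": "..."}; non-text parts are skipped.
    """
    if isinstance(content, str):
        return content
    texts = []
    for part in content:
        if isinstance(part, dict) and part.get("type") == "text":
            texts.append(part.get("text", ""))
    return "\n".join(texts)


# Shape seen in the error log (shortened): a list of text parts.
parts = [
    {"type": "text", "text": "<user_message>\nESLint配置查看\n</user_message>"},
    {"type": "text", "text": "<environment_details>...</environment_details>"},
]
flat = flatten_content(parts)
print(flat)
```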

### Expected result

No error; the routed model returns a response.

### Actual result

API error (see the logs below).

### Variations tried (optional)

Setting the provider to either LiteLLM or OpenAI produces the same error.
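A possible further variation is to send the same request without Roo Code, but with `content` as a list of text parts instead of a string. The sketch below only builds the request, using the proxy URL, key, and model name from the curl commands in this issue; `build_request` is a hypothetical helper, and I have not confirmed that this payload reproduces the error.

```python
import json
import urllib.request


def build_request(text, as_parts, base_url="http://localhost:4000", api_key="litellm"):
    """Build a chat completion request for the proxy.

    With as_parts=True the message content is a list of text parts,
    otherwise it is a plain string as in the curl examples.
    """
    content = [{"type": "text", "text": text}] if as_parts else text
    body = {
        "model": "auto_router1",
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_request("ESLint配置查看", as_parts=True)
payload = json.loads(req.data.decode("utf-8"))
print(req.get_method(), req.full_url)
print(payload)
# To send it against a running proxy:
#   urllib.request.urlopen(req).read()
```

Comparing the proxy's behavior for `as_parts=True` and `as_parts=False` would show whether the content shape alone is responsible.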

### App Version

3.46.0

### API Provider (optional)

LiteLLM

### Model Used (optional)

_No response_

### Roo Code Task Links (optional)

_No response_

### Relevant logs or errors (optional)

```shell
12:19:30 - LiteLLM Router:INFO: router.py:1664 - litellm.acompletion(model=None) Exception Internal_litellm_router API call failed. Error: litellm.BadRequestError: Invalid input for ollama embeddings. input=[[{'type': 'text', 'text': '<user_message>\nESLint配置查看\n</user_message>'}, {'type': 'text', 'text': '<environment_details>\n# VSCode Visible Files\n\n\n# VSCode Open Tabs\n\n\n# Current Time\nCurrent time in ISO 8601 UTC format: 2026-01-31T04:19:21.492Z\nUser time zone: Asia/Shanghai, UTC+8:00\n\n# Current Cost\n$0.00\n\n# Current Mode\n<slug>ask</slug>\n<name>❓ Ask</name>\n<model>auto_router1</model>\n<tool_format>native</tool_format>\n\n\n# Current Workspace Directory (/Users/fire/Desktop/p-work/projects/xgmem) Files\n.gitignore\nbuild.mjs\nDockerfile\nindex.ts\njest.config.js\npackage-lock.json\npackage.json\nREADME.md\ntsconfig.json\nsrc/\nsrc/main.ts\nsrc/application/\nsrc/application/ProjectMemoryApplication.ts\nsrc/application/services/\nsrc/application/services/EntityService.ts\nsrc/application/services/ObservationService.ts\nsrc/application/services/ProjectService.ts\nsrc/application/services/RelationService.ts\nsrc/core/\nsrc/core/interfaces/\nsrc/core/interfaces/IDocument.ts\nsrc/core/interfaces/IEntity.ts\nsrc/core/interfaces/IFilterQuery.ts\nsrc/core/interfaces/IObservation.ts\nsrc/core/interfaces/IProject.ts\nsrc/core/interfaces/IRelation.ts\nsrc/core/interfaces/IRepository.ts\nsrc/core/interfaces/IServices.ts\nsrc/core/interfaces/IStorageProvider.ts\nsrc/examples/\nsrc/examples/usage-example.ts\nsrc/infrastructure/\nsrc/infrastructure/dependency-injection/\nsrc/infrastructure/dependency-injection/ApplicationSetup.ts\nsrc/infrastructure/dependency-injection/Container.ts\nsrc/infrastructure/repositories/\nsrc/infrastructure/repositories/BaseRepository.ts\nsrc/infrastructure/repositories/EntityRepository.ts\nsrc/infrastructure/repositories/ObservationRepository.ts\nsrc/infrastructure/repositories/ProjectRepository.ts\nsrc/infrastructure/repositories/R
elationRepository.ts\nsrc/infrastructure/storage/\nsrc/infrastructure/storage/FileStorageProvider.ts\nsrc/presentation/\nsrc/presentation/mcp/\nsrc/presentation/mcp/MCPServer.ts\nsrc/presentation/mcp/MCPToolHandler.ts\nYou have not created a todo list yet. Create one with `update_todo_list` if your task is complicated or involves multiple steps.\n</environment_details>'}]]No fallback model group found for original model_group=ollama-bge-zh-embedding. Fallbacks=[{'claude-opus-4-5-20251101-thinking': ['claude-opus-4-5-20251101', 'claude-sonnet-4-5-20250929-thinking', 'gemini-2.5-pro-thinking']}, {'claude-sonnet-4-5-20250929-thinking': ['claude-sonnet-4-5-20251101', 'gemini-2.5-pro-thinking']}, {'gemini-2.5-pro-thinking': ['gemini-2.5-pro', 'claude-haiku-4-5-20251001-thinking']}, {'claude-haiku-4-5-20251001-thinking': ['claude-haiku-4-5-20251001']}, {'gemini-2.5-pro': ['claude-haiku-4-5-20251001']}, {'claude-opus-4-5-20251101': ['claude-sonnet-4-5-20250929', 'gemini-2.5-pro']}, {'claude-sonnet-4-5-20250929': ['gemini-2.5-pro']}, {'gemini-2.5-flash-thinking': ['gemini-2.5-flash', 'gpt-3.5-turbo']}, {'gemini-2.5-flash': ['gpt-3.5-turbo']}, {'gpt-3.5-turbo': ['ollama-stable-code']}, {'ollama-stable-code': ['gemini-2.5-flash']}]. Received Model Group=ollama-bge-zh-embedding
Available Model Group Fallbacks=None
Error doing the fallback: litellm.BadRequestError: Invalid input for ollama embeddings. input=[[{'type': 'text', 'text': '<user_message>\nESLint配置查看\n</user_message>'}, {'type': 'text', 'text': '<environment_details>\n# VSCode Visible Files\n\n\n# VSCode Open Tabs\n\n\n# Current Time\nCurrent time in ISO 8601 UTC format: 2026-01-31T04:19:21.492Z\nUser time zone: Asia/Shanghai, UTC+8:00\n\n# Current Cost\n$0.00\n\n# Current Mode\n<slug>ask</slug>\n<name>❓ Ask</name>\n<model>auto_router1</model>\n<tool_format>native</tool_format>\n\n\n# Current Workspace Directory (/Users/fire/Desktop/p-work/projects/xgmem) Files\n.gitignore\nbuild.mjs\nDockerfile\nindex.ts\njest.config.js\npackage-lock.json\npackage.json\nREADME.md\ntsconfig.json\nsrc/\nsrc/main.ts\nsrc/application/\nsrc/application/ProjectMemoryApplication.ts\nsrc/application/services/\nsrc/application/services/EntityService.ts\nsrc/application/services/ObservationService.ts\nsrc/application/services/ProjectService.ts\nsrc/application/services/RelationService.ts\nsrc/core/\nsrc/core/interfaces/\nsrc/core/interfaces/IDocument.ts\nsrc/core/interfaces/IEntity.ts\nsrc/core/interfaces/IFilterQuery.ts\nsrc/core/interfaces/IObservation.ts\nsrc/core/interfaces/IProject.ts\nsrc/core/interfaces/IRelation.ts\nsrc/core/interfaces/IRepository.ts\nsrc/core/interfaces/IServices.ts\nsrc/core/interfaces/IStorageProvider.ts\nsrc/examples/\nsrc/examples/usage-example.ts\nsrc/infrastructure/\nsrc/infrastructure/dependency-injection/\nsrc/infrastructure/dependency-injection/ApplicationSetup.ts\nsrc/infrastructure/dependency-injection/Container.ts\nsrc/infrastructure/repositories/\nsrc/infrastructure/repositories/BaseRepository.ts\nsrc/infrastructure/repositories/EntityRepository.ts\nsrc/infrastructure/repositories/ObservationRepository.ts\nsrc/infrastructure/repositories/ProjectRepository.ts\nsrc/infrastructure/repositories/RelationRepository.ts\nsrc/infrastructure/storage/\nsrc/infrastructure/storage/FileStorageProvider.ts\nsrc/present
ation/\nsrc/presentation/mcp/\nsrc/presentation/mcp/MCPServer.ts\nsrc/presentation/mcp/MCPToolHandler.ts\nYou have not created a todo list yet. Create one with `update_todo_list` if your task is complicated or involves multiple steps.\n</environment_details>'}]]No fallback model group found for original model_group=ollama-bge-zh-embedding. Fallbacks=[{'claude-opus-4-5-20251101-thinking': ['claude-opus-4-5-20251101', 'claude-sonnet-4-5-20250929-thinking', 'gemini-2.5-pro-thinking']}, {'claude-sonnet-4-5-20250929-thinking': ['claude-sonnet-4-5-20251101', 'gemini-2.5-pro-thinking']}, {'gemini-2.5-pro-thinking': ['gemini-2.5-pro', 'claude-haiku-4-5-20251001-thinking']}, {'claude-haiku-4-5-20251001-thinking': ['claude-haiku-4-5-20251001']}, {'gemini-2.5-pro': ['claude-haiku-4-5-20251001']}, {'claude-opus-4-5-20251101': ['claude-sonnet-4-5-20250929', 'gemini-2.5-pro']}, {'claude-sonnet-4-5-20250929': ['gemini-2.5-pro']}, {'gemini-2.5-flash-thinking': ['gemini-2.5-flash', 'gpt-3.5-turbo']}, {'gemini-2.5-flash': ['gpt-3.5-turbo']}, {'gpt-3.5-turbo': ['ollama-stable-code']}, {'ollama-stable-code': ['gemini-2.5-flash']}]
```
