Open
Labels
bug (Something isn't working)
Description
Problem (one or two sentences)
When I use LiteLLM's auto_router (automatic routing) mode from Roo Code, it doesn't work properly, but calling auto_router directly with curl or an OpenAI script works.
Context (who is affected and when)
Anyone using LiteLLM's auto_router.
Reproduction steps
1. M1 MacBook Pro
2. litellm[proxy] 1.81.5
litellm/config.yaml:
model_list:
  - model_name: ollama-stable-code
    litellm_params:
      model: ollama/stable-code:3b-code-q4_K_M
      api_base: http://localhost:11434
    model_info:
      mode: chat
      input_cost_per_token: 0.0 # input token cost (per million)
      output_cost_per_token: 0.0 # output token cost (per million)
      tags: ["backend", "frontend", "local", "default"]
      id: local
  - model_name: ollama-bge-zh-embedding
    litellm_params:
      model: ollama/quentinz/bge-large-zh-v1.5:q4_0
      api_base: http://localhost:11434
      encoding_format: float
    model_info:
      mode: embedding
      input_cost_per_token: 0.0 # input token cost (per million)
      output_cost_per_token: 0.0 # output token cost (per million)
      tags: ["local-embedding"]
      id: local-embedding
  - model_name: "auto_router1"
    litellm_params:
      model: "auto_router/auto_router1"
      auto_router_config_path: "./router.json"
      auto_router_default_model: "ollama-stable-code"
      auto_router_embedding_model: "ollama-bge-zh-embedding"
    model_info:
      mode: chat
      input_cost_per_token: 0.0
      output_cost_per_token: 0.0
      tags: ["auto", "system"]
      id: auto
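The auto_router1 entry refers to two other entries by model_name, so a typo in either name would break routing before any request reaches Ollama. A minimal sketch of that cross-check, assuming the config has already been parsed into a dict; `find_missing_references` is a hypothetical helper and not part of LiteLLM:

```python
# Hypothetical helper, not part of LiteLLM. It only checks that the names used by
# the auto_router_* keys match a model_name defined in model_list.
REFERENCE_KEYS = ("auto_router_default_model", "auto_router_embedding_model")


def find_missing_references(config: dict) -> list[str]:
    """Return 'entry.key -> target' strings for references to undefined model names."""
    model_list = config.get("model_list", [])
    defined = {entry["model_name"] for entry in model_list}
    missing = []
    for entry in model_list:
        params = entry.get("litellm_params", {})
        for key in REFERENCE_KEYS:
            target = params.get(key)
            if target is not None and target not in defined:
                missing.append(f"{entry['model_name']}.{key} -> {target}")
    return missing


# Trimmed copy of the config above, as the dict a YAML parser would produce.
config = {
    "model_list": [
        {"model_name": "ollama-stable-code",
         "litellm_params": {"model": "ollama/stable-code:3b-code-q4_K_M"}},
        {"model_name": "ollama-bge-zh-embedding",
         "litellm_params": {"model": "ollama/quentinz/bge-large-zh-v1.5:q4_0"}},
        {"model_name": "auto_router1",
         "litellm_params": {
             "model": "auto_router/auto_router1",
             "auto_router_default_model": "ollama-stable-code",
             "auto_router_embedding_model": "ollama-bge-zh-embedding",
         }},
    ]
}

print(find_missing_references(config))
```

For the config in this report the check finds nothing missing, which suggests the names themselves are not the problem.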
litellm/router.json:
{
  "encoder_type": "ollama",
  "encoder_name": "quentinz/bge-large-zh-v1.5:q4_0",
  "routes": [
    {
      "name": "gpt-3.5-turbo",
      "utterances": ["ESLint配置", "Prettier配置", "Git Hooks", "前端CI/CD", "Monorepo配置", "TypeScript配置"],
      "description": "前端工具链",
      "score_threshold": 0.6,
      "metadata": {
        "group": "frontend",
        "priority": 80
      }
    }
  ]
}
$ curl -X POST http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer litellm" \
-d '{
"model": "auto_router1",
"messages": [{"role": "user", "content": "ESLint配置查看"}]
}'
{"id":"chatcmpl-D3wi1BwQsPZ6Y2h18UWTrvERNJdRU","created":1769833033,"model":"gpt-4.1-mini-2025-04-14","object":"chat.completion","system_fingerprint":"fp_3dcd5944f5","choices":[{"finish_reason":"stop","index":0,"message":{"content":"ESLint 配置是用于定义代码风格和代码质量检查规则的文件。你可以通过多种方式查看和管理 ESLint 的配置,下面是几种常用的方法:\n\n---\n\n### 1. 查看项目中的 ESLint 配置文件\n\nESLint 的配置文件通常放在项目根目录,常见的配置文件名有:\n\n- `.eslintrc.js`\n- `.eslintrc.json`\n- `.eslintrc.yaml` 或 `.eslintrc.yml`\n- `eslint.config.js` (ESLint 8+ 支持的配置文件)\n\n你可以直接打开这些文件查看配置内容。\n\n示例 `.eslintrc.js` 文件:\n\n```js\nmodule.exports = {\n env: {\n browser: true,\n es2021: true,\n },\n extends: [\n 'eslint:recommended',\n 'plugin:react/recommended',\n ],\n parserOptions: {\n ecmaFeatures: {\n jsx: true,\n },\n ecmaVersion: 12,\n sourceType: 'module',\n },\n plugins: [\n 'react',\n ],\n rules: {\n 'no-unused-vars': 'warn',\n 'react/prop-types': 'off',\n },\n};\n```\n\n---\n\n### 2. 使用命令行查看 ESLint 配置\n\n如果你安装了 ESLint,可以使用以下命令打印某个文件最终生效的配置规则:\n\n```bash\neslint --print-config path/to/file.js\n```\n\n这条命令会输出 ESLint 对 `path/to/file.js` 应用的完整配置,包含继承的配置和插件等,方便排查配置问题。\n\n---\n\n### 3. 在编辑器中查看配置(VSCode)\n\n如果你使用 Visual Studio Code,并安装了 ESLint 插件,可以:\n\n- 在 `.eslintrc.*` 文件中直接查看和编辑 ESLint 配置\n- 在报错的代码处查看规则说明\n- 插件会自动读取项目 ESLint 配置\n\n---\n\n### 4. 在项目 package.json 中查看\n\n有时项目会把 ESLint 配置写到 `package.json` 里:\n\n```json\n{\n \"eslintConfig\": {\n \"env\": {\n \"browser\": true,\n \"node\": true\n },\n \"extends\": \"eslint:recommended\",\n \"rules\": {\n \"semi\": [\"error\", \"always\"]\n }\n }\n}\n```\n\n---\n\n### 5. 
配置层次和规则继承说明\n\n- ESLint 会合并继承的配置,比如 `\"extends\": [\"eslint:recommended\", \"plugin:react/recommended\"]`\n- 优先级:本地规则会覆盖继承规则\n- 可以通过 `overrides` 针对不同文件设置不同规则\n\n---\n\n如果你有更具体的需求,比如想查看某个规则的含义,或者如何配置 ESLint,可以告诉我,我帮你详细解答!","role":"assistant"}}],"usage":{"completion_tokens":619,"prompt_tokens":11,"total_tokens":630,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0},"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0}}}%
$ curl -X POST http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer litellm" \
-d '{
"model": "auto_router1",
"messages": [{"role": "user", "content": "how to code a program in python"}]
}'
{"id":"chatcmpl-84430275-6549-4e65-b548-799aa5de97cb","created":1769832935,"model":"ollama/stable-code:3b-code-q4_K_M","object":"chat.completion","choices":[{"finish_reason":"stop","index":0,"message":{"content":"","role":"assistant"}}],"usage":{"completion_tokens":1,"prompt_tokens":12,"total_tokens":13}}
When I use Roo Code in Ask mode and enter "ESLint配置查看", LiteLLM returns an error. The error occurs while LiteLLM is calling the embedding model.
Expected result
The request completes without an error and returns a response message.
Actual result
The request fails with an API error.
Variations tried (optional)
Setting the API provider to either LiteLLM or OpenAI produces the same error.
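The logs below show the embedding input arriving as a list of `{'type': 'text', ...}` parts, whereas the curl requests above send `content` as a plain string. That difference may be what separates the failing and working cases. A minimal sketch for testing this without Roo Code, reusing the URL, key, and model name from the curl commands; `build_request` and `send` are hypothetical helpers, and the script has not been run against this setup:

```python
import json
import urllib.request

PROXY_URL = "http://localhost:4000/v1/chat/completions"  # taken from the curl commands
API_KEY = "litellm"  # taken from the curl commands


def build_request(prompt: str, as_parts: bool) -> urllib.request.Request:
    """Build a chat completion request with string or list-of-parts content."""
    # as_parts=True mimics the content shape seen in the logs.
    content = [{"type": "text", "text": prompt}] if as_parts else prompt
    body = {"model": "auto_router1", "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {API_KEY}"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the raw response body (needs the proxy running)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8")
```

With the proxy running, comparing `send(build_request("ESLint配置查看", as_parts=False))` against the same call with `as_parts=True` should show whether the list-shaped content alone triggers the error.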
App Version
3.46.0
API Provider (optional)
LiteLLM
Model Used (optional)
No response
Roo Code Task Links (optional)
No response
Relevant logs or errors (optional)
12:19:30 - LiteLLM Router:INFO: router.py:1664 - litellm.acompletion(model=None) Exception Internal_litellm_router API call failed. Error: litellm.BadRequestError: Invalid input for ollama embeddings. input=[[{'type': 'text', 'text': '<user_message>\nESLint配置查看\n</user_message>'}, {'type': 'text', 'text': '<environment_details>\n# VSCode Visible Files\n\n\n# VSCode Open Tabs\n\n\n# Current Time\nCurrent time in ISO 8601 UTC format: 2026-01-31T04:19:21.492Z\nUser time zone: Asia/Shanghai, UTC+8:00\n\n# Current Cost\n$0.00\n\n# Current Mode\n<slug>ask</slug>\n<name>❓ Ask</name>\n<model>auto_router1</model>\n<tool_format>native</tool_format>\n\n\n# Current Workspace Directory (/Users/fire/Desktop/p-work/projects/xgmem) Files\n.gitignore\nbuild.mjs\nDockerfile\nindex.ts\njest.config.js\npackage-lock.json\npackage.json\nREADME.md\ntsconfig.json\nsrc/\nsrc/main.ts\nsrc/application/\nsrc/application/ProjectMemoryApplication.ts\nsrc/application/services/\nsrc/application/services/EntityService.ts\nsrc/application/services/ObservationService.ts\nsrc/application/services/ProjectService.ts\nsrc/application/services/RelationService.ts\nsrc/core/\nsrc/core/interfaces/\nsrc/core/interfaces/IDocument.ts\nsrc/core/interfaces/IEntity.ts\nsrc/core/interfaces/IFilterQuery.ts\nsrc/core/interfaces/IObservation.ts\nsrc/core/interfaces/IProject.ts\nsrc/core/interfaces/IRelation.ts\nsrc/core/interfaces/IRepository.ts\nsrc/core/interfaces/IServices.ts\nsrc/core/interfaces/IStorageProvider.ts\nsrc/examples/\nsrc/examples/usage-example.ts\nsrc/infrastructure/\nsrc/infrastructure/dependency-injection/\nsrc/infrastructure/dependency-injection/ApplicationSetup.ts\nsrc/infrastructure/dependency-injection/Container.ts\nsrc/infrastructure/repositories/\nsrc/infrastructure/repositories/BaseRepository.ts\nsrc/infrastructure/repositories/EntityRepository.ts\nsrc/infrastructure/repositories/ObservationRepository.ts\nsrc/infrastructure/repositories/ProjectRepository.ts\nsrc/infrastructure/repositories/RelationRepository.ts\nsrc/infrastructure/storage/\nsrc/infrastructure/storage/FileStorageProvider.ts\nsrc/presentation/\nsrc/presentation/mcp/\nsrc/presentation/mcp/MCPServer.ts\nsrc/presentation/mcp/MCPToolHandler.ts\nYou have not created a todo list yet. Create one with `update_todo_list` if your task is complicated or involves multiple steps.\n</environment_details>'}]]No fallback model group found for original model_group=ollama-bge-zh-embedding. Fallbacks=[{'claude-opus-4-5-20251101-thinking': ['claude-opus-4-5-20251101', 'claude-sonnet-4-5-20250929-thinking', 'gemini-2.5-pro-thinking']}, {'claude-sonnet-4-5-20250929-thinking': ['claude-sonnet-4-5-20251101', 'gemini-2.5-pro-thinking']}, {'gemini-2.5-pro-thinking': ['gemini-2.5-pro', 'claude-haiku-4-5-20251001-thinking']}, {'claude-haiku-4-5-20251001-thinking': ['claude-haiku-4-5-20251001']}, {'gemini-2.5-pro': ['claude-haiku-4-5-20251001']}, {'claude-opus-4-5-20251101': ['claude-sonnet-4-5-20250929', 'gemini-2.5-pro']}, {'claude-sonnet-4-5-20250929': ['gemini-2.5-pro']}, {'gemini-2.5-flash-thinking': ['gemini-2.5-flash', 'gpt-3.5-turbo']}, {'gemini-2.5-flash': ['gpt-3.5-turbo']}, {'gpt-3.5-turbo': ['ollama-stable-code']}, {'ollama-stable-code': ['gemini-2.5-flash']}]. Received Model Group=ollama-bge-zh-embedding
Available Model Group Fallbacks=None
Error doing the fallback: litellm.BadRequestError: Invalid input for ollama embeddings. input=[[{'type': 'text', 'text': '<user_message>\nESLint配置查看\n</user_message>'}, {'type': 'text', 'text': '<environment_details>\n# VSCode Visible Files\n\n\n# VSCode Open Tabs\n\n\n# Current Time\nCurrent time in ISO 8601 UTC format: 2026-01-31T04:19:21.492Z\nUser time zone: Asia/Shanghai, UTC+8:00\n\n# Current Cost\n$0.00\n\n# Current Mode\n<slug>ask</slug>\n<name>❓ Ask</name>\n<model>auto_router1</model>\n<tool_format>native</tool_format>\n\n\n# Current Workspace Directory (/Users/fire/Desktop/p-work/projects/xgmem) Files\n.gitignore\nbuild.mjs\nDockerfile\nindex.ts\njest.config.js\npackage-lock.json\npackage.json\nREADME.md\ntsconfig.json\nsrc/\nsrc/main.ts\nsrc/application/\nsrc/application/ProjectMemoryApplication.ts\nsrc/application/services/\nsrc/application/services/EntityService.ts\nsrc/application/services/ObservationService.ts\nsrc/application/services/ProjectService.ts\nsrc/application/services/RelationService.ts\nsrc/core/\nsrc/core/interfaces/\nsrc/core/interfaces/IDocument.ts\nsrc/core/interfaces/IEntity.ts\nsrc/core/interfaces/IFilterQuery.ts\nsrc/core/interfaces/IObservation.ts\nsrc/core/interfaces/IProject.ts\nsrc/core/interfaces/IRelation.ts\nsrc/core/interfaces/IRepository.ts\nsrc/core/interfaces/IServices.ts\nsrc/core/interfaces/IStorageProvider.ts\nsrc/examples/\nsrc/examples/usage-example.ts\nsrc/infrastructure/\nsrc/infrastructure/dependency-injection/\nsrc/infrastructure/dependency-injection/ApplicationSetup.ts\nsrc/infrastructure/dependency-injection/Container.ts\nsrc/infrastructure/repositories/\nsrc/infrastructure/repositories/BaseRepository.ts\nsrc/infrastructure/repositories/EntityRepository.ts\nsrc/infrastructure/repositories/ObservationRepository.ts\nsrc/infrastructure/repositories/ProjectRepository.ts\nsrc/infrastructure/repositories/RelationRepository.ts\nsrc/infrastructure/storage/\nsrc/infrastructure/storage/FileStorageProvider.ts\nsrc/presentation/\nsrc/presentation/mcp/\nsrc/presentation/mcp/MCPServer.ts\nsrc/presentation/mcp/MCPToolHandler.ts\nYou have not created a todo list yet. Create one with `update_todo_list` if your task is complicated or involves multiple steps.\n</environment_details>'}]]No fallback model group found for original model_group=ollama-bge-zh-embedding. Fallbacks=[{'claude-opus-4-5-20251101-thinking': ['claude-opus-4-5-20251101', 'claude-sonnet-4-5-20250929-thinking', 'gemini-2.5-pro-thinking']}, {'claude-sonnet-4-5-20250929-thinking': ['claude-sonnet-4-5-20251101', 'gemini-2.5-pro-thinking']}, {'gemini-2.5-pro-thinking': ['gemini-2.5-pro', 'claude-haiku-4-5-20251001-thinking']}, {'claude-haiku-4-5-20251001-thinking': ['claude-haiku-4-5-20251001']}, {'gemini-2.5-pro': ['claude-haiku-4-5-20251001']}, {'claude-opus-4-5-20251101': ['claude-sonnet-4-5-20250929', 'gemini-2.5-pro']}, {'claude-sonnet-4-5-20250929': ['gemini-2.5-pro']}, {'gemini-2.5-flash-thinking': ['gemini-2.5-flash', 'gpt-3.5-turbo']}, {'gemini-2.5-flash': ['gpt-3.5-turbo']}, {'gpt-3.5-turbo': ['ollama-stable-code']}, {'ollama-stable-code': ['gemini-2.5-flash']}]
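The `Invalid input for ollama embeddings` error above shows the embedding call receiving a list of content parts instead of a string. One possible remedy, wherever it ends up living (client or proxy), is to flatten the parts to plain text before embedding. A minimal sketch, assuming OpenAI-style messages; `content_to_text` is a hypothetical helper, not an existing LiteLLM or Roo Code function:

```python
from typing import Union

Content = Union[str, list, None]


def content_to_text(content: Content) -> str:
    """Flatten OpenAI-style message content into a single string."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    texts = []
    for part in content:
        # Keep only text parts; other part types (e.g. images) are skipped.
        if isinstance(part, dict) and part.get("type") == "text":
            texts.append(part.get("text", ""))
        elif isinstance(part, str):
            texts.append(part)
    return "\n".join(texts)


# Same shape as the input in the log; the second part is abbreviated here.
parts = [
    {"type": "text", "text": "<user_message>\nESLint配置查看\n</user_message>"},
    {"type": "text", "text": "<environment_details>...</environment_details>"},
]
print(content_to_text(parts))
```

The flattened text would still include the environment details block, which may affect similarity scores against the short utterances in router.json.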