
Add local OpenAI-compatible backend routing#68

Open
Pro777 wants to merge 1 commit into huggingface:main from Spitfire-Cowboy:Pro777/openai-compat-local

Conversation


@Pro777 Pro777 commented Apr 23, 2026

Summary

Implements the narrow local-backend slice of #67 without pulling in the larger provider-registry work from #55 / #60.

This PR adds direct routing for these OpenAI-compatible prefixes:

  • ollama/<model>
  • lm_studio/<model>
  • vllm/<model>
  • openai-compat/<model>

Each prefix resolves to a configurable api_base + API key env var while keeping the existing Hugging Face Router fallback path unchanged.

Why this shape

#67 asks for an escape hatch to local OpenAI-compatible backends. There is already broader work in #55 / #60 around provider registries and additional hosted providers.

This PR intentionally stays smaller:

  • no new provider catalog
  • no frontend model-picker changes
  • no session-manager or restore-summary changes
  • no Rowan-specific defaults

It only teaches the current resolver and CLI validation layer about local/OpenAI-compatible prefixes.

New prefixes

  • ollama/ -> OLLAMA_BASE_URL (default http://localhost:11434/v1)
  • lm_studio/ -> LMSTUDIO_BASE_URL (default http://127.0.0.1:1234/v1)
  • vllm/ -> VLLM_BASE_URL (default http://127.0.0.1:8000/v1)
  • openai-compat/ -> OPENAI_COMPAT_BASE_URL (default http://127.0.0.1:8000/v1)

The local prefixes set placeholder default API keys (ollama, lm-studio, EMPTY) for backends that typically ignore auth, while still allowing override via env.
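The prefix-to-backend mapping described above can be sketched as follows. This is a hypothetical reconstruction, not the PR's actual code: the dict name _OPENAI_COMPAT_PROVIDERS matches the identifier cited later in the review, but the exact structure and the api-key override mechanism in llm_params.py may differ.

```python
import os

# Hypothetical sketch of the prefix -> backend mapping; env var names and
# defaults are taken from the PR description, the dict layout is assumed.
_OPENAI_COMPAT_PROVIDERS = {
    "ollama/": {"api_base_env": "OLLAMA_BASE_URL",
                "default_api_base": "http://localhost:11434/v1",
                "default_api_key": "ollama"},
    "lm_studio/": {"api_base_env": "LMSTUDIO_BASE_URL",
                   "default_api_base": "http://127.0.0.1:1234/v1",
                   "default_api_key": "lm-studio"},
    "vllm/": {"api_base_env": "VLLM_BASE_URL",
              "default_api_base": "http://127.0.0.1:8000/v1",
              "default_api_key": "EMPTY"},
    "openai-compat/": {"api_base_env": "OPENAI_COMPAT_BASE_URL",
                       "default_api_base": "http://127.0.0.1:8000/v1",
                       "default_api_key": "EMPTY"},
}

def resolve_openai_compat_params(model_name):
    """Map a prefixed model id to LiteLLM-style params, or None if no prefix matches."""
    for prefix, cfg in _OPENAI_COMPAT_PROVIDERS.items():
        if model_name.startswith(prefix):
            # Env var wins over the default; trailing slash is stripped.
            api_base = os.environ.get(cfg["api_base_env"],
                                      cfg["default_api_base"]).rstrip("/")
            return {"model": "openai/" + model_name[len(prefix):],
                    "api_base": api_base,
                    "api_key": cfg["default_api_key"]}
    return None
```

Unmatched model ids fall through to None, which is what preserves the existing Hugging Face Router fallback path.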

Tests

Ran:

uv run --with pytest pytest tests/ -v --tb=short

Result:

  • 5 passed

The new tests cover:

  • Ollama param resolution
  • LM Studio param resolution
  • vLLM param resolution
  • generic OpenAI-compatible param resolution
  • /model validation for the new local prefixes

Closes #67.

@statikkkkk

Drive-by triage from a community contributor (no commit access). Sharing one-pass review notes to help maintainers prioritize. Verify before acting.

Verdict: NEEDS-CHANGES · Risk: MED (potential SSRF) · State: Mergeable

_resolve_openai_compat_params in llm_params.py reads api_base directly from env vars with no validation of scheme or host. If this agent ever runs in a shared/multi-tenant context, a malicious OPENAI_COMPAT_BASE_URL could route LLM calls to internal metadata endpoints (SSRF). Even in single-user mode, this assumption should be documented explicitly.

Also, _DIRECT_PREFIXES in model_switcher.py is defined but is_valid_model_id already checks startswith — confirm there's no double-check divergence.

There's significant overlap with #60 (which also adds lm_studio/); coordinate before merge. Tests cover only the happy path; add a test for env-var override.

@akseljoonas
Collaborator

@claude please review this PR. Focus on correctness, security, LiteLLM/Bedrock routing impact, and whether it's safe to merge against current main. Keep it concise and prefer inline comments where it matters.


claude Bot commented Apr 24, 2026

Claude finished @akseljoonas's task in 1m 42s.


PR #68 Review: Add local OpenAI-compatible backend routing

Tasks

  • Gather context (read all changed files)
  • Analyze correctness and logic
  • Review security implications (SSRF, env-var handling)
  • Check LiteLLM/Bedrock routing impact
  • Assess merge safety vs current main

Verdict: Conditionally safe to merge

The core logic is correct and additive — no existing routing paths are touched. Two items need resolution before merge: overlap with #60, and a _DIRECT_PREFIXES duplication that will silently diverge. Everything else is low-risk or informational.


Correctness

Routing logic (llm_params.py:115–145) — clean. Prefix iteration, model stripping (model_name[len(prefix):]), and rstrip("/") on api_base are all correct. LiteLLM's openai/{model} + custom api_base is the right pattern for OpenAI-compat backends.

Effort handling — correct. extra_body is used for reasoning_effort so local servers can silently ignore it. strict=True raises UnsupportedEffortError cleanly, which the cascade in effort_probe.py:165–169 catches and advances past. Integration with the existing cascade is correct with no regressions.
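The effort behavior described above can be illustrated with a small sketch. This is not the repo's code: the helper name effort_kwargs and the _SUPPORTED_EFFORTS set are invented for illustration; only UnsupportedEffortError and the strict flag come from the review text.

```python
# Hedged sketch of the effort handling described above. The supported-effort
# set and the helper signature are assumptions; the raise-on-strict and
# extra_body pass-through behavior mirror the review's description.
class UnsupportedEffortError(ValueError):
    pass

_SUPPORTED_EFFORTS = {"low", "medium", "high"}  # illustrative set

def effort_kwargs(reasoning_effort, strict=False):
    if reasoning_effort is None:
        return {}  # no extra_body key at all, not extra_body=None
    if strict and reasoning_effort not in _SUPPORTED_EFFORTS:
        # strict mode raises so the cascade can catch and advance
        raise UnsupportedEffortError(reasoning_effort)
    # extra_body lets OpenAI-compatible servers silently ignore the field
    return {"extra_body": {"reasoning_effort": reasoning_effort}}
```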

is_valid_model_id (model_switcher.py:46–66) — this function doesn't actually use _DIRECT_PREFIXES. It only checks for two non-empty /-separated parts. That's fine — but _DIRECT_PREFIXES being a separate list from _OPENAI_COMPAT_PROVIDERS is a maintainability trap (see below).
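A minimal reconstruction of the check as the review describes it (two non-empty slash-separated parts); the actual body of is_valid_model_id in model_switcher.py may differ.

```python
# Sketch of the validation described above: accept any "provider/model"
# id whose two slash-separated parts are both non-empty.
def is_valid_model_id(model_id: str) -> bool:
    parts = model_id.split("/", 1)
    return len(parts) == 2 and all(parts)
```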


Issues

[Blocking] Overlap with #60

#60 also adds lm_studio/. Whichever merges second will conflict. Coordinate with the #60 author before merging either.

[Medium] _DIRECT_PREFIXES must be kept in sync manually

model_switcher.py:36–43 defines its own prefix list that duplicates keys from _OPENAI_COMPAT_PROVIDERS. _print_hf_routing_info uses _DIRECT_PREFIXES to skip the HF catalog lookup — if a new prefix is added to _OPENAI_COMPAT_PROVIDERS but not here, /model will print a spurious "isn't in the HF router catalog" warning. The fix is trivial:

# model_switcher.py — instead of a hardcoded tuple:
from agent.core.llm_params import _OPENAI_COMPAT_PROVIDERS
_DIRECT_PREFIXES = ("anthropic/", "openai/", *_OPENAI_COMPAT_PROVIDERS)

[Low] No URL validation on api_base (llm_params.py:127–129)

os.environ.get(config["api_base_env"], ...) flows directly into LiteLLM with no scheme/host check. In a single-user local deployment this is fine. In a multi-tenant hosted context (HF Spaces) where an operator could set env vars, a file:// or http://169.254.169.254/ value would cause the agent to make requests to internal endpoints. Worth a short docstring note that these env vars are admin-controlled and not user-controllable.
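If the project ever does want a guard here, a scheme/host check could look like the sketch below. This is illustrative only: the function name and the loopback-only default allowlist are assumptions, not part of the PR.

```python
from urllib.parse import urlparse

# Hedged sketch of an api_base guard: reject non-HTTP schemes and,
# optionally, hosts outside an allowlist. Not part of the PR under review.
_ALLOWED_SCHEMES = {"http", "https"}

def check_api_base(url, allowed_hosts=None):
    parsed = urlparse(url)
    if parsed.scheme not in _ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme in api_base: {url!r}")
    if allowed_hosts is not None and parsed.hostname not in allowed_hosts:
        raise ValueError(f"api_base host not allowed: {url!r}")
    return url
```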


Test gaps

The existing tests only exercise the happy path. These cases are uncovered:

The missing cases, and why each matters:

  • OLLAMA_BASE_URL override (custom URL): verifies the env-var override actually wins over the default
  • strict=True + unsupported effort (e.g. max on ollama): the cascade depends on UnsupportedEffortError being raised
  • no reasoning_effort arg: confirms the extra_body key is absent (not just None)
  • trailing slash on the env var URL: confirms rstrip("/") actually fires

The 5 passing tests are a reasonable baseline for a first merge, but the strict-mode path in particular is exercised only through integration and should have a unit test.


LiteLLM / Bedrock routing impact

No impact on Bedrock. The PR adds no bedrock/ prefix and doesn't touch any Bedrock-adjacent code paths.

LiteLLM routing — clean integration. openai/{model} + custom api_base is the documented LiteLLM pattern; it doesn't interfere with anthropic/, openai/, or HF router fallback paths. The _patch_litellm_effort_validation() call is unchanged.


Merge safety summary

Concern, severity, and whether it blocks merge:

  • Overlap with #60 (lm_studio/): Medium. Blocks merge: coordinate first.
  • _DIRECT_PREFIXES duplication: Medium. Does not block, but fix before it drifts.
  • SSRF via env-var api_base: Low (admin-controlled env). Does not block.
  • Test coverage gaps: Low. Does not block.
  • LiteLLM/Bedrock routing: no impact.

The implementation is solid and intentionally scoped. Resolve the #60 conflict, fix the _DIRECT_PREFIXES derivation, and this is ready.


Development

Successfully merging this pull request may close these issues.

Feature request: Allow routing to local OpenAI-compatible backends (Ollama / vLLM / LM Studio)
