Add Provider Abstraction with support for Hosted API Calls #2192
Conversation
Reviewer's Guide: Introduces a provider-agnostic chat abstraction and a hosted API transport (starting with OpenAI), refactors chat message handling to a structured, parts-based model, and updates file loaders, the CLI, and tests to use the new abstractions and to support hosted API models such as openai://gpt-4o-mini.
Summary of Changes: Hello @ieaves, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly refactors the chat and API interaction components of the application. The primary goal is to introduce a robust and extensible abstraction for handling various chat API providers, making it easier to add support for new services and to maintain existing ones. This involves creating a new set of structured data models for chat messages and a dedicated API transport layer, which streamlines how the application communicates with external language models.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In _req, catching Exception and then calling e.read().decode() assumes all exceptions have a .read() method, which will fail for non-HTTP exceptions (e.g., URLError); consider checking for an HTTPError type or hasattr(e, 'read') before accessing it.
- In OpenAIHostedChatProvider.build_payload, you unconditionally access payload['max_completion_tokens'] after conditionally popping it, which can raise a KeyError when max_tokens is None; guard the zero-check with an 'if "max_completion_tokens" in payload' or similar.
## Individual Comments
### Comment 1
<location> `ramalama/chat.py:62-65` </location>
<code_context>
        self.request_in_process = False
        self.prompt = args.prefix
        self.url = f"{args.url}/chat/completions"
+       self.provider = provider or OpenAIChatProvider(args.url, getattr(args, "api_key", None))
+       add_api_key(self.args)
        self.prep_rag_message()
        self.mcp_agent: LLMAgent | None = None
</code_context>
<issue_to_address>
**suggestion:** Calling add_api_key() here is a no-op and may be misleading.
add_api_key() only returns a headers dict and logs a warning; it doesn’t mutate args. Since its return value is ignored here, the call has no effect and may imply that auth is being wired when it isn’t. Please either remove this call (the provider already gets api_key) or move/refactor it so its returned headers are actually used where requests are built.
```suggestion
        self.url = f"{args.url}/chat/completions"
        self.provider = provider or OpenAIChatProvider(args.url, getattr(args, "api_key", None))
        self.prep_rag_message()
```
</issue_to_address>
### Comment 2
<location> `ramalama/chat.py:380-381` </location>
<code_context>
                response = urllib.request.urlopen(request)
                break
-            except Exception:
+            except Exception as e:
+                print(e.read().decode())
                if sys.stdout.isatty():
                    perror(f"\r{c}", end="", flush=True)
</code_context>
<issue_to_address>
**issue (bug_risk):** Unconditionally calling e.read() on a generic Exception can raise a new error and hide the original failure.
This assumes every Exception has a .read() method (like urllib.error.HTTPError), but many (e.g., connection errors, timeouts) do not, so this can raise AttributeError and hide the real error. Consider either catching HTTPError explicitly or checking hasattr(e, "read") before calling it, and for other exceptions log/print str(e) instead.
</issue_to_address>
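The guarded version can be sketched as follows (a minimal sketch; `read_error_body` and `open_with_diagnostics` are hypothetical helper names, not part of the PR):

```python
import io
import sys
import urllib.error
import urllib.request


def read_error_body(e: Exception) -> str:
    # HTTPError carries a readable response body; most other exceptions do not.
    if isinstance(e, urllib.error.HTTPError):
        return e.read().decode("utf-8", errors="replace")
    return str(e)


def open_with_diagnostics(request: urllib.request.Request):
    try:
        return urllib.request.urlopen(request)
    except Exception as e:  # broad catch mirrors the retry loop in the diff
        print(read_error_body(e), file=sys.stderr)
        raise
```

This keeps the broad `except Exception` retry behavior while surfacing the HTTP error body only when one actually exists.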
### Comment 3
<location> `ramalama/chat_utils.py:145-146` </location>
<code_context>
+        if part.detail:
+            payload["detail"] = part.detail
+        return {"type": "image_url", "image_url": payload}
+    if isinstance(part, ImageBytesPart):
+        return {"type": "image_bytes", "image_bytes": {"data": part.data, "mime_type": part.mime_type}}
+    if isinstance(part, ToolCallPart):
+        return {"type": "tool_call", "name": part.name, "arguments": dict(part.arguments)}
</code_context>
<issue_to_address>
**issue (bug_risk):** Serializing ImageBytesPart with raw bytes will break JSON encoding for OpenAI-style providers.
For ImageBytesPart, serialize_part() returns a dict containing raw bytes. When OpenAIChatProvider later calls json.dumps() on the payload, this will fail with "Object of type bytes is not JSON serializable". Please convert part.data to a JSON-serializable form (e.g., base64 string or data URL) before returning it from serialize_part().
</issue_to_address>
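One possible fix, sketched with a minimal stand-in for the real `ImageBytesPart` dataclass (only the `data` and `mime_type` fields shown in the diff are assumed):

```python
import base64
import json
from dataclasses import dataclass


@dataclass
class ImageBytesPart:  # minimal stand-in for the dataclass in chat_utils
    data: bytes
    mime_type: str


def serialize_image_bytes(part: ImageBytesPart) -> dict:
    # Encode the raw bytes as a base64 data URL so json.dumps() can serialize it
    encoded = base64.b64encode(part.data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{part.mime_type};base64,{encoded}"}}
```

Providers that understand a dedicated `image_bytes` shape could instead keep that structure and base64-encode only the `data` field.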
### Comment 4
<location> `ramalama/chat_providers/openai.py:83-87` </location>
<code_context>
+class OpenAIHostedChatProvider(OpenAIChatProvider):
+    def build_payload(self, messages: Sequence[ChatMessage], options: ChatRequestOptions) -> dict[str, object]:
+        payload = super().build_payload(messages, options)
+        max_tokens = payload.pop("max_tokens", None)
+        if max_tokens is not None:
+            payload["max_completion_tokens"] = max_tokens
+
+        if payload["max_completion_tokens"] == 0:
+            payload.pop("max_completion_tokens")
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Accessing payload["max_completion_tokens"] unconditionally can raise a KeyError when max_tokens is None.
When max_tokens is None (not present), we never set "max_completion_tokens", but we still do `if payload["max_completion_tokens"] == 0:`, which will raise a KeyError. Instead, compare the local variable (`if max_tokens == 0:`) or guard the dict access (`if "max_completion_tokens" in payload and payload["max_completion_tokens"] == 0:`).
</issue_to_address>
### Comment 5
<location> `test/unit/test_api_transport.py:10-19` </location>
<code_context>
+from ramalama.transports.api import APIProviderSpec, APITransport
+
+
+def test_api_transport_run(monkeypatch):
+    provider = APIProviderSpec("openai", "https://api.openai.com/v1")
+    transport = APITransport("gpt-4o-mini", provider)
+    recorded: dict[str, object] = {}
+
+    def fake_chat(args, operational_args=None, provider=None):
+        recorded["args"] = args
+        recorded["operational_args"] = operational_args
+        recorded["provider"] = provider
+
+    monkeypatch.setattr(api_module, "chat", fake_chat)
+
+    args = SimpleNamespace(container=True, engine="podman", url="http://localhost", model=None, api="none")
+    transport.run(args, [])
+
+    assert args.container is False
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests to cover additional APITransport behaviors such as inspect(), exists(), and provider selection
Current tests cover run(), argument mutation, and API key resolution well. To better lock in behavior and prevent regressions, please also add tests for:
- inspect(): asserting the returned dict includes provider, model, and base_url as expected.
- exists(): verifying it always returns True, with a test name that documents this as intentional.
- _build_chat_provider(): confirming that `scheme == "openai"` yields an OpenAIHostedChatProvider and unknown schemes return None.
These will make APITransport’s contract clearer and safer to evolve.
Suggested implementation:
```python
from ramalama.transports import api as api_module
from ramalama.transports.api import APIProviderSpec, APITransport, OpenAIHostedChatProvider
```
```python
def test_api_transport_ensure_exists_mutates_args():
    provider = APIProviderSpec("openai", "https://api.openai.com/v1")
    transport = APITransport("gpt-4", provider)
    args = SimpleNamespace(container=True, engine="podman", url="http://localhost", model=None, api="none")
    # ensure_exists should mutate args in the same way as run()
    transport.ensure_exists(args)
    assert args.container is False
    assert args.engine is None
    assert args.url == provider.base_url
    assert args.model == "gpt-4"
    assert args.api == provider.scheme


def test_api_transport_inspect_includes_provider_model_and_base_url():
    provider = APIProviderSpec("openai", "https://api.openai.com/v1")
    transport = APITransport("gpt-4o-mini", provider)
    info = transport.inspect()
    # Contract: inspect() returns a dict that exposes provider, model, and base_url
    assert isinstance(info, dict)
    assert info["provider"] == provider.scheme
    assert info["model"] == "gpt-4o-mini"
    assert info["base_url"] == provider.base_url


def test_api_transport_exists_always_true():
    provider = APIProviderSpec("openai", "https://api.openai.com/v1")
    transport = APITransport("gpt-4o-mini", provider)
    args = SimpleNamespace(container=True, engine="podman", url="http://localhost", model=None, api="none")
    # Contract: exists() is intentionally always True for API transports
    assert transport.exists(args) is True


def test_api_transport_build_chat_provider_openai_and_unknown_scheme():
    provider = APIProviderSpec("openai", "https://api.openai.com/v1")
    transport = APITransport("gpt-4o-mini", provider)
    # openai scheme should yield an OpenAIHostedChatProvider
    chat_provider = transport._build_chat_provider(provider)
    assert isinstance(chat_provider, OpenAIHostedChatProvider)
    # Unknown schemes should not build any provider and return None
    unknown_provider = APIProviderSpec("unknown-scheme", "https://example.com")
    assert transport._build_chat_provider(unknown_provider) is None
```
These edits assume the following existing signatures and behaviors:
1. APITransport.ensure_exists(args) exists and mutates args in the same way as run() with respect to:
- args.container (set to False)
- args.engine (set to None)
- args.url (set to provider.base_url)
- args.model (set to the transport's model)
- args.api (set to provider.scheme)
2. APITransport.inspect() returns a dict containing at least keys "provider", "model", and "base_url", where:
- "provider" is the provider.scheme
- "model" is the model string passed to APITransport
- "base_url" is provider.base_url
3. APITransport.exists(args) exists and always returns True.
4. APITransport._build_chat_provider() accepts an APIProviderSpec instance and:
- Returns an OpenAIHostedChatProvider when provider.scheme == "openai"
- Returns None for unknown schemes.
5. OpenAIHostedChatProvider is importable from ramalama.transports.api.
If any of these assumptions differ from the current implementation, you'll need to:
- Adjust the assertions in the tests to match the actual keys/values returned by inspect().
- Update the tests to match the actual signatures of ensure_exists(), exists(), and _build_chat_provider().
- Update the import of OpenAIHostedChatProvider to match its real module path or class name.
</issue_to_address>
### Comment 6
<location> `test/unit/test_file_loader_integration.py:11-12` </location>
<code_context>
from ramalama.file_loaders.file_types.txt import TXTFileLoader
+def _text_content(message):
+    return "".join(part.text for part in message.parts if isinstance(part, TextPart))
+
+
</code_context>
<issue_to_address>
**nitpick:** Use the shared _text_content helper consistently for readability and to avoid subtle inconsistencies
Some tests in this file still build message text manually (e.g., in test_chat_with_file_input_single_file) instead of calling _text_content. Please use _text_content everywhere you need the message’s text so future structure changes only require updating this helper, not each test.
</issue_to_address>
### Comment 7
<location> `ramalama/chat_utils.py:55` </location>
<code_context>
+    type: Literal["custom"] = "custom"
+
+
+MessagePart = TextPart | ImageURLPart | ImageBytesPart | ToolCallPart | ToolResultPart | CustomPart
+
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider narrowing the message/part API to only the currently used text parts and adding a simple content accessor on ChatMessage to reduce cognitive load while keeping room for future features.
You can trim a lot of surface area here without losing any currently-used behavior.
Right now only simple text messages (and possibly no multimodal/tooling yet) are exercised, but there’s a full `MessagePart` hierarchy and `serialize_part` branches for six different kinds of parts. That makes the mental model heavier for readers and reviewers.
Two concrete simplifications that keep the provider abstraction and streaming logic intact:
1) Narrow `MessagePart` and `serialize_part` to what’s actually used
If only text is currently used in `ChatMessage` and serialization, you can keep the future-proof layout but only expose/handle `TextPart` for now. The other part types can be added back when you introduce multimodal/tooling.
For example:
```python
# Keep the dataclass for future use, but make it internal / unused for now
@dataclass(slots=True)
class TextPart:
    text: str
    type: Literal["text"] = "text"


# Only expose the part(s) you actually use
MessagePart = TextPart  # | OtherPartTypesLater


def serialize_part(part: MessagePart) -> dict[str, Any]:
    # Only handle supported parts
    if isinstance(part, TextPart):
        return {"type": "text", "text": part.text}
    raise TypeError(f"Unsupported message part: {part!r}")
```
And in `__all__`:
```python
__all__ = [
    "ChatMessage",
    "MessagePart",
    "TextPart",
    "add_api_key",
    "default_prefix",
    "stream_response",
    "serialize_part",
]
```
You can keep the other part classes in the file but not include them in `MessagePart` or `__all__` until they’re actually needed. That:
- Removes unused branches from `serialize_part`.
- Reduces the public API surface the reader has to understand.
- Still keeps the file ready to grow when multimodal/tooling support is implemented.
2) Simplify `ChatMessage` usage with a `content` property
Since you always create messages with a single `TextPart`, you can make that pattern explicit and easier to read, while keeping the `parts` structure internally:
```python
@dataclass(slots=True)
class ChatMessage:
    role: RoleType
    parts: list[MessagePart] = field(default_factory=list)
    metadata: MutableMapping[str, Any] = field(default_factory=dict)

    @property
    def content(self) -> str:
        # Simple view for the current use case: first text part or empty string
        for part in self.parts:
            if isinstance(part, TextPart):
                return part.text
        return ""

    @staticmethod
    def system(text: str) -> "ChatMessage":
        return ChatMessage(role="system", parts=[TextPart(text=text)])

    @staticmethod
    def user(text: str) -> "ChatMessage":
        return ChatMessage(role="user", parts=[TextPart(text=text)])

    @staticmethod
    def assistant(text: str) -> "ChatMessage":
        return ChatMessage(role="assistant", parts=[TextPart(text=text)])

    @staticmethod
    def tool(text: str) -> "ChatMessage":
        return ChatMessage(role="tool", parts=[TextPart(text=text)])
```
Callers can then work with `message.content` instead of thinking about parts, while the internal structure remains ready for future expansion.
These two changes keep the new abstraction (provider-agnostic streaming, `ChatMessage` type, etc.) but significantly reduce the immediate cognitive load by:
- Eliminating unused branches and types from the public path.
- Presenting a simple “text message” mental model to users of the API.
</issue_to_address>
### Comment 8
<location> `ramalama/chat_providers/base.py:56` </location>
<code_context>
+ self.payload = payload
+
+
+class ChatProvider(ABC):
+ """Abstract base class for hosted chat providers."""
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the ChatProvider customization hooks by collapsing multiple header and path extension points into single overridable methods to make request construction easier to follow.
You can trim some indirection in the base provider without losing any current/extensibility benefits.
1) Collapse header hooks into one
`prepare_headers` currently orchestrates three sources:
- `self._default_headers`
- `self.provider_headers(options)`
- `self.additional_request_headers(options)`
…but all of that can be expressed as a single customization point. This keeps the same power while making it easier to see where headers come from.
Suggested refactor:
- Replace `provider_headers` and `additional_request_headers` with a single `request_headers` override.
- Wire `create_request` to call only `request_headers`.
Example:
```python
class ChatProvider(ABC):
    ...

    def prepare_headers(
        self,
        *,
        include_auth: bool = True,
        extra: Mapping[str, str] | None = None,
        options: ChatRequestOptions | None = None,
    ) -> dict[str, str]:
        headers: dict[str, str] = {"Content-Type": "application/json"}
        headers.update(self._default_headers)
        # Single customization point
        headers.update(self.request_headers(options))
        if include_auth:
            headers.update(self.auth_headers())
        if extra:
            headers.update(extra)
        return headers

    # Replace the two hooks with one
    def request_headers(self, options: ChatRequestOptions | None = None) -> Mapping[str, str]:
        return {}
```
Then simplify `create_request`:
```python
    def create_request(self, messages: Sequence[ChatMessage], options: ChatRequestOptions) -> urllib_request.Request:
        payload = self.build_payload(messages, options)
        headers = self.prepare_headers(options=options)
        body = self.serialize_payload(payload)
        return urllib_request.Request(
            self.build_url(),  # see path suggestion below
            data=body,
            headers=headers,
            method="POST",
        )
```
If a specific provider needs extra headers, it just overrides `request_headers`:
```python
class OpenAIChatProvider(ChatProvider):
    def request_headers(self, options: ChatRequestOptions | None = None) -> Mapping[str, str]:
        return {"OpenAI-Beta": "assistants=v1"}
```
2) Inline `resolve_request_path` into `build_url`
`resolve_request_path` currently just returns `self.default_path` and is only used in `create_request`. You can keep the flexibility by letting `build_url` accept an optional `options` argument and make it the single path hook.
Base class:
```python
class ChatProvider(ABC):
    default_path: str = "/chat/completions"

    def build_url(self, options: ChatRequestOptions | None = None) -> str:
        # Default behavior – subclasses can override if they need different paths
        rel = self.default_path
        if not rel.startswith("/"):
            rel = f"/{rel}"
        return f"{self.base_url}{rel}"
```
Usage in `create_request`:
```python
    def create_request(self, messages: Sequence[ChatMessage], options: ChatRequestOptions) -> urllib_request.Request:
        payload = self.build_payload(messages, options)
        headers = self.prepare_headers(options=options)
        body = self.serialize_payload(payload)
        return urllib_request.Request(
            self.build_url(options),
            data=body,
            headers=headers,
            method="POST",
        )
```
If a provider needs to change paths based on options later, it overrides `build_url(options)`:
```python
class AzureChatProvider(ChatProvider):
    def build_url(self, options: ChatRequestOptions | None = None) -> str:
        deployment = getattr(options, "deployment", "default")
        rel = f"/openai/deployments/{deployment}/chat/completions"
        return f"{self.base_url}{rel}"
```
This keeps all current behavior and extensibility, but reduces the number of indirection points a reader has to mentally track for “how do we get from options → headers → URL → request?”.
</issue_to_address>
### Comment 9
<location> `ramalama/chat_providers/openai.py:80` </location>
<code_context>
+        return payload
+
+
+class OpenAIHostedChatProvider(OpenAIChatProvider):
+    def build_payload(self, messages: Sequence[ChatMessage], options: ChatRequestOptions) -> dict[str, object]:
+        payload = super().build_payload(messages, options)
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the hosted provider payload handling and optionally inlining delta extraction logic to reduce branching and helper indirection.
You can simplify a couple of spots without changing behavior:
1) Avoid the extra `max_completion_tokens` mutation/potential KeyError
You can make the `OpenAIHostedChatProvider.build_payload` logic more direct by only setting `max_completion_tokens` when it has a meaningful value. This removes the second conditional and avoids relying on the key being present:
```python
class OpenAIHostedChatProvider(OpenAIChatProvider):
    def build_payload(
        self,
        messages: Sequence[ChatMessage],
        options: ChatRequestOptions,
    ) -> dict[str, object]:
        payload = super().build_payload(messages, options)
        max_tokens = payload.pop("max_tokens", None)
        # Only include when explicitly set and non-zero
        if max_tokens not in (None, 0):
            payload["max_completion_tokens"] = max_tokens
        return payload
```
This keeps the existing semantics:
- `max_tokens` unset → not present in payload
- `max_tokens == 0` → treated as “no limit” / removed
- `max_tokens > 0` → sent as `max_completion_tokens`
but reduces branching and the temporary inconsistent state where `max_completion_tokens` is set only to be removed.
2) Inline `_extract_delta` to reduce indirection (optional)
Given that this provider is OpenAI-specific and the JSON shape is fixed, inlining `_extract_delta` into `parse_stream_chunk` can make the flow easier to follow without losing robustness:
Current:
```python
                delta = self._extract_delta(parsed)
                if delta:
                    events.append(ChatStreamEvent(text=delta, raw=parsed))
```
Simplified inline version:
```python
                try:
                    parsed = json.loads(payload)
                except json.JSONDecodeError:
                    continue
                choices = parsed.get("choices")
                if (
                    isinstance(choices, list) and choices
                    and isinstance(choices[0], Mapping)
                ):
                    delta = choices[0].get("delta")
                    if isinstance(delta, Mapping):
                        content = delta.get("content")
                        if isinstance(content, str):
                            events.append(ChatStreamEvent(text=content, raw=parsed))
```
This removes one layer of helper indirection and keeps all the defensive checks, while making it clear that you only care about the first choice’s `delta.content` for now.
</issue_to_address>
### Comment 10
<location> `test/unit/test_file_loader_integration.py:340-343` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))
<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.
Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals
Some ways to fix this:
* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.
> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.
Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>
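The loop-to-parametrize refactor this rule suggests looks roughly like the sketch below (the input cases here are hypothetical; the real ones live in test_file_loader_integration.py):

```python
import pytest


# Each (raw, expected) pair the loop used to iterate over becomes its own
# test case, so a failure report names the exact failing input.
@pytest.mark.parametrize(
    "raw, expected",
    [
        ("hello", "hello"),
        ("  spaced  ", "spaced"),
        ("", ""),
    ],
)
def test_strips_whitespace(raw, expected):
    assert raw.strip() == expected
```

pytest then runs three independent tests instead of one test hiding a loop.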
### Comment 11
<location> `test/unit/test_file_loader_integration.py:374-379` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Avoid conditionals in tests. ([`no-conditionals-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-conditionals-in-tests))
</issue_to_address>
### Comment 12
<location> `test/unit/test_file_loader_integration.py:440-443` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))
</issue_to_address>
### Comment 13
<location> `ramalama/chat.py:152-154` </location>
<code_context>
    def _format_message_for_summary(self, msg: ChatMessage) -> str:
        text_parts = [part.text for part in msg.parts if isinstance(part, TextPart)]
        if not text_parts:
            return msg.role
        return f"{msg.role}: {' '.join(text_parts)}"
</code_context>
<issue_to_address>
**suggestion (code-quality):** We've found these issues:
- Lift code into else after jump in control flow ([`reintroduce-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/reintroduce-else/))
- Replace if statement with if expression ([`assign-if-exp`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/assign-if-exp/))
```suggestion
        return msg.role if not text_parts else f"{msg.role}: {' '.join(text_parts)}"
```
</issue_to_address>
### Comment 14
<location> `ramalama/chat_providers/base.py:33-34` </location>
<code_context>
    def to_dict(self) -> dict[str, Any]:
        keys = ["model", "temperature", "max_tokens", "stream"]
        result = {k: v for k in keys if (v := getattr(self, k)) is not None}
        result |= {} if self.extra is None else dict(self.extra)
        return result
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge dictionary updates via the union operator ([`dict-assign-update-to-union`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/dict-assign-update-to-union/))
```suggestion
        result = {k: v for k in keys if (v := getattr(self, k)) is not None} | (
            {} if self.extra is None else dict(self.extra)
        )
```
</issue_to_address>
### Comment 15
<location> `ramalama/chat_providers/base.py:86` </location>
<code_context>
    def prepare_headers(
        self,
        *,
        include_auth: bool = True,
        extra: Mapping[str, str] | None = None,
        options: ChatRequestOptions | None = None,
    ) -> dict[str, str]:
        headers: dict[str, str] = {"Content-Type": "application/json"}
        headers.update(self._default_headers)
        headers.update(self.provider_headers(options))
        if include_auth:
            headers.update(self.auth_headers())
        if extra:
            headers.update(extra)
        return headers
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge dictionary updates via the union operator ([`dict-assign-update-to-union`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/dict-assign-update-to-union/))
```suggestion
        headers |= self._default_headers
```
</issue_to_address>
### Comment 16
<location> `ramalama/chat_providers/base.py:95-97` </location>
<code_context>
    def auth_headers(self) -> Mapping[str, str]:
        if not self.api_key:
            return {}
        return {"Authorization": f"Bearer {self.api_key}"}
</code_context>
<issue_to_address>
**suggestion (code-quality):** We've found these issues:
- Lift code into else after jump in control flow ([`reintroduce-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/reintroduce-else/))
- Replace if statement with if expression ([`assign-if-exp`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/assign-if-exp/))
```suggestion
        return {} if not self.api_key else {"Authorization": f"Bearer {self.api_key}"}
```
</issue_to_address>
### Comment 17
<location> `ramalama/chat_providers/base.py:150-152` </location>
<code_context>
    def parse_response_body(self, body: bytes) -> Any:
        if not body:
            return None
        return json.loads(body.decode("utf-8"))
</code_context>
<issue_to_address>
**suggestion (code-quality):** We've found these issues:
- Lift code into else after jump in control flow ([`reintroduce-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/reintroduce-else/))
- Replace if statement with if expression ([`assign-if-exp`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/assign-if-exp/))
```suggestion
        return None if not body else json.loads(body.decode("utf-8"))
```
</issue_to_address>
### Comment 18
<location> `ramalama/chat_providers/openai.py:50-51` </location>
<code_context>
    def parse_stream_chunk(self, chunk: bytes) -> Iterable[ChatStreamEvent]:
        events: list[ChatStreamEvent] = []
        self._stream_buffer += chunk.decode("utf-8")
        while "\n\n" in self._stream_buffer:
            raw_event, self._stream_buffer = self._stream_buffer.split("\n\n", 1)
            raw_event = raw_event.strip()
            if not raw_event:
                continue
            for line in raw_event.splitlines():
                if not line.startswith("data:"):
                    continue
                payload = line[len("data:") :].strip()
                if not payload:
                    continue
                if payload == "[DONE]":
                    events.append(ChatStreamEvent(done=True))
                    continue
                try:
                    parsed = json.loads(payload)
                except json.JSONDecodeError:
                    continue
                delta = self._extract_delta(parsed)
                if delta:
                    events.append(ChatStreamEvent(text=delta, raw=parsed))
        return events
</code_context>
<issue_to_address>
**suggestion (code-quality):** Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
```suggestion
if delta := self._extract_delta(parsed):
```
</issue_to_address>
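For context on what `parse_stream_chunk` is doing, here is a standalone sketch of the same SSE-style buffering: events are separated by blank lines, `data:` payloads carry JSON deltas, and `[DONE]` terminates the stream. The function name and tuple event shape are illustrative, not part of the PR.

```python
import json

def parse_sse_events(buffer: str):
    """Split a raw SSE buffer into events, mirroring the chunk parser above.

    Returns (events, remainder) where remainder holds any incomplete event
    left in the buffer for the next chunk.
    """
    events = []
    while "\n\n" in buffer:
        raw_event, buffer = buffer.split("\n\n", 1)
        for line in raw_event.strip().splitlines():
            if not line.startswith("data:"):
                continue
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                events.append(("done", None))
            elif payload:
                try:
                    events.append(("delta", json.loads(payload)))
                except json.JSONDecodeError:
                    continue  # skip malformed payloads, as the PR code does
    return events, buffer

events, rest = parse_sse_events('data: {"text": "hi"}\n\ndata: [DONE]\n\ndata: {"partial"')
```

Keeping the unparsed remainder in the buffer is what lets the parser tolerate JSON objects split across network chunks.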
### Comment 19
<location> `ramalama/chat_providers/openai.py:71-76` </location>
<code_context>
def _serialize_message(self, message: ChatMessage) -> dict[str, Any]:
payload: dict[str, Any] = {"role": message.role}
if message.parts:
payload["content"] = [serialize_part(part) for part in message.parts]
else:
payload["content"] = ""
payload.update(message.metadata)
return payload
</code_context>
<issue_to_address>
**suggestion (code-quality):** We've found these issues:
- Merge dictionary assignment with declaration ([`merge-dict-assign`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/merge-dict-assign/))
- Replace if statement with if expression ([`assign-if-exp`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/assign-if-exp/))
- Merge dictionary updates via the union operator ([`dict-assign-update-to-union`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/dict-assign-update-to-union/))
```suggestion
payload: dict[str, Any] = {
"role": message.role,
"content": (
[serialize_part(part) for part in message.parts]
if message.parts
else ""
),
}
payload |= message.metadata
```
</issue_to_address>
### Comment 20
<location> `ramalama/chat_utils.py:115-117` </location>
<code_context>
def default_prefix() -> str:
if not EMOJI:
return "> "
if CONFIG.prefix:
return CONFIG.prefix
engine = CONFIG.engine
if engine:
if os.path.basename(engine) == "podman":
return "🦭 > "
if os.path.basename(engine) == "docker":
return "🐋 > "
return "🦙 > "
</code_context>
<issue_to_address>
**suggestion (code-quality):** Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
```suggestion
if engine := CONFIG.engine:
```
</issue_to_address>
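The named expression binds `CONFIG.engine` and tests its truthiness in one step, so the falsy case falls through to the default prefix. A runnable sketch with a minimal stand-in for the real config object:

```python
import os

class _Config:
    """Minimal stand-in for ramalama's CONFIG object."""
    engine = "/usr/bin/podman"

CONFIG = _Config()

def default_prefix() -> str:
    # Walrus operator: assignment and truthiness check in a single `if`.
    if engine := CONFIG.engine:
        base = os.path.basename(engine)
        if base == "podman":
            return "🦭 > "
        if base == "docker":
            return "🐋 > "
    return "🦙 > "

prefix = default_prefix()
```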
### Comment 21
<location> `ramalama/cli.py:1147-1150` </location>
<code_context>
def run_cli(args):
try:
# detect available port and update arguments
args.port = compute_serving_port(args)
model = New(args.MODEL, args)
model.ensure_model_exists(args)
except KeyError as e:
logger.debug(e)
try:
args.quiet = True
model = TransportFactory(args.MODEL, args, ignore_stderr=True).create_oci()
model.ensure_model_exists(args)
except Exception as exc:
raise e from exc
is_api_transport = isinstance(model, APITransport)
if args.rag and is_api_transport:
raise ValueError("ramalama run --rag is not supported for hosted API transports.")
if args.rag:
if not args.container:
raise ValueError("ramalama run --rag cannot be run with the --nocontainer option.")
args = _rag_args(args)
model = RagTransport(model, assemble_command(args.model_args), args)
model.ensure_model_exists(args)
server_cmd: list[str] = []
if not isinstance(model, APITransport):
server_cmd = assemble_command(args)
model.run(args, server_cmd)
</code_context>
<issue_to_address>
**suggestion (code-quality):** We've found these issues:
- Move setting of default value for variable into `else` branch ([`introduce-default-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/introduce-default-else/))
- Swap if/else branches ([`swap-if-else-branches`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-if-else-branches/))
- Replace if statement with if expression ([`assign-if-exp`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/assign-if-exp/))
```suggestion
server_cmd = [] if isinstance(model, APITransport) else assemble_command(args)
```
</issue_to_address>
### Comment 22
<location> `ramalama/transports/api.py:68` </location>
<code_context>
def run(self, args, server_cmd: list[str]): # pragma: no cover - exercised via CLI integration
"""Connect directly to the provider instead of launching a local server."""
args.container = False
args.engine = None
args.url = self.base_url
if not getattr(args, "api_key", None):
key = self._resolve_api_key()
if key:
args.api_key = key
args.model = self.model_name
args.api = self.provider.scheme
provider = self._build_chat_provider(args)
chat(args, provider=provider)
return 0
</code_context>
<issue_to_address>
**issue (code-quality):** Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
</issue_to_address>
### Comment 23
<location> `ramalama/transports/api.py:99-100` </location>
<code_context>
def _resolve_api_key(self) -> str | None:
resolver = PROVIDER_API_KEY_RESOLVERS.get(self.provider.scheme)
if resolver:
return resolver()
return CONFIG.api_key
</code_context>
<issue_to_address>
**suggestion (code-quality):** Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
```suggestion
if resolver := PROVIDER_API_KEY_RESOLVERS.get(self.provider.scheme):
```
</issue_to_address>
### Comment 24
<location> `ramalama/transports/transport_factory.py:51-52` </location>
<code_context>
def detect_model_model_type(self) -> tuple[type[CLASS_MODEL_TYPES], Callable[[], CLASS_MODEL_TYPES]]:
api_spec = self._match_api_provider()
if api_spec:
self._api_provider_spec = api_spec
return APITransport, self.create_api_transport
for prefix in ["huggingface://", "hf://", "hf.co/"]:
if self.model.startswith(prefix):
return Huggingface, self.create_huggingface
for prefix in ["modelscope://", "ms://"]:
if self.model.startswith(prefix):
return ModelScope, self.create_modelscope
for prefix in ["ollama://", "ollama.com/library/"]:
if self.model.startswith(prefix):
return Ollama, self.create_ollama
for prefix in ["oci://", "docker://"]:
if self.model.startswith(prefix):
return OCI, self.create_oci
if self.model.startswith("rlcr://"):
return RamalamaContainerRegistry, self.create_rlcr
for prefix in ["http://", "https://", "file://"]:
if self.model.startswith(prefix):
return URL, self.create_url
if self.transport == "huggingface":
return Huggingface, self.create_huggingface
if self.transport == "modelscope":
return ModelScope, self.create_modelscope
if self.transport == "ollama":
return Ollama, self.create_ollama
if self.transport == "rlcr":
return RamalamaContainerRegistry, self.create_rlcr
if self.transport == "oci":
return OCI, self.create_oci
raise KeyError(f'transport "{self.transport}" not supported. Must be oci, huggingface, modelscope, or ollama.')
</code_context>
<issue_to_address>
**suggestion (code-quality):** Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
```suggestion
if api_spec := self._match_api_provider():
```
</issue_to_address>
Code Review
This pull request introduces a significant and well-designed architectural improvement by abstracting chat providers. This decouples the core logic from specific API implementations and improves maintainability. The transition to structured ChatMessage objects is a great step for type safety and code clarity. The new APITransport for hosted models is also a valuable addition, and it's well-integrated into the CLI. The accompanying test updates and additions are thorough. I've identified a few potential issues related to error handling and edge cases that could improve the robustness of the new implementation, but overall, this is an excellent contribution.
Signed-off-by: Ian Eaves <[email protected]>
@rhatdan would you be willing to take a look at this? The primary goal was supporting multiple API backends with different chat implementations, but core improvements include:
@olliewalsh PTAL
Summary by Sourcery
Introduce a provider-agnostic chat abstraction and hosted API transport support while updating file loading and testing to use structured chat messages.
New Features:
Enhancements:
Tests: