fix: harden OpenAI attachment recovery#7004
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the OpenAI integration by improving the reliability of image attachment processing and error recovery. It addresses issues where image URLs might fail to load or be rejected by the API, by converting them into data URLs and implementing a fallback mechanism for 'INVALID_ATTACHMENT' errors. This ensures a more stable user experience when interacting with vision models, while also adding extensive test coverage for different image handling scenarios. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
_resolve_image_partthe HTTP branch opens the image twice (once via_is_valid_image_file, once viaencode_image_bs64, which itself re-opens the file for mime detection and encoding); consider consolidating validation/mime detection/encoding into a single helper to avoid duplicate I/O and simplify error handling. - The
file://and local-path branches in_resolve_image_partdon’t perform the_is_valid_image_filecheck that the HTTP branch does; it may be worth aligning these paths so malformed or non-image files are handled consistently regardless of their origin.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_resolve_image_part` the HTTP branch opens the image twice (once via `_is_valid_image_file`, once via `encode_image_bs64`, which itself re-opens the file for mime detection and encoding); consider consolidating validation/mime detection/encoding into a single helper to avoid duplicate I/O and simplify error handling.
- The `file://` and local-path branches in `_resolve_image_part` don’t perform the `_is_valid_image_file` check that the HTTP branch does; it may be worth aligning these paths so malformed or non-image files are handled consistently regardless of their origin.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="196-203" />
<code_context>
+
+ @staticmethod
+ def _file_uri_to_path(file_uri: str) -> str:
+ parsed = urlparse(file_uri)
+ if parsed.scheme != "file":
+ return file_uri
+
+ path = unquote(parsed.path or "")
+ if re.match(r"^/[A-Za-z]:/", path):
+ path = path[1:]
+ if parsed.netloc and parsed.netloc != "localhost":
+ path = f"//{parsed.netloc}{path}"
+ return str(Path(path))
</code_context>
<issue_to_address>
**issue:** File-URI to path conversion may mishandle some Windows-style `file://` URIs.
For URIs like `file://C:/path/to/file` (emitted by some Windows tools), `urlparse` sets `netloc='C'` and `path='/path/to/file'`, so this code treats them as UNC and returns `//C/path/to/file` instead of `C:/path/to/file`. Because the drive-letter regex only runs on `parsed.path`, this case never hits that branch. Consider first normalizing `file://<drive_letter>:/...` to `C:/...` (e.g. by special-casing `scheme=='file'` with a single-letter `netloc` and `path` starting with `:/`) before applying the UNC handling.
</issue_to_address>
### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="141" />
<code_context>
return True
return False
+ def _is_invalid_attachment_error(self, error: Exception) -> bool:
+ body = getattr(error, "body", None)
+ if isinstance(body, dict):
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting small helper functions for error parsing, image loading, and safe image opening to flatten nested logic and remove duplication while preserving behavior.
You can keep all new behavior while shaving off some of the incidental complexity by extracting tiny helpers where you already have duplication and deep nesting.
### 1. Flatten `_is_invalid_attachment_error`
You’re doing essentially the same matching twice (from `error.body` and from string candidates), and the body path is heavily nested. A small helper to normalize the error dict and a single list of predicates makes the method much flatter:
```python
def _get_error_info(self, error: Exception) -> tuple[str | None, str | None]:
body = getattr(error, "body", None)
if isinstance(body, dict):
err_obj = body.get("error")
if isinstance(err_obj, dict):
code = err_obj.get("code")
message = err_obj.get("message")
if isinstance(code, str):
code = code.lower()
if isinstance(message, str):
message = message.lower()
return code, message
return None, None
def _is_invalid_attachment_error(self, error: Exception) -> bool:
code, message = self._get_error_info(error)
candidates = [
c.lower()
for c in self._extract_error_text_candidates(error)
]
if message:
candidates.append(message)
for text in candidates:
if "invalid_attachment" in text:
return True
if "download attachment" in text and "404" in text:
return True
return code == "invalid_attachment"
```
This keeps all existing rules but removes nested `if isinstance(...)` chains and duplicate `any(...)` scans.
### 2. Split URL resolution from payload construction in `_resolve_image_part`
Right now `_resolve_image_part` both decides how to load the image and builds the final `{"type": "image_url"}` structure, plus you repeat `encode_image_bs64`/validation logic. A tiny loader helper keeps responsibilities separate:
```python
async def _load_image_data(self, image_url: str) -> str | None:
if image_url.startswith("http"):
image_path = await download_image_by_url(image_url)
if not self._is_valid_image_file(image_path):
logger.warning("图片 URL %s 下载结果不是有效图片,将忽略。", image_url)
return None
elif image_url.startswith("file://"):
image_path = self._file_uri_to_path(image_url)
else:
image_path = image_url
return await self.encode_image_bs64(image_path)
async def _resolve_image_part(
self,
image_url: str,
*,
image_detail: str | None = None,
) -> dict | None:
if image_url.startswith("data:"):
image_payload = {"url": image_url}
else:
image_data = await self._load_image_data(image_url)
if not image_data:
logger.warning("图片 %s 得到的结果为空,将忽略。", image_url)
return None
image_payload = {"url": image_data}
if image_detail:
image_payload["detail"] = image_detail
return {"type": "image_url", "image_url": image_payload}
```
This keeps behavior identical (including `file://` handling and validity check) but flattens `_resolve_image_part` and removes duplicated branches.
### 3. Reduce duplication in `_is_valid_image_file` / `_detect_image_mime_type`
Both functions open the image with PIL and catch the same exceptions. Centralizing that into a safe opener helper reduces code and mental overhead:
```python
@staticmethod
def _safe_open_image(image_path: str) -> PILImage.Image | None:
try:
return PILImage.open(image_path)
except (FileNotFoundError, OSError, UnidentifiedImageError):
return None
@staticmethod
def _is_valid_image_file(image_path: str) -> bool:
image = AstrBotOpenAIProvider._safe_open_image(image_path)
if not image:
return False
try:
image.verify()
except Exception:
return False
return True
@staticmethod
def _detect_image_mime_type(image_path: str) -> str:
image = AstrBotOpenAIProvider._safe_open_image(image_path)
if not image:
return "image/jpeg"
image_format = str(image.format or "").upper()
return {
"JPEG": "image/jpeg",
"PNG": "image/png",
"GIF": "image/gif",
"WEBP": "image/webp",
"BMP": "image/bmp",
}.get(image_format, "image/jpeg")
```
The external behavior (defaulting to `image/jpeg` and treating invalid files as invalid) stays the same, but the exception handling is no longer duplicated.
These focused extractions keep all of your new functionality (better MIME detection, `file://` handling, invalid-attachment fallback, and image pre-materialization) while making the code flatter and easier to reason about.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request significantly enhances image handling for OpenAI API calls by introducing a comprehensive mechanism to resolve various image URLs (HTTP, file, data URIs) into base64 encoded data within the chat context. Key changes include new utility functions for image validation, MIME type detection, and file URI to path conversion. It also implements robust error handling for 'invalid_attachment' errors, allowing the system to retry requests with text-only content if image processing fails. The review suggests an improvement to the _detect_image_mime_type method to use PILImage.MIME for more accurate and comprehensive MIME type detection, rather than a limited dictionary, to ensure better compatibility and correctness.
| "JPEG": "image/jpeg", | ||
| "PNG": "image/png", | ||
| "GIF": "image/gif", | ||
| "WEBP": "image/webp", | ||
| "BMP": "image/bmp", | ||
| }.get(image_format, "image/jpeg") | ||
|
|
There was a problem hiding this comment.
The _detect_image_mime_type method uses a limited dictionary to map image formats to MIME types and defaults to image/jpeg for unrecognized formats. While _is_valid_image_file ensures the file is an image, using a more comprehensive approach like PILImage.MIME would provide more accurate MIME types for a wider range of image formats, improving correctness and compatibility with APIs that might be strict about MIME types. For example, a TIFF image would currently be incorrectly identified as image/jpeg.
return PILImage.MIME.get(image_format, "image/jpeg")|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="1166-1167" />
<code_context>
- with open(image_url, "rb") as f:
- image_bs64 = base64.b64encode(f.read()).decode("utf-8")
- return "data:image/jpeg;base64," + image_bs64
+ image_data = self._encode_image_file_to_data_url(image_url)
+ return image_data or ""
async def terminate(self):
</code_context>
<issue_to_address>
**issue (bug_risk):** `encode_image_bs64` now silently returns an empty string on failure, which changes its error semantics and may hide issues.
This change shifts `encode_image_bs64` from raising on unreadable/missing files to returning `""` via `_encode_image_file_to_data_url` returning `None`. Callers that don’t explicitly treat an empty string as failure may proceed as if they have a valid image, especially if they previously relied on exceptions or `None` checks. Consider preserving the prior behavior (propagate the exception) or returning `None` instead of `""` so failures remain explicit and easier to handle correctly.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The new image helpers (
_encode_image_file_to_data_urlvsencode_image_bs64) now duplicate format-detection and MIME-type logic; consider havingencode_image_bs64delegate to_encode_image_file_to_data_urlto avoid divergence in behavior over time. _materialize_context_image_partsruns on every_prepare_chat_payloadcall even when no images are present; you could short-circuit with_context_contains_imageto avoid the extra deep iteration on purely text-only conversations.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new image helpers (`_encode_image_file_to_data_url` vs `encode_image_bs64`) now duplicate format-detection and MIME-type logic; consider having `encode_image_bs64` delegate to `_encode_image_file_to_data_url` to avoid divergence in behavior over time.
- `_materialize_context_image_parts` runs on every `_prepare_chat_payload` call even when no images are present; you could short-circuit with `_context_contains_image` to avoid the extra deep iteration on purely text-only conversations.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="1172-1175" />
<code_context>
- with open(image_url, "rb") as f:
- image_bs64 = base64.b64encode(f.read()).decode("utf-8")
- return "data:image/jpeg;base64," + image_bs64
+ image_bytes = self._read_file_bytes(image_url, suppress_errors=False)
+ if image_bytes is None:
+ raise FileNotFoundError(image_url)
+
</code_context>
<issue_to_address>
**suggestion:** The `image_bytes is None` check in `encode_image_bs64` is unreachable with `suppress_errors=False` and can be simplified.
With `suppress_errors=False`, `_read_file_bytes` will either return bytes or raise `OSError`, never `None`, so the `if image_bytes is None:` branch is dead code. Either remove the `None` check and let `OSError` propagate, or call `_read_file_bytes` with `suppress_errors=True` and handle the missing-file case here, matching `_encode_image_file_to_data_url` for consistent error behavior.
```suggestion
image_bytes = self._read_file_bytes(image_url, suppress_errors=False)
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The interaction between
_load_image_dataandencode_image_bs64forbase64://URLs is a bit non-obvious (both now handle that scheme differently from other paths); consider centralizing this base64-handling logic in one place to avoid future regressions or accidental double-handling. - In
_file_uri_to_path, the UNC/netloc handling (netlocnot empty and notlocalhostleading to//{netloc}{path}) could produce surprising results on non-Windows platforms; it may be worth tightening or documenting the expected behavior for remotefile://host/...URIs versus local paths.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The interaction between `_load_image_data` and `encode_image_bs64` for `base64://` URLs is a bit non-obvious (both now handle that scheme differently from other paths); consider centralizing this base64-handling logic in one place to avoid future regressions or accidental double-handling.
- In `_file_uri_to_path`, the UNC/netloc handling (`netloc` not empty and not `localhost` leading to `//{netloc}{path}`) could produce surprising results on non-Windows platforms; it may be worth tightening or documenting the expected behavior for remote `file://host/...` URIs versus local paths.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="187" />
<code_context>
+ }.get(str(image_format or "").upper(), "image/jpeg")
+
+ @staticmethod
+ def _read_file_bytes(
+ image_path: str,
+ *,
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the image handling and error-classification helpers by inlining small functions, centralizing URL dispatch, and extracting validation helpers to make the control flow more linear and easier to follow.
You can trim a fair amount of indirection without losing any behavior. The main opportunities are in image encoding/URL dispatch and error handling.
### 1. Collapse the image encoding helper chain
`_read_file_bytes` and `_detect_image_format` are only used in `_encode_image_file_to_data_url` and are very small. Inlining them makes the core flow easier to follow while preserving behavior:
```python
@classmethod
def _encode_image_file_to_data_url(
cls,
image_path: str,
*,
suppress_errors: bool = True,
raise_on_invalid_image: bool = False,
) -> str | None:
try:
image_bytes = Path(image_path).read_bytes()
except OSError:
if not suppress_errors:
raise
return None
try:
with PILImage.open(BytesIO(image_bytes)) as image:
image.verify()
image_format = str(image.format or "").upper()
except (OSError, UnidentifiedImageError):
if raise_on_invalid_image:
raise ValueError(f"Invalid image file: {image_path}")
return None
mime_type = cls._image_format_to_mime_type(image_format)
image_bs64 = base64.b64encode(image_bytes).decode("utf-8")
return f"data:{mime_type};base64,{image_bs64}"
```
After this, you can delete `_read_file_bytes` and `_detect_image_format` without changing call sites.
### 2. Centralize URL-type dispatch
You currently have overlapping handling in `_normalize_image_path`, `_load_image_data`, `_resolve_image_part`, and `encode_image_bs64`. You can centralize the URL dispatch into a single helper and reuse it from both `_resolve_image_part` and `encode_image_bs64`:
```python
async def _image_ref_to_data_url(self, image_ref: str) -> str | None:
if image_ref.startswith("base64://"):
return image_ref.replace("base64://", "data:image/jpeg;base64,")
if image_ref.startswith("http"):
image_path = await download_image_by_url(image_ref)
else:
image_path = self._normalize_image_path(image_ref)
return self._encode_image_file_to_data_url(image_path)
```
Then `_resolve_image_part` simplifies to:
```python
async def _resolve_image_part(
self,
image_url: str,
*,
image_detail: str | None = None,
) -> dict | None:
if image_url.startswith("data:"):
image_payload = {"url": image_url}
else:
image_data = await self._image_ref_to_data_url(image_url)
if not image_data:
logger.warning("图片 %s 得到的结果为空,将忽略。", image_url)
return None
image_payload = {"url": image_data}
if image_detail:
image_payload["detail"] = image_detail
return {
"type": "image_url",
"image_url": image_payload,
}
```
And `encode_image_bs64` can delegate to the same path (keeping its stricter error behavior):
```python
async def encode_image_bs64(self, image_url: str) -> str:
if image_url.startswith("base64://"):
return image_url.replace("base64://", "data:image/jpeg;base64,")
image_path = self._normalize_image_path(image_url)
image_data = self._encode_image_file_to_data_url(
image_path,
suppress_errors=False,
raise_on_invalid_image=True,
)
return cast(str, image_data)
```
This keeps behavior but reduces duplicated decision trees about URL schemes.
### 3. Decompose `_materialize_context_image_parts`
The nested checks/logging and `_resolve_image_part` call can be flattened by extracting validation into a small helper:
```python
def _extract_image_part_info(
self,
part: dict,
) -> tuple[str | None, str | None]:
if not isinstance(part, dict) or part.get("type") != "image_url":
return None, None
image_url_data = part.get("image_url")
if not isinstance(image_url_data, dict):
logger.warning("图片内容块格式无效,将保留原始内容。")
return None, None
url = image_url_data.get("url")
if not isinstance(url, str) or not url:
logger.warning("图片内容块缺少有效 URL,将保留原始内容。")
return None, None
detail = image_url_data.get("detail")
return url, detail if isinstance(detail, str) else None
```
Then `_materialize_context_image_parts` becomes more linear:
```python
async def _materialize_context_image_parts(self, context_query: list[dict]) -> None:
for message in context_query:
content = message.get("content")
if not isinstance(content, list):
continue
new_content: list[dict] = []
content_changed = False
for part in content:
url, detail = self._extract_image_part_info(part)
if not url:
new_content.append(part)
continue
try:
resolved_part = await self._resolve_image_part(url, image_detail=detail)
except Exception as exc:
logger.warning("图片 %s 预处理失败,将保留原始内容。错误: %s", url, exc)
new_content.append(part)
continue
new_content.append(resolved_part or part)
if resolved_part and resolved_part != part:
content_changed = True
if content_changed:
message["content"] = new_content
```
Behavior stays the same (including logging), but the control flow is easier to follow.
### 4. Inline `_get_error_info` into `_is_invalid_attachment_error`
`_get_error_info` is only used once and is simple. Inlining it reduces hops when understanding the error classification:
```python
def _is_invalid_attachment_error(self, error: Exception) -> bool:
body = getattr(error, "body", None)
code = message = None
if isinstance(body, dict):
err_obj = body.get("error")
if isinstance(err_obj, dict):
raw_code = err_obj.get("code")
raw_message = err_obj.get("message")
code = raw_code.lower() if isinstance(raw_code, str) else None
message = raw_message.lower() if isinstance(raw_message, str) else None
candidates = [
candidate.lower()
for candidate in self._extract_error_text_candidates(error)
]
if message:
candidates.append(message)
for candidate in candidates:
if "invalid_attachment" in candidate:
return True
if "download attachment" in candidate and "404" in candidate:
return True
return code == "invalid_attachment"
```
This keeps all current matching behavior but makes it faster to see when an error is treated as `invalid_attachment`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="142" />
<code_context>
return True
return False
+ def _is_invalid_attachment_error(self, error: Exception) -> bool:
+ body = getattr(error, "body", None)
+ code = None
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the new error/image helper logic to flatten conditionals, remove redundant helpers, and introduce clearer strict vs. tolerant APIs to reduce cognitive load while preserving behavior.
You can keep all the new functionality while shaving off some cognitive load with a few targeted refactors.
### 1. Flatten `_is_invalid_attachment_error`
You don’t really need the `candidates` list + helper call on the hot path. You can normalize once into a single string and check that:
```python
def _is_invalid_attachment_error(self, error: Exception) -> bool:
body = getattr(error, "body", None)
code: str | None = None
message: str | None = None
if isinstance(body, dict):
err_obj = body.get("error")
if isinstance(err_obj, dict):
raw_code = err_obj.get("code")
raw_message = err_obj.get("message")
code = raw_code.lower() if isinstance(raw_code, str) else None
message = raw_message.lower() if isinstance(raw_message, str) else None
# unified, lowercased error text
parts: list[str] = []
if code:
parts.append(code)
if message:
parts.append(message)
parts.extend(map(str, self._extract_error_text_candidates(error)))
parts.append(str(error))
s = " ".join(p.lower() for p in parts if p)
if "invalid_attachment" in s:
return True
if "download attachment" in s and "404" in s:
return True
# keep explicit code check for safety
return code == "invalid_attachment"
```
This preserves your matching behavior but removes the extra list, loop, and branching.
---
### 2. Collapse image-path helpers
`_file_uri_to_path` + `_normalize_image_path` + part of `_image_ref_to_data_url` can be simplified by handling file/base64/http directly in `_image_ref_to_data_url`, and keeping path normalization local:
```python
async def _image_ref_to_data_url(
self,
image_ref: str,
*,
suppress_errors: bool = True,
raise_on_invalid_image: bool = False,
) -> str | None:
if image_ref.startswith("base64://"):
return image_ref.replace("base64://", "data:image/jpeg;base64,")
if image_ref.startswith("http"):
image_path = await download_image_by_url(image_ref)
elif image_ref.startswith("file://"):
image_path = self._file_uri_to_path(image_ref)
else:
image_path = image_ref
return self._encode_image_file_to_data_url(
image_path,
suppress_errors=suppress_errors,
raise_on_invalid_image=raise_on_invalid_image,
)
```
With that, `_normalize_image_path` becomes redundant and can be removed.
If you don’t truly need UNC/drive-letter support, `_file_uri_to_path` can also be trimmed:
```python
@staticmethod
def _file_uri_to_path(file_uri: str) -> str:
parsed = urlparse(file_uri)
if parsed.scheme != "file":
return file_uri
return str(Path(unquote(parsed.path or "")))
```
---
### 3. Inline `_image_format_to_mime_type`
The extra indirection for 4–5 types doesn’t buy much. Inlining the mapping keeps `_encode_image_file_to_data_url` self-contained:
```python
@classmethod
def _encode_image_file_to_data_url(
cls,
image_path: str,
*,
suppress_errors: bool = True,
raise_on_invalid_image: bool = False,
) -> str | None:
try:
image_bytes = Path(image_path).read_bytes()
except OSError:
if not suppress_errors:
raise
return None
try:
with PILImage.open(BytesIO(image_bytes)) as image:
image.verify()
image_format = (image.format or "").upper()
except (OSError, UnidentifiedImageError):
if raise_on_invalid_image:
raise ValueError(f"Invalid image file: {image_path}")
return None
mime_map = {
"JPEG": "image/jpeg",
"PNG": "image/png",
"GIF": "image/gif",
"WEBP": "image/webp",
"BMP": "image/bmp",
}
mime_type = mime_map.get(image_format, "image/jpeg")
image_bs64 = base64.b64encode(image_bytes).decode("utf-8")
return f"data:{mime_type};base64,{image_bs64}"
```
---
### 4. Simplify `_materialize_context_image_parts` (drop `content_changed`)
Given message lists are small, always reassigning `message["content"]` is simpler and cheap:
```python
async def _materialize_context_image_parts(self, context_query: list[dict]) -> None:
for message in context_query:
content = message.get("content")
if not isinstance(content, list):
continue
new_content: list[dict] = []
for part in content:
url, image_detail = self._extract_image_part_info(part)
if not url:
new_content.append(part)
continue
try:
resolved_part = await self._resolve_image_part(
url, image_detail=image_detail
)
except Exception as exc:
logger.warning(
"图片 %s 预处理失败,将保留原始内容。错误: %s",
url,
exc,
)
new_content.append(part)
continue
new_content.append(resolved_part or part)
message["content"] = new_content
```
This keeps behavior identical but removes the boolean tracking and conditional reassignment.
---
### 5. Reduce flag usage by exposing clearer APIs
You already have two distinct behaviors: strict (`encode_image_bs64`) and tolerant (`_materialize_context_image_parts`). Instead of threading flags everywhere, wrap `_image_ref_to_data_url` with mode-specific helpers:
```python
async def _image_ref_to_data_url_strict(self, image_ref: str) -> str:
data = await self._image_ref_to_data_url(
image_ref,
suppress_errors=False,
raise_on_invalid_image=True,
)
if data is None:
raise RuntimeError(f"Failed to encode image data: {image_ref}")
return data
async def _image_ref_to_data_url_safe(self, image_ref: str) -> str | None:
return await self._image_ref_to_data_url(
image_ref,
suppress_errors=True,
raise_on_invalid_image=False,
)
```
Then:
```python
async def encode_image_bs64(self, image_url: str) -> str:
return await self._image_ref_to_data_url_strict(image_url)
```
And `_resolve_image_part` can call `_image_ref_to_data_url_safe` instead of passing flags each time. This keeps functionality but makes the call sites easier to read.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="168" />
<code_context>
+ return True
+ return code == "invalid_attachment"
+
+ @classmethod
+ def _encode_image_file_to_data_url(
+ cls,
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new image helper API and making context image materialization return a new structure instead of mutating in place to reduce branching and make the control flow easier to follow.
A few of the new helpers can be simplified without changing behavior, mainly around image conversion and context traversal.
### 1. Collapse the multiple image-ref helpers into one explicit mode
Right now you have:
- `_image_ref_to_data_url(image_ref, suppress_errors, raise_on_invalid_image)`
- `_image_ref_to_data_url_safe`
- `_image_ref_to_data_url_strict`
- `_encode_image_file_to_data_url` with `suppress_errors` and `raise_on_invalid_image`
This gives four behavioral combinations and requires understanding both flags and which wrapper is used at each call site.
You can keep the same semantics but make the API easier to reason about by using a single explicit mode instead of flag combinations, and by inlining the “safe/strict” wrappers into that function:
```python
# replace _image_ref_to_data_url + _image_ref_to_data_url_safe/_strict
from typing import Literal
class YourProvider(...):
...
@classmethod
def _encode_image_file_to_data_url(
cls,
image_path: str,
*,
mode: Literal["safe", "strict"],
) -> str | None:
try:
image_bytes = Path(image_path).read_bytes()
except OSError:
if mode == "strict":
raise
return None
try:
with PILImage.open(BytesIO(image_bytes)) as image:
image.verify()
image_format = str(image.format or "").upper()
except (OSError, UnidentifiedImageError):
if mode == "strict":
raise ValueError(f"Invalid image file: {image_path}")
return None
mime_type = {
"JPEG": "image/jpeg",
"PNG": "image/png",
"GIF": "image/gif",
"WEBP": "image/webp",
"BMP": "image/bmp",
}.get(image_format, "image/jpeg")
image_bs64 = base64.b64encode(image_bytes).decode("utf-8")
return f"data:{mime_type};base64,{image_bs64}"
async def _image_ref_to_data_url(
self,
image_ref: str,
*,
mode: Literal["safe", "strict"] = "safe",
) -> str | None:
if image_ref.startswith("base64://"):
return image_ref.replace("base64://", "data:image/jpeg;base64,")
if image_ref.startswith("http"):
image_path = await download_image_by_url(image_ref)
elif image_ref.startswith("file://"):
image_path = self._file_uri_to_path(image_ref)
else:
image_path = image_ref
return self._encode_image_file_to_data_url(image_path, mode=mode)
```
Call sites then become explicit and easier to read, and you can drop `_image_ref_to_data_url_safe` / `_image_ref_to_data_url_strict`:
```python
async def _resolve_image_part(
self,
image_url: str,
*,
image_detail: str | None = None,
) -> dict | None:
if image_url.startswith("data:"):
image_payload = {"url": image_url}
else:
image_data = await self._image_ref_to_data_url(image_url, mode="safe")
if not image_data:
logger.warning("图片 %s 得到的结果为空,将忽略。", image_url)
return None
image_payload = {"url": image_data}
if image_detail:
image_payload["detail"] = image_detail
return {
"type": "image_url",
"image_url": image_payload,
}
async def encode_image_bs64(self, image_url: str) -> str:
data = await self._image_ref_to_data_url(image_url, mode="strict")
if data is None:
# should not happen in strict mode, but keep guard if desired
raise RuntimeError(f"Failed to encode image data: {image_url}")
return data
```
This keeps functionality and error behavior but reduces branching and mental overhead around the different helper combinations.
---
### 2. Make the context image materialization a pure helper
`_materialize_context_image_parts` mutates `context_query` in place and is wired into `_prepare_chat_payload`, which already does a `deepcopy`. The control flow is:
- `assemble_context` / other callers → `_prepare_chat_payload`
- `copy.deepcopy(...)`
- `_context_contains_image(...)`
- `_materialize_context_image_parts(...)` (mutating `context_query`)
You can make the traversal easier to follow and test by making it a pure helper that returns a new structure, and keep the in-place update only at the top level:
```python
async def _materialize_context_image_parts(
self, context_query: list[dict]
) -> list[dict]:
new_messages: list[dict] = []
for message in context_query:
content = message.get("content")
if not isinstance(content, list):
new_messages.append(message)
continue
new_content: list[dict] = []
for part in content:
url, image_detail = self._extract_image_part_info(part)
if not url:
new_content.append(part)
continue
try:
resolved_part = await self._resolve_image_part(
url, image_detail=image_detail
)
except Exception as exc:
logger.warning(
"图片 %s 预处理失败,将保留原始内容。错误: %s",
url,
exc,
)
new_content.append(part)
continue
new_content.append(resolved_part or part)
new_msg = dict(message)
new_msg["content"] = new_content
new_messages.append(new_msg)
return new_messages
```
And in `_prepare_chat_payload` (or equivalent):
```python
context_query = copy.deepcopy(self._ensure_message_to_dicts(contexts))
...
if self._context_contains_image(context_query):
context_query = await self._materialize_context_image_parts(context_query)
```
This keeps all current behavior (including fallbacks) but localizes the structural manipulation into a pure, testable function, making the provider’s main control flow simpler to read.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
_prepare_chat_payloadyou deep-copycontextsand then_materialize_context_image_partsdeep-copies each message again; you can likely avoid one of these copies (for example by letting_materialize_context_image_partsoperate on the already-copied list or by having it work without additional deep copies) to reduce overhead on large histories. - Inside
_materialize_context_image_parts, each message is fullycopy.deepcopy'd and thencontentis replaced with a separately-built list of deep-copied parts, which results in redundant copying of the content; consider shallow-copying the message and only deep-copying the non-image parts you preserve.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_prepare_chat_payload` you deep-copy `contexts` and then `_materialize_context_image_parts` deep-copies each message again; you can likely avoid one of these copies (for example by letting `_materialize_context_image_parts` operate on the already-copied list or by having it work without additional deep copies) to reduce overhead on large histories.
- Inside `_materialize_context_image_parts`, each message is fully `copy.deepcopy`'d and then `content` is replaced with a separately-built list of deep-copied parts, which results in redundant copying of the content; consider shallow-copying the message and only deep-copying the non-image parts you preserve.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="142" />
<code_context>
return True
return False
+ def _is_invalid_attachment_error(self, error: Exception) -> bool:
+ body = getattr(error, "body", None)
+ code: str | None = None
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the image-materialization pipeline and error-classification helper to flatten control flow, avoid deep copies, and clarify matching logic without changing behavior.
You can reduce complexity meaningfully in two focused spots without changing behavior.
### 1. Flatten `_materialize_context_image_parts` and avoid deepcopies
The current implementation does a lot of `copy.deepcopy` calls and nested branching. Since the function already treats `context_query` as an input → output transformer, you can:
* Introduce a small helper to transform a single message.
* Build new structures directly instead of deep-copying and mutating.
* Reuse non-image parts as-is, assuming they aren’t mutated later (they are replaced by the output list).
For example:
```python
async def _transform_content_part(self, part: dict) -> dict:
url, image_detail = self._extract_image_part_info(part)
if not url:
return part # non-image or invalid; keep original
try:
resolved_part = await self._resolve_image_part(url, image_detail=image_detail)
except Exception as exc:
logger.warning(
"图片 %s 预处理失败,将保留原始内容。错误: %s",
url,
exc,
)
return part
return resolved_part or part
async def _materialize_message_image_parts(self, message: dict) -> dict:
content = message.get("content")
if not isinstance(content, list):
# no content list → just return a shallow copy
return {**message}
new_content = [
await self._transform_content_part(part)
for part in content
]
return {**message, "content": new_content}
async def _materialize_context_image_parts(
self,
context_query: list[dict],
) -> list[dict]:
return [
await self._materialize_message_image_parts(message)
for message in context_query
]
```
This:
* Eliminates repeated `copy.deepcopy` calls.
* Makes the control flow flatter (one place that decides “transform or keep original”).
* Highlights that this is a pure “map over messages” transformation.
If you still need defensive copying for some fields, you can selectively copy those inside `_materialize_message_image_parts` instead of deep-copying the entire structure.
### 2. Simplify `_is_invalid_attachment_error` with early returns
Current logic builds an aggregate `parts` list and then checks both `code` directly and via `error_text`, which is harder to reason about. You can keep the same matching behavior but reduce branching and make the priority clearer:
```python
def _is_invalid_attachment_error(self, error: Exception) -> bool:
body = getattr(error, "body", None)
code: str | None = None
message: str | None = None
if isinstance(body, dict):
err = body.get("error")
if isinstance(err, dict):
raw_code = err.get("code")
raw_message = err.get("message")
code = raw_code.lower() if isinstance(raw_code, str) else None
message = raw_message.lower() if isinstance(raw_message, str) else None
# 1. Prefer explicit code if present
if code == "invalid_attachment":
return True
# 2. Then inspect error message fields
text_sources: list[str] = []
if message:
text_sources.append(message)
if code:
text_sources.append(code)
text_sources.extend(map(str, self._extract_error_text_candidates(error)))
error_text = " ".join(s.lower() for s in text_sources if s)
if "invalid_attachment" in error_text:
return True
if "download attachment" in error_text and "404" in error_text:
return True
return False
```
This keeps all existing checks but:
* Avoids building `parts` when an explicit code is already sufficient.
* Removes the redundant final `return code == "invalid_attachment"` (now handled early).
* Makes the source priority (`code` → messages → generic text) explicit.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- The image materialization helpers (
_materialize_message_image_parts/_materialize_context_image_parts) run all async work sequentially; if you expect multiple remote images per request, consider usingasyncio.gatherto resolve parts concurrently to avoid introducing latency bottlenecks. - In
_image_ref_to_data_url, checkingimage_ref.startswith("http")andimage_ref.startswith("file://")is fairly ad-hoc; usingurlparseto distinguish schemes would make the URL handling more robust (including schemes likehttps,HTTP, or other valid URL forms).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The image materialization helpers (`_materialize_message_image_parts` / `_materialize_context_image_parts`) run all async work sequentially; if you expect multiple remote images per request, consider using `asyncio.gather` to resolve parts concurrently to avoid introducing latency bottlenecks.
- In `_image_ref_to_data_url`, checking `image_ref.startswith("http")` and `image_ref.startswith("file://")` is fairly ad-hoc; using `urlparse` to distinguish schemes would make the URL handling more robust (including schemes like `https`, `HTTP`, or other valid URL forms).Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Summary
image_urlcontext parts into data URLs without mutating caller-owned context payloadsINVALID_ATTACHMENTimage failures in text-only mode and preserve original image parts when local materialization cannot prove they are staleTesting
uv run ruff check astrbot/core/provider/sources/openai_source.py tests/test_openai_source.pyuv run pytest tests/test_openai_source.py -qSummary by Sourcery
Harden OpenAI image handling by materializing context images into data URLs without mutating caller payloads and by adding robust attachment error recovery.
Bug Fixes:
Enhancements:
Tests: