Skip to content

fix(provider): 修复图片 MIME 类型硬编码及 client 复用已关闭连接问题#6999

Open
Tz-WIND wants to merge 10 commits intoAstrBotDevs:masterfrom
Tz-WIND:master
Open

fix(provider): 修复图片 MIME 类型硬编码及 client 复用已关闭连接问题#6999
Tz-WIND wants to merge 10 commits intoAstrBotDevs:masterfrom
Tz-WIND:master

Conversation

@Tz-WIND
Copy link

@Tz-WIND Tz-WIND commented Mar 26, 2026

图片 MIME 类型自动检测和Client 关闭后复用问题修复

Modifications / 改动点

修复两个独立问题:

  1. encode_image_bs64 将所有图片的 MIME 类型硬编码为 image/jpeg,
    导致 PNG、WebP 等格式在严格校验的接口上请求失败。
    现通过读取文件头魔术字节自动检测实际 MIME 类型,
    将检测逻辑提取至公共工具模块 image_utils.py,
    消除 ProviderRequest 与适配器之间的重复实现。
    SVG 检测所需头部字节数从 16 字节提升至 256 字节,
    修复原实现字节数不足导致检测失败的问题。

  2. terminate() 关闭 client 后未将 self.client 置为 None,
    配置重载(reload)期间可能复用已关闭的 client,
    导致稳定复现 APIConnectionError。
    新增 _create_openai_client() 将 client 初始化逻辑解耦,
    新增 _ensure_client() 在每次使用前检查 client 状态,
    若为 None 或底层连接已关闭则自动重建,提供多层防御。
    terminate() 通过 try/finally 确保 client 引用始终被清空。

注:_ensure_client 中通过 self.client._client.is_closed 访问了openai 库的私有属性,依赖其内部实现,已添加注释说明此依赖关系,待 SDK 提供公开 API 后应及时替换。

......

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

2675F38E458D52EC71E75238577DD9F5 图中400的报错就是未修改时候硬编码导致的报错,现在多轮测试都可以正常工作。

Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。(无新功能)

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txtpyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Improve image handling and OpenAI client robustness by adding automatic MIME type detection and safe client reinitialization.

New Features:

  • Automatically detect image MIME types from magic bytes for both local files and base64 image inputs, supporting common formats like JPEG, PNG, GIF, WebP, BMP, TIFF, ICO, SVG, AVIF, and HEIF/HEIC.

Bug Fixes:

  • Fix failures when sending non-JPEG images by replacing hardcoded image/jpeg data URLs with correctly detected MIME types.
  • Prevent errors caused by using a closed or uninitialized OpenAI/Azure client by ensuring it is recreated before each request and properly reset on termination.

Enhancements:

  • Refactor OpenAI/Azure client creation into a dedicated helper to simplify initialization and reuse across methods.

Tz-WIND added 6 commits March 26, 2026 15:17
识别图片的前16字节检测格式。
self.client = None,防止残留引用被复用。
将 client 初始化逻辑提取为独立方法,便于重建。
使用 client 前检查其是否为 `None` 或底层连接已关闭,若不可用则自动重建,防爆。
@auto-assign auto-assign bot requested review from LIghtJUNction and Soulter March 26, 2026 08:45
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 26, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

此拉取请求旨在提升系统处理多媒体内容(特别是图片)的健壮性和稳定性,并优化与 OpenAI API 的交互。通过引入图片 MIME 类型自动检测,解决了因图片格式硬编码导致的兼容性问题。同时,通过改进 OpenAI 客户端的生命周期管理,确保了客户端在长时间运行或异常情况下的可用性,有效避免了因客户端关闭而引发的连接错误,从而提升了整体的用户体验和系统可靠性。

Highlights

  • 图片 MIME 类型自动检测: 引入了一个新的静态方法 _detect_mime_type,用于根据文件头魔术字节自动检测图片的实际 MIME 类型。这解决了之前硬编码 image/jpeg 导致非 JPEG 格式图片报错的问题,现在支持 JPEG、PNG、GIF、WebP、BMP、TIFF、ICO、SVG、AVIF、HEIF/HEIC 等多种格式。
  • OpenAI Client 复用问题修复: 新增了 _ensure_client() 方法,用于在每次 API 调用前检查 OpenAI 客户端是否已关闭或未初始化。如果检测到客户端不可用,则会自动重新创建,从而解决了客户端在关闭后无法复用导致后续请求失败的问题。
  • OpenAI Client 初始化逻辑重构: 将 OpenAI 客户端的创建逻辑封装到 _create_openai_client() 方法中,提高了代码的模块化和可维护性。同时,terminate 方法也得到了增强,现在会处理客户端关闭时的异常并确保客户端实例被正确清空。
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot dosubot bot added the area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. label Mar 26, 2026
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • The _detect_mime_type implementation is duplicated in both openai_source.py and entities.py; consider extracting this into a shared utility to keep the logic in one place and avoid future inconsistencies.
  • In _ensure_client, accessing self.client._client.is_closed relies on a private attribute of the OpenAI client; it would be safer to either feature-detect with hasattr and handle non-HTTPX-backed clients explicitly, or maintain your own closed state when calling terminate().
  • _ensure_client() is now called in many hot paths (e.g., get_current_key, set_key, text_chat, streaming); you may want to ensure this cannot recreate a client too often under normal use, or at least log at debug level instead of warning to avoid noisy logs when recreation is expected.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `_detect_mime_type` implementation is duplicated in both `openai_source.py` and `entities.py`; consider extracting this into a shared utility to keep the logic in one place and avoid future inconsistencies.
- In `_ensure_client`, accessing `self.client._client.is_closed` relies on a private attribute of the OpenAI client; it would be safer to either feature-detect with `hasattr` and handle non-HTTPX-backed clients explicitly, or maintain your own `closed` state when calling `terminate()`.
- `_ensure_client()` is now called in many hot paths (e.g., `get_current_key`, `set_key`, `text_chat`, streaming); you may want to ensure this cannot recreate a client too often under normal use, or at least log at debug level instead of warning to avoid noisy logs when recreation is expected.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="1023-1032" />
<code_context>
+    async def encode_image_bs64(self, image_url: str) -> str:
</code_context>
<issue_to_address>
**suggestion:** MIME-type detection and base64 handling logic is duplicated and could be centralized.

To avoid maintaining two parallel implementations, please factor out a shared helper used here and in `entities.py` for MIME detection and base64 → Data URL conversion so both call sites stay consistent.

Suggested implementation:

```python
    async def encode_image_bs64(self, image_url: str) -> str:
        """
        将图片转换为 base64 Data URL,自动检测实际 MIME 类型。

        实际的 MIME 检测和 base64 → Data URL 组装逻辑由共享工具函数负责,
        以便在此处和 entities.py 中保持实现一致。
        """
        from astrbot.core.utils.image_encoding import encode_image_to_data_url

        return await encode_image_to_data_url(image_url)

```

To fully realize the refactor and remove duplication:

1. Create a new shared helper module, for example `astrbot/core/utils/image_encoding.py`, with something like:

   ```python
   import base64
   from typing import Callable

   def _detect_mime_from_header(header_bytes: bytes) -> str:
       # PNG
       if header_bytes.startswith(b"\x89PNG\r\n\x1a\n"):
           return "image/png"
       # GIF
       if header_bytes.startswith(b"GIF87a") or header_bytes.startswith(b"GIF89a"):
           return "image/gif"
       # WebP
       if len(header_bytes) >= 12 and header_bytes[0:4] == b"RIFF" and header_bytes[8:12] == b"WEBP":
           return "image/webp"
       # AVIF
       if len(header_bytes) >= 12 and header_bytes[4:12] == b"ftypavif":
           return "image/avif"
       # HEIF / HEIC
       if len(header_bytes) >= 12 and header_bytes[4:8] == b"ftyp":
           brand = header_bytes[8:12]
           if brand in (b"heic", b"heix", b"hevc", b"hevx", b"mif1"):
               return "image/heif"
       return "image/jpeg"

   async def encode_image_to_data_url(image_url: str) -> str:
       """
       将图片 URL 或 base64:// 前缀的字符串统一转换为 data:*/*;base64,... 形式。
       - 对 base64:// 的输入,解析 header 字节并通过 _detect_mime_from_header 检测 MIME。
       - 对 http(s) URL,需要按现有逻辑下载图片、检测 MIME 并编码为 base64。
       """
       # 此处应移植当前 openai_source.py 和 entities.py 中已有的下载、检测、
       # 编码逻辑,避免两处重复实现。
       ...
   ```

   Adapt the contents of `_detect_mime_from_header` and `encode_image_to_data_url` by moving the currently duplicated MIME-detection and base64-handling logic from both `openai_source.py` and `entities.py` into this module.

2. In `entities.py`, replace its local MIME-detection + base64 → Data URL implementation with a call to `encode_image_to_data_url(image_url)` from `astrbot.core.utils.image_encoding`.

3. Ensure any previous helper methods in `openai_source.py` or `entities.py` that performed the same work are removed or updated to delegate to the new shared helper so there is a single source of truth for:
   - How header bytes are extracted from base64 data.
   - How the MIME type is detected from those header bytes.
   - How the final `data:{mime};base64,{payload}` string is constructed.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="168" />
<code_context>
         proxy = provider_config.get("proxy", "")
         return create_proxy_client("OpenAI", proxy)

+    def _create_openai_client(self) -> AsyncOpenAI | AsyncAzureOpenAI:
+        """创建 OpenAI/Azure 客户端实例"""
+        if "api_version" in self.provider_config:
</code_context>
<issue_to_address>
**issue (complexity):** Consider centralizing OpenAI client lifecycle management behind a single accessor and extracting shared image MIME/base64 helpers into a utility module to avoid duplication and scattered logic.

You can keep the new behaviors but simplify the lifecycle and image helpers a lot:

### 1. Centralize client lifecycle, avoid sprinkling `_ensure_client` + private attributes

Right now `_ensure_client()` is:

- Called manually in many hot-path methods (`get_models`, `_query`, `_query_stream`, `text_chat*`, `get_current_key`, `set_key`, …).
- Coupled to a private attribute: `self.client._client.is_closed` wrapped in `try/except`.

You can keep the auto‑reinit behavior but:

1. Track closed state yourself.
2. Use a single accessor/property instead of manually calling `_ensure_client()` everywhere.

For example:

```python
def __init__(self, provider_config, provider_settings) -> None:
    super().__init__(provider_config, provider_settings)
    ...
    self.client: AsyncOpenAI | AsyncAzureOpenAI | None = None
    self._client_closed = False
    self.client = self._create_openai_client()
    self._client_closed = False
    self.default_params = inspect.signature(
        self.client.chat.completions.create,
    ).parameters.keys()
    ...
```

Replace `_ensure_client` with something that doesn’t touch internals:

```python
def _get_client(self) -> AsyncOpenAI | AsyncAzureOpenAI:
    if self.client is None or self._client_closed:
        logger.warning("检测到 OpenAI client 已关闭或未初始化,正在重新创建...")
        self.client = self._create_openai_client()
        self._client_closed = False
        self.default_params = inspect.signature(
            self.client.chat.completions.create,
        ).parameters.keys()
    return self.client
```

Mark closed in `terminate` instead of probing `._client.is_closed`:

```python
async def terminate(self):
    if self.client:
        try:
            await self.client.close()
        except Exception as e:
            logger.warning(f"关闭 OpenAI client 时出错: {e}")
        finally:
            self._client_closed = True
```

Then use `_get_client()` instead of manual `_ensure_client()` calls, e.g.:

```python
async def get_models(self):
    client = self._get_client()
    try:
        models_str = []
        models = await client.models.list()
        ...
```

```python
async def _query(self, payloads: dict, tools: ToolSet | None) -> LLMResponse:
    client = self._get_client()
    ...
    completion: ChatCompletion = await client.chat.completions.create(**payloads)
```

```python
def get_current_key(self) -> str:
    return self._get_client().api_key

def set_key(self, key) -> None:
    self._get_client().api_key = key
```

And in `text_chat` / `text_chat_stream`:

```python
for retry_cnt in range(max_retries):
    try:
        client = self._get_client()
        client.api_key = chosen_key
        llm_response = await self._query(payloads, func_tool)
        break
    ...
```

This avoids scattering `_ensure_client()` calls and removes the dependency on `self.client._client.is_closed` while preserving auto‑reinit semantics.

### 2. Deduplicate MIME detection / base64 logic

`_detect_mime_type` and the extended `encode_image_bs64` here are substantial and, per the other review, nearly identical to helpers in `entities.py`. To reduce maintenance cost and cognitive load, consider extracting them into a shared utility and delegating from both places.

For example, in a shared module (e.g. `image_utils.py`):

```python
# image_utils.py
def detect_mime_type(header_bytes: bytes) -> str:
    ...
    return "image/jpeg"

async def encode_image_bs64(image_url: str) -> str:
    ...
```

Then in this class:

```python
from .image_utils import detect_mime_type, encode_image_bs64 as encode_image_bs64_util

class YourClass(...):
    async def encode_image_bs64(self, image_url: str) -> str:
        # keep method for API compatibility, delegate to shared helper
        return await encode_image_bs64_util(image_url)
```

This keeps behavior intact, but future changes to supported formats only need to be made in one place.
</issue_to_address>

### Comment 3
<location path="astrbot/core/provider/entities.py" line_range="261" />
<code_context>
+        # 无法识别,回退到 image/jpeg
+        return "image/jpeg"
+
     async def _encode_image_bs64(self, image_url: str) -> str:
-        """将图片转换为 base64"""
+        """将图片转换为 base64 Data URL,自动检测实际 MIME 类型。
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the MIME detection and data-URL encoding into shared helper functions so this method becomes a small wrapper that delegates image handling.

You can keep the new functionality while reducing complexity and duplication by extracting the MIME detection + base64 logic into a shared helper and turning this method into a thin wrapper.

### 1. Extract shared helpers (e.g. `image_utils.py`)

```python
# image_utils.py
import base64
from typing import BinaryIO

def detect_mime_type(header_bytes: bytes) -> str:
    if len(header_bytes) >= 3 and header_bytes[:3] == b'\xff\xd8\xff':
        return "image/jpeg"
    if len(header_bytes) >= 8 and header_bytes[:8] == b'\x89PNG\r\n\x1a\n':
        return "image/png"
    if len(header_bytes) >= 4 and header_bytes[:4] == b'GIF8':
        return "image/gif"
    if len(header_bytes) >= 12 and header_bytes[:4] == b'RIFF' and header_bytes[8:12] == b'WEBP':
        return "image/webp"
    if len(header_bytes) >= 2 and header_bytes[:2] == b'BM':
        return "image/bmp"
    if len(header_bytes) >= 4 and header_bytes[:4] in (b'II\x2a\x00', b'MM\x00\x2a'):
        return "image/tiff"
    if len(header_bytes) >= 4 and header_bytes[:4] == b'\x00\x00\x01\x00':
        return "image/x-icon"
    if b'<svg' in header_bytes[:256].lower():
        return "image/svg+xml"
    if len(header_bytes) >= 12 and header_bytes[4:12] == b'ftypavif':
        return "image/avif"
    if len(header_bytes) >= 12 and header_bytes[4:8] == b'ftyp':
        brand = header_bytes[8:12]
        if brand in (b'heic', b'heix', b'hevc', b'hevx', b'mif1'):
            return "image/heif"
    return "image/jpeg"


def data_url_from_base64(raw_b64: str) -> str:
    sample = raw_b64[:32]
    missing_padding = len(sample) % 4
    if missing_padding:
        sample += '=' * (4 - missing_padding)
    try:
        header_bytes = base64.b64decode(sample)
        mime_type = detect_mime_type(header_bytes)
    except Exception:
        mime_type = "image/jpeg"
    return f"data:{mime_type};base64,{raw_b64}"


def data_url_from_file(f: BinaryIO) -> str:
    header_bytes = f.read(16)
    mime_type = detect_mime_type(header_bytes)
    f.seek(0)
    image_b64 = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime_type};base64,{image_b64}"
```

You can reuse this from both this entity class and `openai_source.py`, eliminating the duplicated protocol-level logic.

### 2. Simplify the entity method

```python
from .image_utils import data_url_from_base64, data_url_from_file

async def _encode_image_bs64(self, image_url: str) -> str:
    if image_url.startswith("base64://"):
        raw_b64 = image_url[len("base64://"):]
        return data_url_from_base64(raw_b64)

    with open(image_url, "rb") as f:
        return data_url_from_file(f)
```

This keeps the entity focused on assembling messages and defers low-level image handling to a shared utility, while preserving all current behavior and MIME-detection features.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces automatic MIME type detection for images when converting them to base64, replacing the previous hardcoded image/jpeg assumption. This functionality is implemented in astrbot/core/provider/entities.py. Additionally, the PR refactors the OpenAI client management in astrbot/core/provider/sources/openai_source.py by adding methods to create and ensure the client's availability, including re-initialization if the client is closed. Review feedback indicates significant code duplication of the image MIME type detection and base64 encoding logic across entities.py and openai_source.py, suggesting refactoring. There's also a potential bug identified in SVG MIME type detection due to insufficient header bytes being read/decoded, and a note about the fragility of directly accessing a private _client attribute for checking client status.

Tz-WIND added 3 commits March 26, 2026 17:18
Refactor image encoding to use utility function and improve MIME type detection.
@NayukiChiba
Copy link
Contributor

?你这commit message何意味啊

@NayukiChiba
Copy link
Contributor

提交格式

<type>(<scope>): <subject>

<body>

<footer>

Type类型规范

  • feat: 新功能
  • fix: 修复bug
  • docs: 文档变更
  • style: 代码格式(不影响代码运行)
  • refactor: 重构(既非新增功能,也非修复bug)
  • perf: 性能优化
  • test: 测试相关
  • build: 构建系统或外部依赖变更
  • ci: CI配置变更
  • chore: 其他不修改src或test的变更
  • revert: 回退之前的提交

编写要求

  1. Header:

    • type: 必需,从上述类型中选择
    • scope: 可选,表示影响范围,使用英文小写
    • subject: 必需,使用中文简洁描述,不超过50字,动词开头,首字母小写,结尾不加句号
  2. Body (可选):

    • 使用中文详细说明改动内容
    • 说明为什么做这个改动
    • 每行不超过72字符
  3. Footer (可选):

    • 关闭Issue: close #123
    • 破坏性变更: BREAKING CHANGE: 说明内容

当前文件改动:[描述你的改动]
涉及的功能模块:[可选]

@Tz-WIND Tz-WIND changed the title 图片 MIME 类型自动检测和Client 关闭后复用问题修复 fix(provider): 修复图片 MIME 类型硬编码及 client 复用已关闭连接问题 Mar 26, 2026
@Tz-WIND
Copy link
Author

Tz-WIND commented Mar 26, 2026

提交格式

<type>(<scope>): <subject>

<body>

<footer>

Type类型规范

  • feat: 新功能
  • fix: 修复bug
  • docs: 文档变更
  • style: 代码格式(不影响代码运行)
  • refactor: 重构(既非新增功能,也非修复bug)
  • perf: 性能优化
  • test: 测试相关
  • build: 构建系统或外部依赖变更
  • ci: CI配置变更
  • chore: 其他不修改src或test的变更
  • revert: 回退之前的提交

编写要求

  1. Header:

    • type: 必需,从上述类型中选择
    • scope: 可选,表示影响范围,使用英文小写
    • subject: 必需,使用中文简洁描述,不超过50字,动词开头,首字母小写,结尾不加句号
  2. Body (可选):

    • 使用中文详细说明改动内容
    • 说明为什么做这个改动
    • 每行不超过72字符
  3. Footer (可选):

    • 关闭Issue: close #123
    • 破坏性变更: BREAKING CHANGE: 说明内容

当前文件改动:[描述你的改动] 涉及的功能模块:[可选]

QwQ改了喵

Remove all presentation of any part of API keys.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants