[CI]【Hackathon 10th Spring No.46】Windows Python runtime guards (3/3)#7329
r-cloudforge wants to merge 12 commits into PaddlePaddle:develop
fastdeploy-bot left a comment
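🤖 AI Code Review | 2026-04-11
📋 Review Summary
PR overview: Adds Windows runtime compatibility to FastDeploy's Python modules by providing cross-platform handling for POSIX-only APIs (/dev/shm, os.killpg, os.setsid, fork).
Scope of changes: 10 Python files gain sys.platform checks.
Impact tag: [CI]
PR convention check
✅ Title contains a valid tag: [CI]
✅ Description includes the required sections (Motivation, Modifications, Usage, Checklist)
Issues
| Severity | File | Summary |
|---|---|---|
| 🔴 Bug | fastdeploy/inter_communicator/ipc_signal.py:172 | The Windows file lock covers only 1 byte, breaking mutual exclusion |
Overall assessment
The PR's overall design is sound and handles Windows compatibility with thorough platform checks. Most of the changes (such as replacing /dev/shm paths, os.killpg with os.kill, and fork with spawn) are correct.
However, the Windows implementation of IPCLock has a serious bug: it cannot correctly provide mutual exclusion in a multi-process environment and needs to be fixed.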
if sys.platform == "win32":
    import msvcrt

    msvcrt.locking(self._fd, msvcrt.LK_LOCK, 1)
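🔴 Bug: The Windows file lock implementation is incorrect.
The third argument of msvcrt.locking(fd, mode, nbytes), nbytes, is the number of bytes to lock or unlock. The current code sets it to 1, which means only the first byte of the file is locked.
As a result, in a multi-process environment several processes may access different parts of the same lock file at the same time, breaking mutual exclusion.
Suggested fix:
Seek to the start of the file with os.lseek(), then lock the entire file using the size reported by os.path.getsize():
def acquire(self) -> None:
    """Acquire the lock (blocking). Uses kernel-level flock for atomicity."""
    if sys.platform == "win32":
        import msvcrt
        # Move to start of file
        os.lseek(self._fd, 0, os.SEEK_SET)
        # Lock the entire file (get file size)
        file_size = os.path.getsize(self._lock_path)
        msvcrt.locking(self._fd, msvcrt.LK_LOCK, file_size)
    else:
        fcntl.flock(self._fd, fcntl.LOCK_EX)

def release(self) -> None:
    """Release the lock."""
    if sys.platform == "win32":
        import msvcrt
        # Move to start of file
        os.lseek(self._fd, 0, os.SEEK_SET)
        # Unlock the entire file
        file_size = os.path.getsize(self._lock_path)
        msvcrt.locking(self._fd, msvcrt.LK_UNLCK, file_size)
    else:
        fcntl.flock(self._fd, fcntl.LOCK_UN)
Alternatively, a simpler approach is to keep a fixed size (such as 1 byte) but make sure every process locks the same position (starting from the beginning of the file).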
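The simpler alternative, a fixed one-byte region that every process locks at the same offset, could be sketched roughly as follows. This is an illustration only, not the PR's IPCLock: the class name FileLock, its constructor, and the timeout parameter are assumptions. It relies on two documented behaviours of msvcrt.locking: the locked region starts at the current file position, and LK_LOCK raises OSError after about ten one-second attempts instead of blocking indefinitely, hence the explicit seek and the retry loop.

```python
import os
import sys
import time
from typing import Optional

if sys.platform == "win32":
    import msvcrt
else:
    import fcntl


class FileLock:
    """Hypothetical cross-platform file lock (illustrative, not FastDeploy's IPCLock)."""

    def __init__(self, path: str) -> None:
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT)

    def acquire(self, timeout: Optional[float] = None) -> None:
        """Block until the lock is held, or raise TimeoutError after `timeout` seconds."""
        if sys.platform != "win32":
            fcntl.flock(self._fd, fcntl.LOCK_EX)
            return
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            # Every process seeks to offset 0 so they all contend for the same byte.
            os.lseek(self._fd, 0, os.SEEK_SET)
            try:
                msvcrt.locking(self._fd, msvcrt.LK_LOCK, 1)
                return
            except OSError:
                # LK_LOCK gave up after its internal retries. A production version
                # should inspect errno to tell contention from other failures.
                if deadline is not None and time.monotonic() >= deadline:
                    raise TimeoutError("could not acquire file lock")

    def release(self) -> None:
        if sys.platform == "win32":
            # Unlock exactly the region that acquire() locked.
            os.lseek(self._fd, 0, os.SEEK_SET)
            msvcrt.locking(self._fd, msvcrt.LK_UNLCK, 1)
        else:
            fcntl.flock(self._fd, fcntl.LOCK_UN)

    def close(self) -> None:
        os.close(self._fd)
```

Because the locked region is identical in every process, one byte is enough to serialize access; the size of the region matters less than all parties agreeing on its offset.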
Motivation
Several FastDeploy Python modules use POSIX-only APIs that crash on Windows:
- /dev/shm paths (unavailable on Windows; must use tempfile.gettempdir())
- os.killpg() (not available on Windows; must use os.kill())
- os.setsid as preexec_fn (not available on Windows)
- multiprocessing.get_context("fork") (fork unavailable; must use "spawn")

This is Part 3 of 3 for 【Hackathon 10th Spring No.46】(Windows Build Support):
- #ifndef _WIN32 guards on POSIX-only includes
- setup_ops.py platform-conditional link args + build.bat
- /dev/shm, os.killpg, os.setsid, fork

Modifications
10 Python files modified with sys.platform guards:

| File | Change |
|---|---|
| fastdeploy/engine/common_engine.py | /dev/shm → tempfile.gettempdir() on Windows; preexec_fn=os.setsid → None on Windows |
| fastdeploy/engine/engine.py | os.killpg → os.kill on Windows; fork → spawn on Windows |
| fastdeploy/engine/expert_service.py | /dev/shm → tempfile.gettempdir() on Windows |
| fastdeploy/worker/worker_process.py | fork → spawn on Windows |
| fastdeploy/cache_manager/cache_messager.py | /dev/shm path guard |
| fastdeploy/cache_manager/prefix_cache_manager.py | /dev/shm path guard |
| fastdeploy/eplb/async_expert_loader.py | /dev/shm path guard |
| fastdeploy/inter_communicator/fmq.py | /dev/shm path guard |
| fastdeploy/inter_communicator/zmq_client.py | /dev/shm path guard |
| fastdeploy/inter_communicator/zmq_server.py | /dev/shm path guard |

All guards follow the pattern:
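A minimal sketch of what such guards might look like, covering the four cases listed under Motivation. It is illustrative rather than the PR's actual code: the helper names shm_dir, mp_context, popen_kwargs and terminate, the platform parameter (added so both branches can be exercised on any OS), and the choice of SIGTERM are all assumptions.

```python
import multiprocessing
import os
import signal
import subprocess
import sys
import tempfile


def shm_dir(platform: str = sys.platform) -> str:
    """Directory for shared-memory style scratch files."""
    # /dev/shm does not exist on Windows, so fall back to the temp directory.
    return tempfile.gettempdir() if platform == "win32" else "/dev/shm"


def mp_context(platform: str = sys.platform):
    """multiprocessing context: fork is unavailable on Windows, spawn works there."""
    return multiprocessing.get_context("spawn" if platform == "win32" else "fork")


def popen_kwargs(platform: str = sys.platform) -> dict:
    """Extra subprocess.Popen arguments; os.setsid does not exist on Windows."""
    return {} if platform == "win32" else {"preexec_fn": os.setsid}


def terminate(pid: int, platform: str = sys.platform) -> None:
    """Stop one process (Windows) or its whole process group (POSIX)."""
    if platform == "win32":
        os.kill(pid, signal.SIGTERM)
    else:
        os.killpg(os.getpgid(pid), signal.SIGTERM)
```

A child started with subprocess.Popen(cmd, **popen_kwargs()) becomes its own session and group leader on POSIX, which is what lets terminate() reach the whole group there; on Windows only the single process is targeted.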
Usage or Command
No change needed on Linux; the guards are win32-only.
Accuracy Tests
Not applicable: platform guard additions only. No algorithmic changes.
Verified via python -c "import fastdeploy" on Windows Server 2022 (no CUDA): the module loads without POSIX-related ImportError/OSError.
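The import check described above could be scripted along these lines. This is a sketch under assumptions: the helper name imports_cleanly is hypothetical, and it only reports whether the interpreter exits with status 0, not which error occurred.

```python
import subprocess
import sys


def imports_cleanly(module: str) -> bool:
    """Return True if `python -c "import <module>"` exits with status 0."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        capture_output=True,  # keep the child's traceback out of our own output
        text=True,
    )
    return result.returncode == 0
```

Running the import in a fresh interpreter keeps any import-time crash from taking down the calling process, so the same helper can be looped over several modules.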
Checklist
- sys.platform == "win32" check
- pre-commit hooks pass (black, isort, flake8, ruff)