[CI]【Hackathon 10th Spring No.46】Windows Python runtime guards (3/3)#7329

Open
r-cloudforge wants to merge 12 commits into PaddlePaddle:develop from CloudForge-Solutions:task/h10-046-win-python-runtime-part-v2
@r-cloudforge
Motivation

Several FastDeploy Python modules use POSIX-only APIs that crash on Windows:

  • /dev/shm paths (unavailable on Windows — must use tempfile.gettempdir())
  • os.killpg() (not available on Windows — must use os.kill())
  • os.setsid as preexec_fn (not available on Windows)
  • multiprocessing.get_context("fork") (fork unavailable — must use "spawn")

This is Part 3 of 3 for 【Hackathon 10th Spring No.46】(Windows Build Support):

  • Part 1: C++ #ifndef _WIN32 guards on POSIX-only includes
  • Part 2: Build system — setup_ops.py platform-conditional link args + build.bat
  • Part 3 (this PR): Python runtime — platform guards for /dev/shm, os.killpg, os.setsid, fork

Modifications

10 Python files modified with sys.platform guards:

| File | Guard Added |
| --- | --- |
| fastdeploy/engine/common_engine.py | /dev/shm → tempfile.gettempdir() on Windows; preexec_fn=os.setsid → None on Windows |
| fastdeploy/engine/engine.py | os.killpg → os.kill on Windows; fork → spawn on Windows |
| fastdeploy/engine/expert_service.py | /dev/shm → tempfile.gettempdir() on Windows |
| fastdeploy/worker/worker_process.py | fork → spawn on Windows |
| fastdeploy/cache_manager/cache_messager.py | /dev/shm path guard |
| fastdeploy/cache_manager/prefix_cache_manager.py | /dev/shm path guard |
| fastdeploy/eplb/async_expert_loader.py | /dev/shm path guard |
| fastdeploy/inter_communicator/fmq.py | /dev/shm path guard |
| fastdeploy/inter_communicator/zmq_client.py | /dev/shm path guard |
| fastdeploy/inter_communicator/zmq_server.py | /dev/shm path guard |
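
As a rough illustration of the process-related guards listed above (preexec_fn=os.setsid and os.killpg), the sketch below launches a child and terminates it with a platform check. The helper names `launch` and `terminate` are hypothetical and do not come from the PR. One caveat worth keeping in mind: on Windows, os.kill targets a single process and does not reach its descendants the way os.killpg reaches a POSIX process group.

```python
import os
import signal
import subprocess
import sys


def launch(cmd):
    """Hypothetical helper: start cmd in its own session on POSIX only."""
    # os.setsid does not exist on Windows, so it is only referenced
    # on the POSIX branch of the conditional expression.
    preexec = None if sys.platform == "win32" else os.setsid
    return subprocess.Popen(cmd, preexec_fn=preexec)


def terminate(proc):
    """Hypothetical helper: stop the child (and its group on POSIX)."""
    if sys.platform == "win32":
        # Kills only this one process; child processes are not included.
        os.kill(proc.pid, signal.SIGTERM)
    else:
        # Signals every process in the child's process group.
        os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
    proc.wait()


child = launch([sys.executable, "-c", "import time; time.sleep(30)"])
terminate(child)
```

After `terminate` returns, `child.returncode` is set, so the caller can tell that the child has exited.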

All guards follow the pattern:

if sys.platform == "win32":
    # Windows-safe alternative
else:
    # Original POSIX code
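
A minimal sketch of how this pattern might look for two of the cases from the Motivation list, the /dev/shm path and the multiprocessing start method. The helper names `shared_dir` and `mp_context` are assumptions made for illustration; the PR's actual code may be organized differently.

```python
import multiprocessing
import sys
import tempfile


def shared_dir() -> str:
    """Hypothetical helper: directory used for IPC files."""
    if sys.platform == "win32":
        # Windows has no /dev/shm; fall back to the user's temp directory.
        return tempfile.gettempdir()
    else:
        return "/dev/shm"


def mp_context():
    """Hypothetical helper: multiprocessing context for worker processes."""
    if sys.platform == "win32":
        # fork is unavailable on Windows, so use spawn there.
        return multiprocessing.get_context("spawn")
    else:
        return multiprocessing.get_context("fork")


ipc_dir = shared_dir()
ctx = mp_context()
```

Keeping the check in one small helper per concern means the call sites stay identical on both platforms. Note that spawn starts a fresh interpreter, so arguments passed to worker processes must be picklable.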

Usage or Command

# On Windows:
import fastdeploy
# Previously: ImportError on os.setsid / /dev/shm access
# Now: gracefully uses Windows-compatible alternatives

No change needed on Linux — guards are win32-only.

Accuracy Tests

Not applicable — platform guard additions only. No algorithmic changes.

Verified via python -c "import fastdeploy" on Windows Server 2022 (no CUDA) —
module loads without POSIX-related ImportError/OSError.

Checklist

  • All POSIX-only calls guarded with sys.platform == "win32" check
  • Windows alternatives are functionally equivalent
  • No behavioral change on Linux
  • pre-commit hooks pass (black, isort, flake8, ruff)

@CLAassistant
CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


cloudforge1 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@paddle-bot
paddle-bot bot commented Apr 10, 2026

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Apr 10, 2026
@fastdeploy-bot fastdeploy-bot left a comment

🤖 AI Code Review | 2026-04-11

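📋 Review Summary

PR overview: Adds Windows runtime compatibility to FastDeploy's Python modules by providing cross-platform handling for POSIX-only APIs (/dev/shm, os.killpg, os.setsid, fork).

Scope of changes: 10 Python files gain sys.platform checks.

Impact tag: [CI]

PR Convention Check

✅ Title contains a valid tag: [CI]
✅ Description includes the required sections: Motivation, Modifications, Usage, Checklist

Issues

| Severity | File | Summary |
| --- | --- | --- |
| 🔴 Bug | fastdeploy/inter_communicator/ipc_signal.py:172 | The Windows file lock locks only 1 byte, breaking mutual exclusion |

Overall Assessment

The overall design of the PR is sound, and it handles Windows compatibility thoroughly through platform checks. Most of the changes (replacing /dev/shm paths, replacing os.killpg with os.kill, and replacing fork with spawn) are correct.

The Windows implementation of IPCLock has a serious bug: it cannot provide correct mutual exclusion in a multi-process environment and needs to be fixed.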

if sys.platform == "win32":
    import msvcrt

    msvcrt.locking(self._fd, msvcrt.LK_LOCK, 1)
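🔴 Bug: The Windows file lock implementation is incorrect.

The third argument of msvcrt.locking(fd, mode, nbytes), nbytes, is the number of bytes to lock or unlock. The current code passes 1, which means only the first byte of the file is locked.

As a result, in a multi-process environment several processes may access different parts of the same lock file at the same time, breaking mutual exclusion.

Suggested fix:

Seek to the start of the file with os.lseek(), then lock the entire file using its size: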

def acquire(self) -> None:
    """Acquire the lock (blocking). Uses kernel-level flock for atomicity."""
    if sys.platform == "win32":
        import msvcrt
        # Move to start of file
        os.lseek(self._fd, 0, os.SEEK_SET)
        # Lock the entire file (get file size)
        file_size = os.path.getsize(self._lock_path)
        msvcrt.locking(self._fd, msvcrt.LK_LOCK, file_size)
    else:
        fcntl.flock(self._fd, fcntl.LOCK_EX)

def release(self) -> None:
    """Release the lock."""
    if sys.platform == "win32":
        import msvcrt
        # Move to start of file
        os.lseek(self._fd, 0, os.SEEK_SET)
        # Unlock the entire file
        file_size = os.path.getsize(self._lock_path)
        msvcrt.locking(self._fd, msvcrt.LK_UNLCK, file_size)
    else:
        fcntl.flock(self._fd, fcntl.LOCK_UN)

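Alternatively, a simpler approach is to keep a fixed size (for example, 1 byte) but make sure every process locks the same location (starting at the beginning of the file).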
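
The fixed-size alternative could be sketched roughly as follows. The class name `FileLock` and the lock-file name are hypothetical; this is not the PR's IPCLock, only an illustration of locking one byte at a fixed offset so that all processes contend for the same region regardless of the file's size.

```python
import os
import sys
import tempfile


class FileLock:
    """Hypothetical cross-process lock (illustrative, not the PR's IPCLock)."""

    def __init__(self, path: str) -> None:
        # Create the lock file if needed; its contents are never read.
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

    def acquire(self) -> None:
        if sys.platform == "win32":
            import msvcrt

            # msvcrt.locking() starts at the current file position, so seek
            # to offset 0 first: every process then locks the same byte.
            os.lseek(self._fd, 0, os.SEEK_SET)
            # LK_LOCK retries once per second and raises OSError after
            # 10 failed attempts, so callers may want their own retry loop.
            msvcrt.locking(self._fd, msvcrt.LK_LOCK, 1)
        else:
            import fcntl

            fcntl.flock(self._fd, fcntl.LOCK_EX)

    def release(self) -> None:
        if sys.platform == "win32":
            import msvcrt

            # Unlock the same byte that acquire() locked.
            os.lseek(self._fd, 0, os.SEEK_SET)
            msvcrt.locking(self._fd, msvcrt.LK_UNLCK, 1)
        else:
            import fcntl

            fcntl.flock(self._fd, fcntl.LOCK_UN)

    def close(self) -> None:
        os.close(self._fd)


lock_path = os.path.join(tempfile.gettempdir(), "example_ipc.lock")
lock = FileLock(lock_path)
lock.acquire()
try:
    pass  # critical section would go here
finally:
    lock.release()
    lock.close()
```

Because the locked region is always byte 0, this variant does not depend on the lock file's size, which matters when the lock file is created empty.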

3 participants