[CI]【Hackathon 10th Spring No.39】fused_moe_marlin_backend unit test#7325

Open
r-cloudforge wants to merge 16 commits into PaddlePaddle:develop from
CloudForge-Solutions:task/h10-039-moe-marlin-backend-test-v2

Conversation

@r-cloudforge

Motivation

No.39: add unit tests for the fastdeploy/model_executor/layers/moe/fused_moe_marlin_backend.py module

Modifications

Add unit test tests/layers/test_fused_moe_marlin_backend.py

develop branch: coverage 0%, 115 missed lines (lines 17-361)

Current PR: coverage 100%, 0 missed lines

Note: the screenshot shows the pytest --cov output (including branch coverage) from running this test file on its own, while the figures above are estimated statement coverage after merging this PR with the tests already on develop. The two are measured differently, so the actual post-merge CI value is authoritative.

Lines covered by the new unit tests: 115 - 0 = 115 → rounded to the nearest hundred: 100 → estimated contribution 0.1⭐

Usage or Command

pytest tests/layers/test_fused_moe_marlin_backend.py

Accuracy Tests

Not needed: this PR only adds unit tests.

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but their meaning must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, please explain why in this PR.
  • Provide accuracy results.
  • If this PR targets a release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] tag in the PR title.

cloudforge1 added 16 commits March 12, 2026 23:40
…sed_moe_marlin_backend.py

Add 38 unit tests covering:
- get_scale_perms(): permutation list generation, lengths, value ranges
- marlin_permute_scales(): per-channel and per-group permutation, shapes,
  value preservation, dtype handling
- marlin_moe_permute_scales(): multi-expert permutation, independence,
  single/large expert counts
- gptq_marlin_moe_repack(): per-expert repack calls, alignment assertions
- MarlinWeightOnlyMoEMethod.__init__(): attribute names, quant config
- MarlinWeightOnlyMoEMethod.create_weights(): weight/scale parameter
  creation, shapes, dtypes for various size combinations
- MarlinWeightOnlyMoEMethod.process_loaded_weights(): expert count and
  shape assertions, repack/permute call counts
- MarlinWeightOnlyMoEMethod.apply(): output shape, dual GEMM calls,
  hookfunc invocation, noaux_tc topk, block size selection
- Edge cases: inheritance check, various hidden/intermediate sizes

All tests run on CPU with mocked GPU ops (MoeWna16MarlinGemmApi, etc.).

Rewrite test: pytest-style, CPU-runnable with sys.modules stubs,
100% coverage on fused_moe_marlin_backend.py (115/115 stmts).
Reduced from 514 to 141 lines.

layers/utils.py imports get_padding_offset at module level when
CUDA is available. This caused a collection error in CI.

The explicit GPU op allowlist missed 'append_attention' and other ops
needed by the transitive import chain (moe → attention → ops.gpu).
Replace with a ModuleType subclass whose __getattr__ returns None
for any unknown attribute.

The CI coverage job failed with ModuleNotFoundError on import
fastdeploy.model_executor.ops.gpu.deep_gemm because our _GpuOpsStub
did not handle sub-module imports. Fix:
- Set __path__=[] on stub to mark as package
- Pre-register deep_gemm in sys.modules
Preventative fix: _GpuOpsStub.__getattr__ now resolves registered
sub-modules from sys.modules. Same pattern as task 044.
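The stub pattern the two commits above describe can be illustrated in isolation. This is a minimal sketch, assuming made-up module names (fake_gpu_ops, not FastDeploy's real ops package): a package-like stub whose __getattr__ resolves pre-registered sub-modules from sys.modules and returns None for any other attribute.

```python
import sys
import types

class _GpuOpsStub(types.ModuleType):
    """Catch-all module: resolve registered sub-modules, else return None."""
    def __getattr__(self, name):
        # Look up a pre-registered sub-module in sys.modules; unknown
        # attributes resolve to None instead of raising AttributeError.
        return sys.modules.get(f"{self.__name__}.{name}")

stub = _GpuOpsStub("fake_gpu_ops")
stub.__path__ = []                       # mark the stub as a package
sys.modules["fake_gpu_ops"] = stub
# Pre-register a sub-module so `import fake_gpu_ops.deep_gemm` succeeds.
sys.modules["fake_gpu_ops.deep_gemm"] = types.ModuleType("fake_gpu_ops.deep_gemm")

import fake_gpu_ops.deep_gemm            # resolved straight from sys.modules
assert fake_gpu_ops.deep_gemm is sys.modules["fake_gpu_ops.deep_gemm"]
assert fake_gpu_ops.append_attention is None   # unknown op: harmless None
```

Because the sub-module is already in sys.modules, the import machinery never consults a finder, so the empty __path__ is enough to make the stub behave as a package.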
Address 3 Copilot review comments:
1. Save/restore sys.modules entries around GPU stub installation
   to avoid polluting the global module registry.
2-3. Add create=True to both paddle.incubate.nn.functional.swiglu
   patches so they work on Paddle builds lacking that submodule.
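The create=True fix in items 2-3 can be shown in isolation. The module name below is a stand-in, not Paddle itself: by default, mock.patch refuses to patch an attribute that does not exist on the target, which is exactly the situation on builds lacking the submodule.

```python
import sys
import types
from unittest import mock

# Stand-in for a build whose functional module lacks the `swiglu` attribute.
fake = types.ModuleType("fake_functional")
sys.modules["fake_functional"] = fake

missing_raised = False
try:
    with mock.patch("fake_functional.swiglu"):
        pass
except AttributeError:
    missing_raised = True          # default patch refuses missing attributes

# create=True lets the patch define the attribute for the test's duration.
with mock.patch("fake_functional.swiglu", create=True, return_value="ok"):
    result = fake.swiglu()

assert missing_raised
assert result == "ok"
assert not hasattr(fake, "swiglu")  # attribute is removed again on exit
```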
gptq_marlin_moe_repack() uses a lazy import at call time, and
apply() accesses moe_topk_select via module attr traversal.
Both require the stub to be in sys.modules during execution,
not just at top-level import time.
Use explicit 'lambda topk_ids: None' instead of 'lambda **_: None'
to match the keyword-argument call site in the source module.

Addresses fastdeploy-bot review suggestion.
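A tiny illustration of the lambda-signature point above, with a made-up stand-in for the source module's call site: both lambdas accept the keyword call, but only the explicit parameter fails loudly if the call site's keyword name ever drifts.

```python
# Hypothetical stand-in for the source module, which calls the patched
# hook with a keyword argument.
def call_site(hook):
    return hook(topk_ids=[0, 1, 2])

# A catch-all lambda silently accepts any keyword, hiding signature drift.
assert call_site(lambda **_: None) is None

# An explicit parameter documents the contract and raises TypeError the
# moment the call site's keyword name diverges from the patch.
assert call_site(lambda topk_ids: None) is None
```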
…_permute_scales

- test_gptq_marlin_moe_repack: verifies per-expert loop with mocked C++ op
- test_marlin_moe_permute_scales: runs pure-Python path, asserts per-expert
  output matches single-expert marlin_permute_scales

Addresses fastdeploy-bot suggestion to cover the two utility wrappers.

Addresses all three review comments:

1. Copilot AI (stub pollution): Guard import with _NEED_STUB flag — try
   real import first; when stubs are used, explicitly clean up parent-
   package attribute binding to prevent cross-test pollution.

2. Copilot AI (weak assertions): test_create_and_process now validates
   shape/dtype after create_weights AND verifies scales are non-zero
   after process_loaded_weights (catches no-op regressions).

3. fastdeploy-bot (framework): Convert from unittest.TestCase to pytest
   classes with plain assert statements, matching sibling test files
   (test_fused_moe_triton_backend.py, test_fused_moe_cutlass_backend.py).
The noaux_tc code path calls 'from moe.moe import get_moe_scores',
which triggers importing the full moe.py module. That module imports
distributed/worker/ops modules whose CUDA teardown segfaults during
process exit under coverage instrumentation.

Fix: inject a lightweight moe stub into sys.modules (alongside the
existing GPU ops stubs) so the import resolves without triggering the
real heavy import chain.
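A minimal sketch of that injected stub; the import path and symbol name are quoted from the commit message above, while the stub body is illustrative.

```python
import sys
import types

# Build a lightweight package stub so `from moe.moe import get_moe_scores`
# resolves without importing the real, heavyweight module chain.
pkg = types.ModuleType("moe")
pkg.__path__ = []                                   # mark as a package
sub = types.ModuleType("moe.moe")
sub.get_moe_scores = lambda *args, **kwargs: None   # inert placeholder
pkg.moe = sub
sys.modules["moe"] = pkg
sys.modules["moe.moe"] = sub

from moe.moe import get_moe_scores
assert get_moe_scores() is None
```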
@paddle-bot

paddle-bot bot commented Apr 10, 2026

Thanks for your contribution!

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


cloudforge1 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@paddle-bot paddle-bot bot added the contributor External developers label Apr 10, 2026

@fastdeploy-bot fastdeploy-bot left a comment


🤖 AI Code Review | 2025-04-11

📋 Review Summary

PR overview: adds unit tests for fused_moe_marlin_backend.py, covering all major functions and class methods

Scope of changes: tests/layers/test_fused_moe_marlin_backend.py

Impact tag: [CI]

Issues found

Severity | File | Summary
🔴 Bug | test_fused_moe_marlin_backend.py:41 | Syntax error: redundant else keyword

Overall assessment

The test design is sound: stubs are used to work around the GPU dependencies, and the main functionality of the module under test is covered. However, one blocking syntax error must be fixed.


class _GpuOpsStub(types.ModuleType):
"""Catch-all module: returns registered sub-modules or ``None``."""


🔴 Bug: syntax error, line 41 contains a redundant else keyword

return sub if sub is not None else else None

This is a Python syntax error (SyntaxError: invalid syntax), so the test file cannot run.

Suggested fix

return sub if sub is not None else None
