[CI]【Hackathon 10th Spring No.39】fused_moe_marlin_backend unit test#7325
Open
r-cloudforge wants to merge 16 commits into PaddlePaddle:develop
added 16 commits
March 12, 2026 23:40
…sed_moe_marlin_backend.py

Add 38 unit tests covering:
- get_scale_perms(): permutation list generation, lengths, value ranges
- marlin_permute_scales(): per-channel and per-group permutation, shapes, value preservation, dtype handling
- marlin_moe_permute_scales(): multi-expert permutation, independence, single/large expert counts
- gptq_marlin_moe_repack(): per-expert repack calls, alignment assertions
- MarlinWeightOnlyMoEMethod.__init__(): attribute names, quant config
- MarlinWeightOnlyMoEMethod.create_weights(): weight/scale parameter creation, shapes, dtypes for various size combinations
- MarlinWeightOnlyMoEMethod.process_loaded_weights(): expert count and shape assertions, repack/permute call counts
- MarlinWeightOnlyMoEMethod.apply(): output shape, dual GEMM calls, hookfunc invocation, noaux_tc topk, block size selection
- Edge cases: inheritance check, various hidden/intermediate sizes

All tests run on CPU with mocked GPU ops (MoeWna16MarlinGemmApi, etc.).
Rewrite test: pytest-style, CPU-runnable with sys.modules stubs, 100% coverage on fused_moe_marlin_backend.py (115/115 stmts). Reduced from 514 to 141 lines.
layers/utils.py imports get_padding_offset at module level when CUDA is available. This caused a collection error in CI.
… import and __main__ block
The explicit GPU op allowlist missed 'append_attention' and other ops needed by the transitive import chain (moe → attention → ops.gpu). Replace with a ModuleType subclass whose __getattr__ returns None for any unknown attribute.
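The catch-all approach described above can be sketched roughly as follows. This is an illustration only, not the PR's actual `_GpuOpsStub`: the class name `CatchAllStub` and the module name `demo_gpu_ops` are hypothetical, and the guard that lets dunder lookups fail normally is an added assumption rather than something the commit message states.

```python
import sys
import types


class CatchAllStub(types.ModuleType):
    """Hypothetical stand-in for a GPU ops module (not the PR's class)."""

    def __getattr__(self, name):
        # __getattr__ runs only after normal attribute lookup has failed.
        # Assumption: dunder lookups (e.g. __path__) should still fail
        # normally so the import machinery and introspection are not misled.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        # Any other unknown attribute resolves to None, so no allowlist
        # of op names has to be maintained.
        return None


stub = CatchAllStub("demo_gpu_ops")
sys.modules["demo_gpu_ops"] = stub

# Names that were never registered still import without error.
from demo_gpu_ops import append_attention, get_padding_offset

print(append_attention, get_padding_offset)
```

The trade-off is that a misspelled op name also resolves to `None` instead of failing at import time, so mistakes surface only when the name is called.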
…-marlin-backend-test
The CI coverage job failed with ModuleNotFoundError on import fastdeploy.model_executor.ops.gpu.deep_gemm because our _GpuOpsStub did not handle sub-module imports.

Fix:
- Set __path__=[] on stub to mark as package
- Pre-register deep_gemm in sys.modules
Preventative fix: _GpuOpsStub.__getattr__ now resolves registered sub-modules from sys.modules. Same pattern as task 044.
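The two sub-module fixes (marking the stub as a package and resolving registered sub-modules from sys.modules) might look like the sketch below. The names `PackageStub` and `demo_ops_pkg` are hypothetical, and the real `_GpuOpsStub` may be structured differently.

```python
import sys
import types


class PackageStub(types.ModuleType):
    """Hypothetical package-like stub (illustrative, not the PR's code)."""

    def __init__(self, name):
        super().__init__(name)
        # An empty __path__ makes the import system treat this as a package.
        self.__path__ = []

    def __getattr__(self, name):
        # Assumption: let dunder lookups fail normally.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        # Return a registered sub-module if one exists, else None.
        return sys.modules.get(f"{self.__name__}.{name}")


pkg = PackageStub("demo_ops_pkg")
sub = types.ModuleType("demo_ops_pkg.deep_gemm")

# Pre-register both the package and its sub-module.
sys.modules["demo_ops_pkg"] = pkg
sys.modules["demo_ops_pkg.deep_gemm"] = sub

import demo_ops_pkg.deep_gemm as deep_gemm

print(deep_gemm is sub, pkg.deep_gemm is sub, pkg.unknown_op)
```

Because the sub-module is looked up in sys.modules at attribute-access time, sub-modules registered after the stub is created are still found.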
Address 3 Copilot review comments:
1. Save/restore sys.modules entries around GPU stub installation to avoid polluting the global module registry.
2-3. Add create=True to both paddle.incubate.nn.functional.swiglu patches so they work on Paddle builds lacking that submodule.
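A minimal sketch of both ideas, assuming a hand-written context manager for the save/restore step and a plain namespace standing in for a Paddle build that lacks `swiglu`. The helper `temporary_modules` and the name `demo_stub_mod` are hypothetical; the PR may implement the restore differently.

```python
import sys
import types
from contextlib import contextmanager
from unittest import mock

_MISSING = object()


@contextmanager
def temporary_modules(stubs):
    """Install stubs in sys.modules and restore the previous entries on exit."""
    saved = {name: sys.modules.get(name, _MISSING) for name in stubs}
    sys.modules.update(stubs)
    try:
        yield
    finally:
        for name, old in saved.items():
            if old is _MISSING:
                sys.modules.pop(name, None)  # entry did not exist before
            else:
                sys.modules[name] = old      # put the original back


with temporary_modules({"demo_stub_mod": types.ModuleType("demo_stub_mod")}):
    inside = "demo_stub_mod" in sys.modules
after = "demo_stub_mod" in sys.modules

# Hypothetical namespace with no `swiglu` attribute.
functional = types.SimpleNamespace()

# create=True lets patch add an attribute that does not exist yet;
# without it, patch raises AttributeError for a missing attribute.
with mock.patch.object(functional, "swiglu", new=lambda x: x, create=True):
    patched_result = functional.swiglu(3)

print(inside, after, patched_result, hasattr(functional, "swiglu"))
```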
gptq_marlin_moe_repack() uses a lazy import at call time, and apply() accesses moe_topk_select via module attr traversal. Both require the stub to be in sys.modules during execution, not just at top-level import time.
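The timing issue can be illustrated with a toy function that imports lazily. Everything here (`repack_with_lazy_import`, `demo_lazy_ops`) is hypothetical and only mirrors the shape of the problem.

```python
import sys
import types
from unittest import mock


def repack_with_lazy_import(x):
    # The import is resolved when the function is called, not when the
    # enclosing module is first imported.
    from demo_lazy_ops import repack
    return repack(x)


stub = types.ModuleType("demo_lazy_ops")
stub.repack = lambda x: x * 2

# The stub is present for the duration of the call: the lazy import succeeds.
with mock.patch.dict(sys.modules, {"demo_lazy_ops": stub}):
    during = repack_with_lazy_import(21)

# Once the stub has been removed, the same call can no longer resolve it.
try:
    repack_with_lazy_import(21)
    failed_outside = False
except ModuleNotFoundError:
    failed_outside = True

print(during, failed_outside)
```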
Use explicit 'lambda topk_ids: None' instead of 'lambda **_: None' to match the keyword-argument call site in the source module. Addresses fastdeploy-bot review suggestion.
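As a small illustration of the difference between the two stub signatures (the argument values and the misspelled keyword are made up for the example): both accept a keyword call with `topk_ids`, but only the explicit form rejects a keyword the call site does not actually use.

```python
explicit_stub = lambda topk_ids: None
permissive_stub = lambda **_: None

# Both accept the keyword-argument call.
explicit_stub(topk_ids=[0, 1])
permissive_stub(topk_ids=[0, 1])

# The permissive form silently accepts a misspelled keyword...
permissive_stub(topk_idx=[0, 1])

# ...while the explicit form raises TypeError, which would surface
# a drift between the stub and the call site.
try:
    explicit_stub(topk_idx=[0, 1])
    caught_typo = False
except TypeError:
    caught_typo = True

print(caught_typo)
```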
…_permute_scales
- test_gptq_marlin_moe_repack: verifies per-expert loop with mocked C++ op
- test_marlin_moe_permute_scales: runs pure-Python path, asserts per-expert output matches single-expert marlin_permute_scales

Addresses fastdeploy-bot suggestion to cover the two utility wrappers.
Addresses all three review comments:
1. Copilot AI (stub pollution): Guard import with _NEED_STUB flag. Try the real import first; when stubs are used, explicitly clean up the parent-package attribute binding to prevent cross-test pollution.
2. Copilot AI (weak assertions): test_create_and_process now validates shape/dtype after create_weights AND verifies scales are non-zero after process_loaded_weights (catches no-op regressions).
3. fastdeploy-bot (framework): Convert from unittest.TestCase to pytest classes with plain assert statements, matching sibling test files (test_fused_moe_triton_backend.py, test_fused_moe_cutlass_backend.py).
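The "try the real import first" guard from item 1 could be sketched as below. The helpers `import_or_stub` and `remove_stub` and the module name `demo_missing_gpu_ops` are hypothetical; the PR's `_NEED_STUB` logic is not shown in this conversation and may differ.

```python
import importlib
import sys
import types


def import_or_stub(name):
    """Return (module, need_stub); a stub is installed only if the import fails."""
    try:
        return importlib.import_module(name), False
    except ImportError:
        stub = types.ModuleType(name)
        sys.modules[name] = stub
        return stub, True


def remove_stub(name):
    """Drop the stub from sys.modules and from its parent package, if bound there."""
    sys.modules.pop(name, None)
    parent_name, _, child = name.rpartition(".")
    parent = sys.modules.get(parent_name)
    if parent is not None and child in vars(parent):
        delattr(parent, child)


# A module that exists is imported for real; no stub is needed.
real, need_stub_real = import_or_stub("json")

# A module that does not exist falls back to a stub, which is cleaned up after use.
fake, need_stub_fake = import_or_stub("demo_missing_gpu_ops")
if need_stub_fake:
    remove_stub("demo_missing_gpu_ops")

print(need_stub_real, need_stub_fake)
```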
The noaux_tc code path calls 'from moe.moe import get_moe_scores', which triggers importing the full moe.py module. That module imports distributed/worker/ops modules whose CUDA teardown segfaults during process exit under coverage instrumentation. Fix: inject a lightweight moe stub into sys.modules (alongside the existing GPU ops stubs) so the import resolves without triggering the real heavy import chain.
Thanks for your contribution!
cloudforge1 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
fastdeploy-bot
suggested changes
Apr 10, 2026
fastdeploy-bot
left a comment
🤖 AI Code Review | 2025-04-11
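📋 Review Summary
PR overview: Adds unit tests for fused_moe_marlin_backend.py, covering all major functions and class methods
Scope of changes: tests/layers/test_fused_moe_marlin_backend.py
Impact tag: [CI]
Issues found
| Severity | File | Summary |
|---|---|---|
| 🔴 Bug | test_fused_moe_marlin_backend.py:41 | Syntax error: extra else keyword |
Overall assessment
The test design is sound: it uses stubs to handle GPU dependencies and covers the main functionality of the module under test. However, there is one blocking syntax error that must be fixed.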
class _GpuOpsStub(types.ModuleType):
    """Catch-all module: returns registered sub-modules or ``None``."""
🔴 Bug Syntax error: line 41 contains an extra else keyword
return sub if sub is not None else else None
This raises a Python syntax error (SyntaxError: invalid syntax), so the test file cannot run.
Suggested fix:
return sub if sub is not None else None
Motivation
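No.39: add unit tests for the functional module fastdeploy/model_executor/layers/moe/fused_moe_marlin_backend.py
Modifications
Add unit test file tests/layers/test_fused_moe_marlin_backend.py
develop branch: coverage 0%, 115 missed lines (17-361)
This PR: coverage 100%, 0 missed lines
Lines newly covered by unit tests: 115 - 0 = 115 → rounded to 100 → estimated contribution 0.1⭐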
Usage or Command
Accuracy Tests
no need
Checklist
[FDConfig],[APIServer],[Engine],[Scheduler],[PD Disaggregation],[Executor],[Graph Optimization],[Speculative Decoding],[RL],[Models],[Quantization],[Loader],[OP],[KVCache],[DataProcessor],[BugFix],[Docs],[CI],[Optimization],[Feature],[Benchmark],[Others],[XPU],[HPU],[GCU],[DCU],[Iluvatar],[Metax]
… pre-commit before commit.
… release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.