[CI]【Hackathon 10th Spring No.39】fused_moe_marlin_backend unit test#7325

Open
r-cloudforge wants to merge 16 commits into PaddlePaddle:develop from
CloudForge-Solutions:task/h10-039-moe-marlin-backend-test-v2

Conversation

@r-cloudforge

Motivation

No.39: add unit tests for the fastdeploy/model_executor/layers/moe/fused_moe_marlin_backend.py module

Modifications

Add unit test tests/layers/test_fused_moe_marlin_backend.py

develop branch: coverage 0%, 115 missed lines (lines 17-361)

Current PR: coverage 100%, 0 missed lines

Note: the screenshot shows the pytest --cov output (including branch coverage) from running this test file on its own, while the figures above are estimated statement coverage after merging this PR with the tests already on develop. The two are measured differently, so the actual post-merge CI value is authoritative.

Lines covered by the new unit tests: 115 - 0 = 115 → rounded to the nearest hundred: 100 → estimated contribution 0.1⭐

Usage or Command

pytest tests/layers/test_fused_moe_marlin_backend.py

Accuracy Tests

Not needed: this PR only adds unit tests.

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but their meaning must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, please explain why in this PR.
  • Provide accuracy results.
  • If this PR targets a release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] tag in the PR title.

cloudforge1 added 16 commits March 12, 2026 23:40
…sed_moe_marlin_backend.py

Add 38 unit tests covering:
- get_scale_perms(): permutation list generation, lengths, value ranges
- marlin_permute_scales(): per-channel and per-group permutation, shapes,
  value preservation, dtype handling
- marlin_moe_permute_scales(): multi-expert permutation, independence,
  single/large expert counts
- gptq_marlin_moe_repack(): per-expert repack calls, alignment assertions
- MarlinWeightOnlyMoEMethod.__init__(): attribute names, quant config
- MarlinWeightOnlyMoEMethod.create_weights(): weight/scale parameter
  creation, shapes, dtypes for various size combinations
- MarlinWeightOnlyMoEMethod.process_loaded_weights(): expert count and
  shape assertions, repack/permute call counts
- MarlinWeightOnlyMoEMethod.apply(): output shape, dual GEMM calls,
  hookfunc invocation, noaux_tc topk, block size selection
- Edge cases: inheritance check, various hidden/intermediate sizes

All tests run on CPU with mocked GPU ops (MoeWna16MarlinGemmApi, etc.).

Rewrite test: pytest-style, CPU-runnable with sys.modules stubs,
100% coverage on fused_moe_marlin_backend.py (115/115 stmts).
Reduced from 514 to 141 lines.

layers/utils.py imports get_padding_offset at module level when
CUDA is available. This caused a collection error in CI.

The explicit GPU op allowlist missed 'append_attention' and other ops
needed by the transitive import chain (moe → attention → ops.gpu).
Replace with a ModuleType subclass whose __getattr__ returns None
for any unknown attribute.

The CI coverage job failed with ModuleNotFoundError on import
fastdeploy.model_executor.ops.gpu.deep_gemm because our _GpuOpsStub
did not handle sub-module imports. Fix:
- Set __path__=[] on stub to mark as package
- Pre-register deep_gemm in sys.modules
Preventative fix: _GpuOpsStub.__getattr__ now resolves registered
sub-modules from sys.modules. Same pattern as task 044.
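The stub pattern the two commits above describe can be illustrated in isolation. This is a minimal sketch, assuming made-up module names (fake_gpu_ops, not FastDeploy's real ops package): a package-like stub whose __getattr__ resolves pre-registered sub-modules from sys.modules and returns None for any other attribute.

```python
import sys
import types

class _GpuOpsStub(types.ModuleType):
    """Catch-all module: resolve registered sub-modules, else return None."""
    def __getattr__(self, name):
        # Look up a pre-registered sub-module in sys.modules; unknown
        # attributes resolve to None instead of raising AttributeError.
        return sys.modules.get(f"{self.__name__}.{name}")

stub = _GpuOpsStub("fake_gpu_ops")
stub.__path__ = []                       # mark the stub as a package
sys.modules["fake_gpu_ops"] = stub
# Pre-register a sub-module so `import fake_gpu_ops.deep_gemm` succeeds.
sys.modules["fake_gpu_ops.deep_gemm"] = types.ModuleType("fake_gpu_ops.deep_gemm")

import fake_gpu_ops.deep_gemm            # resolved straight from sys.modules
assert fake_gpu_ops.deep_gemm is sys.modules["fake_gpu_ops.deep_gemm"]
assert fake_gpu_ops.append_attention is None   # unknown op: harmless None
```

Because the sub-module is already in sys.modules, the import machinery never consults a finder, so the empty __path__ is enough to make the stub behave as a package.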
Address 3 Copilot review comments:
1. Save/restore sys.modules entries around GPU stub installation
   to avoid polluting the global module registry.
2-3. Add create=True to both paddle.incubate.nn.functional.swiglu
   patches so they work on Paddle builds lacking that submodule.
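The create=True fix in items 2-3 can be shown in isolation. The module name below is a stand-in, not Paddle itself: by default, mock.patch refuses to patch an attribute that does not exist on the target, which is exactly the situation on builds lacking the submodule.

```python
import sys
import types
from unittest import mock

# Stand-in for a build whose functional module lacks the `swiglu` attribute.
fake = types.ModuleType("fake_functional")
sys.modules["fake_functional"] = fake

missing_raised = False
try:
    with mock.patch("fake_functional.swiglu"):
        pass
except AttributeError:
    missing_raised = True          # default patch refuses missing attributes

# create=True lets the patch define the attribute for the test's duration.
with mock.patch("fake_functional.swiglu", create=True, return_value="ok"):
    result = fake.swiglu()

assert missing_raised
assert result == "ok"
assert not hasattr(fake, "swiglu")  # attribute is removed again on exit
```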
gptq_marlin_moe_repack() uses a lazy import at call time, and
apply() accesses moe_topk_select via module attr traversal.
Both require the stub to be in sys.modules during execution,
not just at top-level import time.
Use explicit 'lambda topk_ids: None' instead of 'lambda **_: None'
to match the keyword-argument call site in the source module.

Addresses fastdeploy-bot review suggestion.
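A tiny illustration of the lambda-signature point above, with a made-up stand-in for the source module's call site: both lambdas accept the keyword call, but only the explicit parameter fails loudly if the call site's keyword name ever drifts.

```python
# Hypothetical stand-in for the source module, which calls the patched
# hook with a keyword argument.
def call_site(hook):
    return hook(topk_ids=[0, 1, 2])

# A catch-all lambda silently accepts any keyword, hiding signature drift.
assert call_site(lambda **_: None) is None

# An explicit parameter documents the contract and raises TypeError the
# moment the call site's keyword name diverges from the patch.
assert call_site(lambda topk_ids: None) is None
```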
…_permute_scales

- test_gptq_marlin_moe_repack: verifies per-expert loop with mocked C++ op
- test_marlin_moe_permute_scales: runs pure-Python path, asserts per-expert
  output matches single-expert marlin_permute_scales

Addresses fastdeploy-bot suggestion to cover the two utility wrappers.

Addresses all three review comments:

1. Copilot AI (stub pollution): Guard import with _NEED_STUB flag — try
   real import first; when stubs are used, explicitly clean up parent-
   package attribute binding to prevent cross-test pollution.

2. Copilot AI (weak assertions): test_create_and_process now validates
   shape/dtype after create_weights AND verifies scales are non-zero
   after process_loaded_weights (catches no-op regressions).

3. fastdeploy-bot (framework): Convert from unittest.TestCase to pytest
   classes with plain assert statements, matching sibling test files
   (test_fused_moe_triton_backend.py, test_fused_moe_cutlass_backend.py).
The noaux_tc code path calls 'from moe.moe import get_moe_scores',
which triggers importing the full moe.py module. That module imports
distributed/worker/ops modules whose CUDA teardown segfaults during
process exit under coverage instrumentation.

Fix: inject a lightweight moe stub into sys.modules (alongside the
existing GPU ops stubs) so the import resolves without triggering the
real heavy import chain.
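A minimal sketch of that injected stub; the import path and symbol name are quoted from the commit message above, while the stub body is illustrative.

```python
import sys
import types

# Build a lightweight package stub so `from moe.moe import get_moe_scores`
# resolves without importing the real, heavyweight module chain.
pkg = types.ModuleType("moe")
pkg.__path__ = []                                   # mark as a package
sub = types.ModuleType("moe.moe")
sub.get_moe_scores = lambda *args, **kwargs: None   # inert placeholder
pkg.moe = sub
sys.modules["moe"] = pkg
sys.modules["moe.moe"] = sub

from moe.moe import get_moe_scores
assert get_moe_scores() is None
```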
@paddle-bot

paddle-bot bot commented Apr 10, 2026

Thanks for your contribution!

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


cloudforge1 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@paddle-bot paddle-bot bot added the contributor External developers label Apr 10, 2026

@fastdeploy-bot fastdeploy-bot left a comment


🤖 AI Code Review | 2025-04-11

📋 Review Summary

PR overview: adds unit tests for fused_moe_marlin_backend.py, covering all major functions and class methods

Scope of changes: tests/layers/test_fused_moe_marlin_backend.py

Impact tag: [CI]

Issues found

Severity | File | Summary
🔴 Bug | test_fused_moe_marlin_backend.py:41 | Syntax error: redundant else keyword

Overall assessment

The test design is sound: stubs are used to work around the GPU dependencies, and the main functionality of the module under test is covered. However, one blocking syntax error must be fixed.


class _GpuOpsStub(types.ModuleType):
"""Catch-all module: returns registered sub-modules or ``None``."""


🔴 Bug: syntax error, line 41 contains a redundant else keyword

return sub if sub is not None else else None

This is a Python syntax error (SyntaxError: invalid syntax), so the test file cannot run.

Suggested fix

return sub if sub is not None else None
