[BugFix] Check optional params before .get() call in gqa_rope_write_cache#7311
[BugFix] Check optional params before .get() call in gqa_rope_write_cache#7311K11OntheBoat wants to merge 1 commit intoPaddlePaddle:release/2.5from
Conversation
|
Thanks for your contribution! |
|
K11OntheBoat seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
fastdeploy-bot
left a comment
There was a problem hiding this comment.
🤖 AI Code Review |
2026-04-10
📋 Review 摘要
PR 概述:修复 gqa_rope_write_cache 中对空 optional tensor 使用 .get() 方法导致的 Assert 错误
变更范围:custom_ops/gpu_ops/append_attn/gqa_rope_write_cache.cu
影响面 Tag:[OP]
问题
| 级别 | 文件 | 概述 |
|---|---|---|
| 🟡 建议 | 代码整体 | 修复范围可能不完整,其他分支仍存在 .get() 调用 |
总体评价
PR 的修复思路和实现是正确的,通过将可选参数改为指针并使用三元运算符检查,有效避免了 optional tensor 为空时的 Assert 错误。但修复范围可能不完整,同一文件中 CascadeAppendWriteCacheKVC8QKV 和 CascadeAppendWriteCacheKVC4QKV 的调用处(行 1500-1525,不在本次 diff 范围内)仍使用 .get(),存在相同潜在问题。
建议确认这些调用路径是否也需要修复,或添加文档说明已知限制。
| reinterpret_cast<NV_TYPE *>(v_out->data<T>()), | ||
| reinterpret_cast<NV_TYPE *>(const_cast<T *>( | ||
| cache_k_dequant_scales.data<T>())), | ||
| cache_k_dequant_scales->data<T>())), |
There was a problem hiding this comment.
💡 设计说明:修复基于一个重要假设
当 cache_quant_type 不是 "none" 时,调用方必须提供相应的 optional tensor(如 cache_k_dequant_scales 等)。如果调用方未提供,当前代码会空指针解引用,行为未定义。
建议在 AppendCacheKV 函数开头添加参数验证,提供更清晰的错误信息。
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## release/2.5 #7311 +/- ##
==============================================
Coverage ? 69.96%
==============================================
Files ? 390
Lines ? 54424
Branches ? 8582
==============================================
Hits ? 38076
Misses ? 13610
Partials ? 2738
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Motivation
在gqa_rope_write_cache中, 存在针对 为空的optional tensor 使用.get()方法的情况;导致Paddle框架Assert错误:
python: paddle/include/paddle/utils/optional.h:563: paddle::optional::reference_const_type paddle::optional::get() const [with T = paddle::Tensor; reference_const_type = const paddle::Tensor&]: Assertion `this->is_initialized()' failed.”
Modifications
此PR采用更标准的C++写法,用指针表示可选参数, 不存在的Optional不使用 .get()方法,参数用nullptr向下传入的方法,可以修复此Assertion报错.