Fix hang for TestCTE.Concurrent test #10662
base: master
[APPROVALNOTIFIER] This PR is NOT APPROVED
📝 Walkthrough
Introduces a shared read–write lock (rw_lock) in
Changes
sequenceDiagram
participant Reader as CTEReader
participant CTE as CTE
participant Part as CTEPartition
Reader->>CTE: acquire shared rw_lock (shared)
Reader->>Part: lock partition mutex (mu)
alt no block available
Reader->>CTE: unlock shared rw_lock
Reader->>Part: wait on cv_for_test (waiting releases mu)
Part-->>Reader: notify cv_for_test
Reader->>CTE: acquire shared rw_lock (shared)
Reader->>Part: re-lock partition mutex (mu)
end
Reader->>CTE: proceed with block processing
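The reader flow in the diagram can be sketched in plain C++ as follows. This is a minimal sketch, not the PR's code: `DemoCTE`, `DemoPartition`, `has_block`, `waitForBlock`, `publishBlock`, and `runReaderDemo` are hypothetical stand-ins, and it assumes the writer updates the predicate while holding the partition mutex.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical stand-ins for CTE / CTEPartition; all names are illustrative.
struct DemoPartition
{
    std::mutex mu;
    std::condition_variable cv;
    bool has_block = false; // protected by mu
};

struct DemoCTE
{
    std::shared_mutex rw_lock;
    DemoPartition partition;
};

// Reader: take rw_lock (shared) first, then the partition mutex.
inline void waitForBlock(DemoCTE & cte)
{
    std::shared_lock<std::shared_mutex> shared(cte.rw_lock);
    std::unique_lock<std::mutex> part(cte.partition.mu);
    while (!cte.partition.has_block)
    {
        shared.unlock();             // do not sleep while holding rw_lock
        cte.partition.cv.wait(part); // releases mu while sleeping
        part.unlock();               // drop mu so the order stays rw_lock -> mu
        shared.lock();
        part.lock();
    }
    assert(shared.owns_lock() && part.owns_lock());
    // A block would be processed here with both locks held.
}

// Writer (assumed): flip the predicate under mu, then notify.
inline void publishBlock(DemoCTE & cte)
{
    {
        std::lock_guard<std::mutex> guard(cte.partition.mu);
        cte.partition.has_block = true;
    }
    cte.partition.cv.notify_all();
}

inline bool runReaderDemo()
{
    DemoCTE cte;
    std::thread reader([&cte] { waitForBlock(cte); });
    publishBlock(cte);
    reader.join();
    return cte.partition.has_block;
}
```

Because the predicate is written under `mu` in this sketch, the reader cannot miss the notification; whether the real notifier does the same is not shown in the diagram.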
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
/cc @windtalker @gengliqi
/run-check-issue-triage-complete
dbms/src/Operators/CTE.h (outdated)
CTEPartition & getPartitionForTest(size_t partition_idx) { return this->partitions[partition_idx]; }

std::shared_mutex * getRWLockForTest() { return &(this->rw_lock); }
How about using a reference instead of a pointer?
done
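For illustration, a reference-returning accessor might look like the sketch below. `DemoCTE` and `lockThroughAccessor` are hypothetical names, and only the accessor signature mirrors the one discussed in this thread.

```cpp
#include <cassert>
#include <shared_mutex>

// Hypothetical class; only the accessor shape follows the review suggestion.
class DemoCTE
{
public:
    // Returning a reference rules out a null result and removes the
    // dereference at every call site.
    std::shared_mutex & getRWLockForTest() { return rw_lock; }

private:
    std::shared_mutex rw_lock;
};

inline bool lockThroughAccessor()
{
    DemoCTE cte;
    // The reference binds directly to the lock guard's constructor argument.
    std::shared_lock<std::shared_mutex> guard(cte.getRWLockForTest());
    assert(guard.owns_lock());
    return guard.owns_lock();
}
```

With the pointer version, the same call site would need `*cte.getRWLockForTest()` and, strictly speaking, a null check.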
/cc @windtalker
/cc @windtalker @gengliqi
// For example: in `CTE::tryGetBlockAt`, we will lock rw_lock first then lock partition.mu.
// If locking partition.mu first here, `CTE::tryGetBlockAt` may have locked rw_lock. Then
// each of them needs to lock the other lock, but the other lock has been locked now.
rw_lock.lock();
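The ordering rule described in that code comment can be sketched as follows. This is an illustrative sketch under assumptions: `DemoLocks`, `lockInDocumentedOrder`, `relockAfterWait`, and `runOrderingDemo` are hypothetical, and the choice of exclusive versus shared mode on `rw_lock` is made only for the demo.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical pair of locks standing in for CTE::rw_lock and partition.mu.
struct DemoLocks
{
    std::shared_mutex rw_lock;
    std::mutex mu;
    int counter = 0; // protected by mu
};

// Path 1: the documented order, rw_lock first and then mu.
inline void lockInDocumentedOrder(DemoLocks & locks)
{
    std::unique_lock<std::shared_mutex> rw(locks.rw_lock);
    std::lock_guard<std::mutex> part(locks.mu);
    ++locks.counter;
}

// Path 2: starts out holding only mu (the state right after a condition
// variable wait returns), so it drops mu before taking rw_lock.
inline void relockAfterWait(DemoLocks & locks)
{
    std::unique_lock<std::mutex> part(locks.mu);
    part.unlock(); // never block on rw_lock while holding mu
    std::shared_lock<std::shared_mutex> rw(locks.rw_lock);
    part.lock();
    ++locks.counter;
}

inline int runOrderingDemo(int iterations)
{
    DemoLocks locks;
    std::thread a([&locks, iterations] {
        for (int i = 0; i < iterations; ++i)
            lockInDocumentedOrder(locks);
    });
    std::thread b([&locks, iterations] {
        for (int i = 0; i < iterations; ++i)
            relockAfterWait(locks);
    });
    a.join();
    b.join();
    assert(locks.counter == 2 * iterations);
    return locks.counter;
}
```

If `relockAfterWait` kept `mu` while blocking on `rw_lock`, each thread could end up holding the lock the other one needs, which is the inversion the comment warns about.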
It looks very odd to me to manually release and re-acquire the lock after the wait. Why do we need rw_lock here? Could we hold rw_lock only before the while loop?
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
dbms/src/Operators/CTEReader.cpp (1)
48-62: Potential missed wakeup between unlocking rw_lock and wait()
Between Line 60 and Line 61, is_eof can flip + notify while this thread hasn’t started waiting yet (notifier doesn’t take partition.mu), which can still deadlock the test. Consider guarding the predicate with the same mutex or using a timed wait to avoid indefinite sleep.
🛠️ Minimal mitigation using a bounded wait
-    rw_lock.unlock();
-    partition.cv_for_test->wait(lock);
+    rw_lock.unlock();
+    partition.cv_for_test->wait_for(lock, std::chrono::milliseconds(10));
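A self-contained sketch of the bounded-wait mitigation is shown below. It is an illustration under assumptions, not the PR's code: `DemoState`, `waitUntilEof`, `signalEofWithoutMutex`, and `runBoundedWaitDemo` are hypothetical, and `is_eof` is modeled as an atomic flag that the notifier sets without taking the mutex.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical state; is_eof is assumed to be written without holding mu.
struct DemoState
{
    std::mutex mu;
    std::condition_variable cv;
    std::atomic<bool> is_eof{false};
};

// Waiter: re-check the flag at least every 10 ms, so a notification that
// fires before the wait starts only delays the wakeup instead of hanging.
inline void waitUntilEof(DemoState & state)
{
    std::unique_lock<std::mutex> lock(state.mu);
    while (!state.is_eof.load())
        state.cv.wait_for(lock, std::chrono::milliseconds(10));
    assert(state.is_eof.load());
}

// Notifier: flips the flag and notifies without taking mu, which is the
// situation that makes an unbounded wait() unsafe.
inline void signalEofWithoutMutex(DemoState & state)
{
    state.is_eof.store(true);
    state.cv.notify_all();
}

inline bool runBoundedWaitDemo()
{
    DemoState state;
    std::thread waiter([&state] { waitUntilEof(state); });
    signalEofWithoutMutex(state);
    waiter.join();
    return state.is_eof.load();
}
```

The timed wait trades a small amount of polling for robustness. Setting the flag under the same mutex the waiter holds would remove the race without polling, which is the other option the review comment mentions.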
🧹 Nitpick comments (1)
dbms/src/Operators/CTE.h (1)
188-193: Limit test-only lock accessor to debug builds
getRWLockForTest() is public and available in release builds. Consider guarding it with #ifndef NDEBUG to prevent production misuse.
♻️ Proposed change
-#ifndef NDEBUG
-    CTEPartition & getPartitionForTest(size_t partition_idx) { return this->partitions[partition_idx]; }
-#endif
-
-    std::shared_mutex & getRWLockForTest() { return this->rw_lock; }
+#ifndef NDEBUG
+    CTEPartition & getPartitionForTest(size_t partition_idx) { return this->partitions[partition_idx]; }
+    std::shared_mutex & getRWLockForTest() { return this->rw_lock; }
+#endif
What problem does this PR solve?
Issue Number: close #10636
Problem Summary:
What is changed and how it works?
Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Bug Fixes
Tests