Optimize boolean operations #9284
run benchmark filter_kernels boolean_kernels arrow_reader arrow_reader_clickbench
🤖 Hi @Dandandan, thanks for the request (#9284 (comment)).
Please choose one or more of these with
🤖: Benchmark completed
🤖: Benchmark completed
🤖: Benchmark completed
🤖: Benchmark completed
run benchmark boolean_kernels filter_kernels
🤖: Benchmark completed
🤖: Benchmark completed
run benchmark boolean_kernels
🤖: Benchmark completed
run benchmark boolean_kernels
🤖: Benchmark completed
🤖: Benchmark completed
🤖: Benchmark completed
🤖: Benchmark completed
🤖: Benchmark completed
run benchmark filter_kernels boolean_kernels arrow_reader_clickbench
FYI @alamb, this is starting to look good performance-wise.
    }

    // both buffers have the same offset, we can use UnalignedBitChunk for both
    let left_chunks = UnalignedBitChunk::new(left, left_offset_in_bits, len_in_bits);
I think we don't actually need this part: we can use the byte-aligned path above and set the correct offset.
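To make the byte-aligned idea concrete, here is a minimal sketch of a binary AND over two bitmaps that share a byte-aligned bit offset. It is not the code from this PR: the function name `and_byte_aligned` and its signature are hypothetical, and it assumes both inputs use the same offset and least-significant-bit-first ordering.

```rust
/// Hypothetical helper (not from the PR): AND two bitmaps that share the same
/// byte-aligned bit offset, processing 8 bytes at a time where possible.
pub fn and_byte_aligned(
    left: &[u8],
    right: &[u8],
    offset_in_bits: usize,
    len_in_bits: usize,
) -> Vec<u8> {
    // Assumption: this sketch only covers the byte-aligned case.
    assert_eq!(offset_in_bits % 8, 0, "offset must be byte-aligned");
    let start = offset_in_bits / 8;
    let num_bytes = (len_in_bits + 7) / 8;
    let l = &left[start..start + num_bytes];
    let r = &right[start..start + num_bytes];

    let mut out = Vec::with_capacity(num_bytes);
    let mut l_chunks = l.chunks_exact(8);
    let mut r_chunks = r.chunks_exact(8);

    // Main loop: combine one 64-bit word per iteration.
    for (a, b) in l_chunks.by_ref().zip(r_chunks.by_ref()) {
        let mut wa = [0u8; 8];
        let mut wb = [0u8; 8];
        wa.copy_from_slice(a);
        wb.copy_from_slice(b);
        let word = u64::from_le_bytes(wa) & u64::from_le_bytes(wb);
        out.extend_from_slice(&word.to_le_bytes());
    }

    // Tail: fewer than 8 bytes remain, so combine them one byte at a time.
    for (a, b) in l_chunks.remainder().iter().zip(r_chunks.remainder()) {
        out.push(a & b);
    }

    // Clear any bits past `len_in_bits` in the final byte.
    let trailing = len_in_bits % 8;
    if trailing != 0 {
        if let Some(last) = out.last_mut() {
            *last &= (1u8 << trailing) - 1;
        }
    }
    out
}
```

Because the offset is a whole number of bytes, the inputs can be sliced directly and no bit shifting across byte boundaries is needed, which is what makes this path simpler than an unaligned chunk iterator.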
        result.truncate(chunks.num_bytes());
    }
    let src = src.as_ref();
    let chunks = UnalignedBitChunk::new(src, offset_in_bits, len_in_bits);
Probably not needed now that we have a fast byte aligned version.
    }

    #[inline]
    fn fold<B, F>(mut self, init: B, mut f: F) -> B
The idea here is to provide a faster fold implementation so that from_trusted_len_iter is fast. (I used AI assistance to come up with the implementation, but it looks normal.)
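As an illustration of what overriding `fold` can look like, here is a minimal sketch on a made-up bit iterator. `BitIter` is hypothetical and is not the iterator changed in this PR; the point is only that `fold` can load each byte once and shift bits out of it, instead of recomputing an index and bounds check per bit as `next` does.

```rust
/// Hypothetical bit iterator (not the type from this PR): yields the bits of
/// `bytes` least-significant bit first.
pub struct BitIter<'a> {
    bytes: &'a [u8],
    pos: usize, // index of the next bit to yield
    len: usize, // total number of bits
}

impl<'a> BitIter<'a> {
    pub fn new(bytes: &'a [u8], len: usize) -> Self {
        assert!(len <= bytes.len() * 8, "len exceeds the bits available");
        Self { bytes, pos: 0, len }
    }

    #[inline]
    fn bit_at(&self, i: usize) -> bool {
        (self.bytes[i / 8] >> (i % 8)) & 1 == 1
    }
}

impl<'a> Iterator for BitIter<'a> {
    type Item = bool;

    fn next(&mut self) -> Option<bool> {
        if self.pos >= self.len {
            return None;
        }
        let bit = self.bit_at(self.pos);
        self.pos += 1;
        Some(bit)
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.len - self.pos;
        (remaining, Some(remaining))
    }

    // Internal iteration: same items in the same order as `next`, but whole
    // bytes are loaded once and consumed by shifting.
    #[inline]
    fn fold<B, F>(mut self, init: B, mut f: F) -> B
    where
        F: FnMut(B, bool) -> B,
    {
        let mut acc = init;
        // Finish a partially consumed leading byte bit by bit.
        while self.pos < self.len && self.pos % 8 != 0 {
            acc = f(acc, self.bit_at(self.pos));
            self.pos += 1;
        }
        // Whole bytes.
        while self.len - self.pos >= 8 {
            let mut byte = self.bytes[self.pos / 8];
            for _ in 0..8 {
                acc = f(acc, byte & 1 == 1);
                byte >>= 1;
            }
            self.pos += 8;
        }
        // Trailing bits of the last, partial byte.
        while self.pos < self.len {
            acc = f(acc, self.bit_at(self.pos));
            self.pos += 1;
        }
        acc
    }
}
```

Callers that consume the iterator through `fold`, or through adapters built on it, get the faster loop without any change on their side.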
-    let mut dst = buffer.data.as_ptr();
-    for item in iterator {
+    let mut dst = buffer.data.as_ptr() as *mut T;
+    iterator.for_each(|item| {
for_each is implemented in terms of fold, so it picks up a more efficient fold implementation when the iterator provides one.
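A minimal sketch of the same pattern outside of arrow-rs: filling a preallocated buffer through a raw pointer from inside `for_each`. The helper `collect_exact` is hypothetical and uses a plain `Vec` rather than `MutableBuffer`. It relies on the standard library's default `for_each` forwarding to `fold`, which is how it is currently implemented rather than a documented guarantee.

```rust
/// Hypothetical helper (not the arrow-rs API): collect an iterator that is
/// expected to yield exactly `len` items by writing through a raw pointer.
pub fn collect_exact<T, I>(iterator: I, len: usize) -> Vec<T>
where
    I: Iterator<Item = T>,
{
    let mut out: Vec<T> = Vec::with_capacity(len);
    let start = out.as_mut_ptr();
    let mut written = 0usize;

    // `for_each` drives the iterator internally, so an iterator with a
    // specialised `fold` can run its fast loop here.
    iterator.for_each(|item| {
        assert!(written < len, "iterator yielded more than `len` items");
        // SAFETY: `written < len <= capacity`, so the slot is inside the
        // allocation and has not been initialised yet.
        unsafe { start.add(written).write(item) };
        written += 1;
    });

    assert_eq!(written, len, "iterator yielded fewer than `len` items");
    // SAFETY: exactly `len` elements were initialised above.
    unsafe { out.set_len(len) };
    out
}
```

The bounds assertion keeps the sketch sound even if the length is wrong; a trusted-length variant would drop that check and make the caller responsible for the length, which is why such functions are marked `unsafe`.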
🤖: Benchmark completed
🤖: Benchmark completed
run benchmark coalesce_kernels
🤖: Benchmark completed
🤖: Benchmark completed
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?