IN LIST: reinterpret small-width types for bitmap filters #23013

Draft
geoffreyclaude wants to merge 3 commits into apache:main from geoffreyclaude:perf/in_list_reinterpret_bitmaps
@geoffreyclaude geoffreyclaude commented Jun 18, 2026

Contributor

Which issue does this PR close?

Rationale for this change

#23011 and #23012 add bitmap lookups for unsigned 1-byte and 2-byte integers, and #23035 unifies those concrete filters behind one shared bitmap implementation. This PR lets other same-width primitive types reuse those same bitmaps without copying or converting the values.

The key idea is that some types have different meanings but the same physical shape in memory. For example:

  • UInt8 stores one byte.
  • Int8 also stores one byte.
  • UInt16 stores two bytes.
  • Int16 also stores two bytes.

The bitmap only cares about the exact bits. So an Int8 value can be viewed as its one-byte bit pattern and checked with the UInt8 bitmap. No new array is allocated and the underlying Arrow value buffer is shared.

That is what “zero-copy reinterpretation” means here: keep the same bytes, but use a lookup filter whose storage type matches the byte width.
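The idea can be sketched with nothing but the standard library. This is a minimal illustration, not the PR's code: `ByteBitmap` and its methods are hypothetical names, and a plain `[u64; 4]` stands in for the real filter storage. The bitmap is built from the needles' bit patterns, so a signed value only needs to be viewed as its unsigned byte before the lookup.

```rust
/// 256-bit membership bitmap: one bit per possible byte value.
/// `ByteBitmap` is a hypothetical name for illustration, not the PR's type.
struct ByteBitmap {
    words: [u64; 4],
}

impl ByteBitmap {
    /// Build the bitmap from signed needles by storing each needle's bit pattern.
    fn from_i8_needles(needles: &[i8]) -> Self {
        let mut words = [0u64; 4];
        for &n in needles {
            let bits = n as u8; // same byte, unsigned view (-1 becomes 0xFF)
            words[(bits >> 6) as usize] |= 1u64 << (bits & 63);
        }
        Self { words }
    }

    /// Test membership using only the byte's bit pattern.
    fn contains_bits(&self, bits: u8) -> bool {
        (self.words[(bits >> 6) as usize] >> (bits & 63)) & 1 == 1
    }

    /// Look up a signed value by viewing it as its unsigned bit pattern.
    fn contains_i8(&self, value: i8) -> bool {
        self.contains_bits(value as u8)
    }
}
```

Because `value as u8` keeps the byte unchanged, `-1` and `0xFF` land on the same bit, which is why one unsigned bitmap can serve both types.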

What changes are included in this PR?

  • Adds a helper that reinterprets a primitive Arrow array as another primitive type with the same width.
  • Makes the helper slice-aware, so sliced Arrow arrays still start at the correct logical offset.
  • Wraps bitmap filters so signed 1-byte and 2-byte primitive arrays can reuse the unsigned bitmap storage.
  • Validates source and needle widths before using the reinterpreted path.
  • Adds focused coverage for signed boundary values, bit patterns, and sliced arrays.
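As a rough sketch of the helper's two guarantees (width validation and correct handling of slices), the following uses plain Rust slices in place of Arrow arrays. The `PlainInt` trait and `reinterpret` function are assumptions made for this example and are not the names used in the PR.

```rust
use std::mem::{align_of, size_of};

/// Marker for plain integer types where every bit pattern is a valid value.
/// Hypothetical trait for this sketch; `unsafe` because implementors promise that property.
unsafe trait PlainInt: Copy + 'static {}
unsafe impl PlainInt for i8 {}
unsafe impl PlainInt for u8 {}
unsafe impl PlainInt for i16 {}
unsafe impl PlainInt for u16 {}

/// View `src` as a slice of `D` without copying, or return `None` on a width mismatch.
fn reinterpret<S: PlainInt, D: PlainInt>(src: &[S]) -> Option<&[D]> {
    if size_of::<S>() != size_of::<D>() || align_of::<S>() != align_of::<D>() {
        return None;
    }
    // SAFETY: same size and alignment, same element count, and every bit
    // pattern is a valid `D` by the `PlainInt` contract.
    Some(unsafe { std::slice::from_raw_parts(src.as_ptr().cast::<D>(), src.len()) })
}
```

A sub-slice such as `&values[1..4]` already points at its first logical element, so the reinterpreted view starts at the same address and has the same length. An Arrow array carries an explicit offset instead, which is the part the real helper has to respect.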

Are these changes tested?

Yes.

  • cargo fmt --all
  • cargo test -p datafusion-physical-expr reinterpreted_bitmap_handles_signed_boundaries_and_slices --lib
  • cargo test -p datafusion-physical-expr test_in_list_from_array_type_combinations --lib
  • cargo test -p datafusion-physical-expr in_list_int_types --lib
  • cargo test -p datafusion-physical-expr test_in_list_dictionary_types --lib
  • cargo clippy -p datafusion-physical-expr --all-targets --all-features -- -D warnings

Are there any user-facing changes?

No. This is an internal performance optimization only.

Local benchmark snapshot

Benchmark command:

cargo bench -p datafusion-physical-expr --profile release-nonlto --bench in_list_strategy -- --save-baseline <name>

Method: compare adjacent saved baselines using raw Criterion sample minima (min(time / iters)). Lower is better; changes within +/-5% are treated as noise. These numbers were not rerun after splitting the behavior-preserving bitmap unification into #23035.

Compared baselines: #23035 -> #23013

Relevant scope: signed 16-bit reinterpretation rows.

Summary: 6 relevant rows, 6 faster, 0 slower, 0 within +/-5%.

| Benchmark | Before | After | Change |
| --- | --- | --- | --- |
| narrow_integer/i16/list=256/match=0% | 19.15 us | 4.00 us | -79.1% (4.79x faster) |
| narrow_integer/i16/list=256/match=50% | 31.32 us | 4.00 us | -87.2% (7.82x faster) |
| narrow_integer/i16/list=4/match=0% | 16.79 us | 4.01 us | -76.1% (4.18x faster) |
| narrow_integer/i16/list=4/match=50% | 34.80 us | 4.01 us | -88.5% (8.69x faster) |
| narrow_integer/i16/list=64/match=0% | 19.21 us | 4.11 us | -78.6% (4.68x faster) |
| narrow_integer/i16/list=64/match=50% | 34.72 us | 4.01 us | -88.5% (8.66x faster) |
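
For readers who want to re-derive the Change column, here is a small sketch of the arithmetic described under Method. The function names are hypothetical, and the `(time, iters)` pairs are an assumed stand-in for Criterion's raw samples. The per-row figures in the table were presumably computed from unrounded minima, so recomputing from the rounded Before/After values can differ in the last digit.

```rust
/// Minimum per-iteration time over raw samples, i.e. min(time / iters).
/// Each sample is an assumed `(total_time_us, iterations)` pair.
fn sample_min(samples: &[(f64, u64)]) -> Option<f64> {
    samples
        .iter()
        .map(|&(time_us, iters)| time_us / iters as f64)
        .reduce(f64::min)
}

/// Relative change in percent; negative means the "after" run is faster.
fn change_percent(before_us: f64, after_us: f64) -> f64 {
    (after_us - before_us) / before_us * 100.0
}

/// Speedup factor, e.g. 4.79 for "4.79x faster".
fn speedup(before_us: f64, after_us: f64) -> f64 {
    before_us / after_us
}
```

For the first row, `change_percent(19.15, 4.00)` is about -79.1 and `speedup(19.15, 4.00)` is about 4.79.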

@github-actions github-actions Bot added the physical-expr Changes to the physical-expr crates label Jun 18, 2026
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_reinterpret_bitmaps branch 2 times, most recently from cc752d0 to 9925e82 Compare June 18, 2026 08:52
@geoffreyclaude geoffreyclaude changed the title Implement Zero-Copy Reinterpretation and enable Int8/Int16 Bitmaps IN LIST: reinterpret small-width types for bitmap filters Jun 18, 2026
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_reinterpret_bitmaps branch 2 times, most recently from 65f008f to 08fbe39 Compare June 19, 2026 05:35
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_reinterpret_bitmaps branch from 08fbe39 to 1db5627 Compare June 19, 2026 05:55
@github-actions github-actions Bot added the auto detected api change Auto detected API change label Jun 19, 2026
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_reinterpret_bitmaps branch from 1db5627 to c0a301a Compare June 22, 2026 13:50
@github-actions github-actions Bot removed the auto detected api change Auto detected API change label Jun 22, 2026
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_reinterpret_bitmaps branch 3 times, most recently from 26f47f3 to 14e21ff Compare June 24, 2026 20:49
adriangb pushed a commit to pydantic/datafusion that referenced this pull request Jun 26, 2026
## Which issue does this PR close?

- Part of apache#19241.
- Stacked on apache#23011.
- Next in stack: apache#23035.
- Extracted from apache#19390.

## Rationale for this change

apache#23011 uses a bitmap checklist for `UInt8`, where there are 256 possible
values. `UInt16` is the same idea with a larger value range: 0 through
65,535.

That is still small enough to represent directly. A `UInt16` bitmap
needs one bit for each possible value:

- 65,536 possible values
- 65,536 bits total
- 8 KB of memory

Then a lookup is still simple: use the input value as the bit position
and check whether that bit is set. For example, if the list contains
`42`, bit `42` is set, and every input row with value `42` can be
recognized with one bit test.

This PR keeps the scope narrow: it adds the unsigned 2-byte bitmap path
as a concrete `UInt16` filter. apache#23035 then unifies the `UInt8` and
`UInt16` implementations, and apache#23013 uses that shared shape for signed
same-width reinterpretation.

## What changes are included in this PR?

- Adds `UInt16BitmapFilter`, backed by a heap-allocated 65,536-bit
bitmap.
- Routes `UInt16` constant-list filtering to that bitmap path.
- Keeps the same `IN` / `NOT IN` null behavior as the generic path.
- Adds focused coverage for `UInt16` boundary values, nulls, and `NOT
IN`.
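
For reference, the null behavior in question can be sketched as below. This assumes the generic path follows standard SQL three-valued logic; `sql_in` and `sql_not_in` are hypothetical helpers, with `None` standing in for SQL NULL.

```rust
/// SQL three-valued `IN`: `Some(true)`, `Some(false)`, or `None` for NULL.
fn sql_in(value: Option<u16>, list: &[Option<u16>]) -> Option<bool> {
    let v = value?; // a NULL input yields NULL
    if list.contains(&Some(v)) {
        Some(true)
    } else if list.contains(&None) {
        None // no match, but a NULL in the list makes the answer unknown
    } else {
        Some(false)
    }
}

/// `NOT IN` negates a known result and leaves NULL as NULL.
fn sql_not_in(value: Option<u16>, list: &[Option<u16>]) -> Option<bool> {
    sql_in(value, list).map(|b| !b)
}
```

A bitmap filter answers only the match or no-match part, so the NULL cases have to be layered on top in the same way as for any other lookup strategy.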

## Are these changes tested?

Yes.

- `cargo fmt --all`
- `cargo test -p datafusion-physical-expr bitmap_filter_u16 --lib`
- `cargo test -p datafusion-physical-expr in_list_int_types --lib`
- `cargo test -p datafusion-physical-expr
test_in_list_from_array_type_combinations --lib`
- `cargo test -p datafusion-physical-expr test_in_list_dictionary_types
--lib`
- `cargo clippy -p datafusion-physical-expr --all-targets --all-features
-- -D warnings`

## Are there any user-facing changes?

No. This is an internal performance optimization only.

<!-- codex-benchmark-start -->
## Benchmark note

No local `in_list_strategy` numbers are included for this PR because the
benchmark harness does not currently include a direct `UInt16` case. The
available `i16` rows measure the signed reinterpretation path added in
apache#23013 after the bitmap unification in apache#23035, not this PR's unsigned
`UInt16` bitmap filter.
<!-- codex-benchmark-end -->
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_reinterpret_bitmaps branch 3 times, most recently from e1b8b0b to efcd10e Compare June 26, 2026 15:14
Introduces zero-copy buffer reinterpretation so that signed integers and other 1-byte or 2-byte primitive types (e.g. Float16) can use the high-performance bitmap filters. The reinterpreted path applies to every primitive type whose width is 1 or 2 bytes.
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_reinterpret_bitmaps branch from efcd10e to 3896a90 Compare June 26, 2026 16:21