Add blog: Sort Pushdown in DataFusion: Skip Sorts, Skip Decode, Skip I/O by zhuqi-lucas · Pull Request #186 · apache/datafusion-site

zhuqi-lucas · 2026-05-14T07:12:23Z

Summary

Blog post on the multi-release sort pushdown effort in Apache DataFusion (v52 → v55+), restructured around the newly-merged apache/datafusion#22450 as the centerpiece.

The post now covers three landed capabilities plus two future directions, all built on top of the Dynamic Filters primitive:

Exact path — PushdownSort deletes SortExec when statistics prove the scan is already ordered. BufferExec keeps the multi-partition path from regressing on SPM stalls.
Inexact path — runtime reorder for TopK and DESC queries. File-level early stop already worked; row-group-level early stop was the gap.
Runtime row-group dynamic pruning (#22450) — the new piece. At every row-group boundary inside an open file, the live TopK threshold drives a fresh PruningPredicate, and matching row groups are physically removed from the open RecordBatch stream via into_builder() → with_row_groups() → build(). Zero I/O, zero decompress, zero decode, not even the filter column.

Together these compose into a three-layer pruning stack (file + row-group + row), all driven by the same TopK dynamic filter.

Benchmarks documented

Suite	Headline
`sort_pushdown` (Exact path)	`ORDER BY ... LIMIT` runs 27× / 49× faster; full `ORDER BY` runs ~2× faster
`topk_tpch` (#22450, TPC-H SF1, `LIMIT 100`)	5 of 11 queries get 3–4× faster, 0 regressions, total runtime drops −44% (248.8 ms → 139.1 ms)

What changed since the original draft

Restructured around #22450 (just merged) as the post's centerpiece, with new sections on:
- Architecture · three eras of who drives the I/O + decode loop (pre-#20839 / #20839 / #22450)
- transition() · drain · decide · drive
- RowGroupPruner · watch · rebuild · prune
- Cascading prune · how the heap eats row groups
- Three-layer pruning stack (file + RG + row, stacked)
- topk_tpch benchmark
Reframed the Exact Path with honest credit: the trivial case (declared ordering + matching on-disk file list) has always worked via EnsureRequirements; Phase 2 (#21182) is what closes the wrong-on-disk-order gap.
Replaced the old Bottlenecks + Roadmap framing with a tight Future Directions section: A) page-level Exact reverse (gated on arrow-rs#9937); B) page-level dynamic prune at RG boundary (#23216, pure DataFusion — no new arrow-rs API needed).
Added explicit dependency acknowledgement: none of the runtime parts of this post are possible without TopK's dynamic filter pushdown — the prior Dynamic Filters post is recommended background.
Added a reference to the umbrella EPIC issue #23036 for phase-by-phase tracking of all the related PRs and follow-ups.
Cross-checked every code identifier (type names, function names, file paths) against the actual apache/datafusion and apache/arrow-rs sources. Dropped names that no longer exist (e.g. EnforceSorting → EnsureRequirements, PagePruningPredicate → PagePruningAccessPlanFilter).
Pulled 9 PNGs from the community-call deck for the #22450-specific diagrams; corrected two whose text overlays no longer matched the body (cascade threshold wording; transition() LOC framing).
Date filename moved to 2026-07-05-sort-pushdown.md to better align with publish timing.

Test plan

Rendered locally with the Pelican Docker image — all 18 images render, the TOC builds, all internal links resolve.
All PR / issue / blog-post links checked against current state (merged vs in-flight); closed PRs (#18817, #21712) and closed issues (#17348) removed from "in flight / open".
All type/function names grep-verified against current apache/datafusion and apache/arrow-rs source.
PR moves to ready-for-review when reviewers are ready to look.

🤖 Generated with Claude Code

A walkthrough of the sort pushdown work landed and in flight on Apache DataFusion. Covers: - Why SortExec is expensive and what `Exact` / `Inexact` ordering mean at runtime (static `fetch` vs `TopK` dynamic filter). - Phase 1 (#19064): the `PushdownSort` rule + reverse row-group case. - Phase 2 (#21182): statistics-based file sort that upgrades `Unsupported` to `Exact`, eliminating the `SortExec` on non-overlapping ASC scans. Includes the BufferExec compensation (#21426) so the SPM above doesn't lose its implicit memory buffer. - Reverse scans: today's row-group reverse (Inexact, #18817) and the community decision to wait for arrow-rs page-level reverse (#9937) before pursuing Exact reverse, after memory-profile pushback on the original row-group-level proposal. - Benchmarks: 2.1×-49× on the ASC-LIMIT sort_pushdown suite. - What's next: the dynamic / TopK-driven path (#21351 merged, #21733, #21712, #21956, #21580) including the precise RG-pruning vs mid-stream-early-return distinction, and the EnsureRequirements unification (#21976). - Links into the prior dynamic filters and limit pruning posts so the series reads as a coherent thread.

The post previously only mentioned #21956 in passing. The PR is landing the full mechanism — `try_pushdown_sort` decision tree, two flags on `ParquetSource`, three composable runtime steps (file reorder + RG reorder + reverse), and a `sort_prefix`- preserving short-circuit — so cover it as a dedicated phase between Phase 2 and the existing reverse-scan section. - TL;DR: add a Phase 3 bullet alongside Phase 1 and Phase 2. - Phase 1: replace the in-flight `#21956` aside with a cross-link to the new section. - Phase 2: keep the caveat about function-wrapped sorts but note that #21956's `Inexact` path now covers them via monotonicity inference. - New `## Phase 3` section with two SVG diagrams: a decision tree for the three protocol outcomes, and a three-step pipeline for the `Inexact` runtime. Covers the two-flag design, the nested file/RG layers, when EXPLAIN surfaces each flag, and four scenarios where Phase 3 does not fire (aggregations, multi- column secondary keys, function-wrapped sorts without a declared ordering, source declares a forward prefix of the request). - "What's Next": rename the old "Phase 3 — filter + sort" bullet to "Filtered reverse TopK end-to-end" so the label doesn't clash with the new section, and add a follow-up bullet referencing #22198 for multi-column / function-wrapped reorder.

Add an ### Empirical note subsection inside "Reverse Scans for `ORDER BY ... DESC`" that records what we found running an in-house RG-level `Exact` reverse against upstream `Inexact` + `TopK`: - `LIMIT N` does not propagate as a static stop signal in the Inexact path. The dynamic filter pushdown can stats-prune *subsequent* row groups once the threshold tightens, but inside the row group `TopK` is currently reading the sort column has to be fully decoded so the filter can be evaluated row by row. `LIMIT 10` on a 1M-row row group is still ~1M sort-column decodes regardless of N. LIMIT only saves work on non-sort columns inside that row group and on whole subsequent row groups the threshold prunes. - `SortExec` stays on top of `Inexact`, so the final ordering pass and per-row heap maintenance are both extra costs the `Exact` path (which deletes `SortExec`) does not pay. Then explain why we run RG-level `Exact` in production but did not upstream it: parquet does not allow partial row-group reads, so any RG-level `Exact` implementation peaks at one whole row group (~128 MB) of decoded data in memory — the same constraint that closed `#18817`. Our runtime advantage comes from skipping heap / filter / `SortExec` overhead, not from decoding less. Frame the page-level `Exact` reverse work in arrow-rs `#9937` as the path that keeps the runtime win we measured while bringing peak memory back into the streaming regime via `OffsetIndex` page-level seek.

…tion Two corrections in the empirical-note / reverse-scans section: 1. The internal RG-level `Exact` reverse path was incorrectly described as "walks pages from the back, decodes each batch, reverses row-wise, stops the moment K rows have been delivered." That is actually the page-level `Exact` shape (arrow-rs #9937), not the in-house RG-level implementation. Parquet does not allow partial row-group reads, so the in-house path has to decode the entire target row group, reverse the buffer in memory, take the first K rows, and stop — same memory profile as #18817's proposal. The runtime advantage over `Inexact` + `TopK` comes from removing the per-row heap maintenance, dynamic-filter evaluation, and `SortExec` final ordering pass, not from decoding less sort-column data. Sort col decode on the target row group is the same on both paths. 2. The arrow-rs #9937 paragraph I previously added duplicated the technical detail already present in the long-standing "That primitive is the page-level reverse traversal..." paragraph. Replaced with a one-sentence bridge ("The fix that keeps both the runtime win and a streaming memory profile is page-level `Exact` reverse via arrow-rs #9937, described next.") so the existing paragraph carries the explanation without repetition.

All three sort-pushdown PRs have now landed, so the chronological 'Phase 1/2/3' framing is less useful for readers than a capability breakdown. Sections are now: - The PushdownSort Rule (#19064) - Sort Elimination via Statistics (#21182) - Runtime Reorder for TopK Convergence (#21956) - Reverse Scans for ORDER BY ... DESC (unchanged) In-body Phase references replaced with PR numbers or capability names; anchor links updated; references section restructured.

…ction' The previous wording 'nest by construction' could be read as a code- enforced property. It's actually a logical consequence of using the same sort key (min) at both file and row-group level: a file's min(col) is just the minimum over its row groups' min(col) values, so the most-promising file contains the most-promising row group. The rewritten paragraph spells that out and ties it to why TopK's dynamic filter tightens fast.

Match the dynamic-filter blog's style: narrative talks about capabilities/mechanisms, not 'PR #21956 did X / PR #19064 introduced Y'. The 81 inline PR/issue references in the body were dropping the reader out of the narrative; they belong in a single Issues-and-PRs list at the end. Changes: - TL;DR: drop 6 inline PR refs from the 4 capability bullets - Body sections (PushdownSort Rule, Sort Elimination, Runtime Reorder, Reverse Scans, Empirical Note): drop ~30 inline refs to historical PRs; replace with capability names or 'the rule' / 'the runtime reorder path' style descriptions - What's Next: switch from [#NNNNN] format to named markdown links (matching dynamic-filter's Future Work style) - Issues for new contributors: same conversion - References section: rewrite using full URLs (no link-ref indirection); split into 'Landed' vs 'In flight / open' for clarity Net: ~90 lines removed, all PR/issue numbers now consolidated at the bottom of the post.

The previous draft mixed merged work, in-flight work, and runtime-cost analysis into a single 'Reverse Scans' section and a sprawling 'What's Next' section. Reorganize so the post answers three clear questions in sequence: 1. What's merged today? (Sort Elimination via Statistics + benchmark, Runtime Reorder for TopK Convergence, Reverse Scans for DESC) — unchanged content, just kept tight. 2. Where do those merged features still leave performance on the table? New 'Current Bottlenecks' section with three explicitly numbered bottlenecks: SortExec stays / sort column fully decoded inside open RG / file-granular scheduling can't close the tap mid-file. Pulls in the runtime-cost content that used to be buried in an 'Empirical note' subsection. 3. How does each next-step optimization remove a specific bottleneck? New 'Roadmap' section maps page-level Exact reverse to bottlenecks 1+2, row-group-level dynamic early termination to bottleneck 3, and shows the in-flight 17x-60x pipeline benchmark as a preview of what stacking these mechanisms can deliver. Smaller follow-ups (EnsureRequirements, OFFSET pushdown, multi-column reorder) collected at the end of the roadmap section as a short 'Other follow-ups' bullet list.

- 'Why not full Exact reverse' paragraph: cut reviewer attribution and forward-pointers that were already in the bottleneck/roadmap sections that follow. - TL;DR: trim Runtime Reorder + Reverse Scans bullets to capability and impact; drop implementation mechanics like 'stamps two flags' and 'three-step pipeline'. - 'The PushdownSort Rule' section: cut three paragraphs of 'covered in X below' forward-references that were repeating the section TOC. - Function-wrapped parenthetical in Sort Elimination: 4 lines to 2. - Single-partition vs multi-partition edge case: drop the trailing 'which is why the example is drawn that way' tangent. - 'What this change does not affect' note: trimmed redundant prose. - Remove all references to the limit-pruning blog (intro mention, link definition, References section bullet) — that work is about static LIMIT as an I/O optimization, separate problem from sort ordering.

…mmetry explicit Per code in datasource-parquet/src/source.rs:849-870, the reversed- satisfies branch is 'strictly more powerful than the column-in-schema check' — it runs the request through EquivalenceProperties's full reasoning machinery and handles function monotonicity, constants from filters, equivalence relationships, and multi-column composite orderings. The blog had been treating reverse as just step 3 of the runtime pipeline, which undersold its standalone reach. Structural changes: - Drop the standalone 'Reverse Scans for ORDER BY DESC' H2 section; reverse is now a case of the Inexact runtime reorder path. - Rename Runtime Reorder section to 'Runtime Reorder for TopK and DESC Queries'; intro now lists three classes that fall outside Exact (unsorted, overlapping, DESC). - 'try_pushdown_sort' subsection rewritten as 'Two independent triggers for Inexact', describing column-in-schema vs reversed- satisfies as separate signals with the latter being strictly more powerful. - 'Three runtime steps' subsection: step 3 now explicitly notes when steps 1-2 are skipped and only the iteration reverse runs. - New 'ORDER BY DESC in practice' subsection right after the 3-step pipeline, explaining the RGs-descending-x-rows-ascending stream. - Move reverse-scan.svg from the deleted Reverse Scans section into the Roadmap > Page-level Exact reverse subsection where it illustrates the 128 MB vs 1 MB peak comparison directly. Accuracy fix: - 'Multi-column reorder follow-ups' bullet was inaccurate — said the reorder 'only fires on plain columns'. The reverse path does handle function-wrapped and multi-column cases via EquivalenceProperties; only the stats reorder step is restricted. Updated wording to scope the limitation correctly.

…shdown EnsureRequirements (#21976) is a rule-unification effort for EnforceDistribution+EnforceSorting. Touches the same area but isn't a sort pushdown optimization. OFFSET pushdown (#21828) is about LIMIT/OFFSET pruning. Same kind of tangent as the limit-pruning blog reference removed earlier — it's LIMIT optimization, not sort pushdown. The remaining 'Multi-column and function-wrapped reorder follow-ups' bullet is actually directly about sort pushdown's reorder step (#22198), so it stays. With the other two removed, 'Other follow-ups' collapsed to a single point — promoted to its own subsection 'Extending the stats reorder step' for clarity. Also dropped the corresponding entries from the References section.

… (k-way merge) The 'switch work queue from PartitionedFile to RG descriptors' fix is sufficient for non-overlapping ranges (post-reorder), where the first file globally has the smallest values and subsequent files are already stats-pruned. For overlapping ranges, the next smallest value could sit in any of several open files — matching the non-overlap efficiency requires explicit k-way merge across open files' next-RG mins. The dynamic filter does this implicitly (RGs with max < threshold are dropped), but explicit comparison closes the tap earlier when the filter tightens slowly.

Restructure the post so the merged Runtime Row-Group Dynamic Pruning (#22450) is the centerpiece, matching the community-talk narrative: - TL;DR now leads with the three-layer pruning stack (file + RG + row) and the topk_tpch headline (5/11 queries 3-4x, 0 regressions, -44% total). - Exact path keeps credit attribution explicit: EnforceSorting already handled the simple on-disk-sorted case; Phase 2 closes the wrong-on-disk-order gap. BufferExec story expanded with the SPM stall diagnosis. - Inexact path split into Tier 1 (file-level, already had early stop) vs Tier 2 (RG-level, no early stop until #22450) so the gap that #22450 closes is named explicitly. - New section: #22450 mechanics - architecture-3-eras, transition() drain/decide/drive, RowGroupPruner watch/rebuild/prune, cascading prune walkthrough. - New section: Three-layer pruning stack with the same DynamicFilter driving all three layers, plus the topk_tpch benchmark table. - Old "Current Bottlenecks" + "Roadmap" merged into "Future Directions": A) page-level Exact reverse (arrow-rs #9937), B) page-level dynamic prune at RG boundary (#23216). - Updated landed-PRs list to include #20839 (push-based decoder, the prerequisite for #22450) and #22450 itself. Pulls in 9 PNGs from the community-call deck for the #22450-specific diagrams (architecture, transition, pruner, cascade, three-layer stack, topk_tpch results, page-level future).

Copilot

Pull request overview

Adds a new long-form blog post documenting Apache DataFusion’s multi-release “sort pushdown” work (v52→v55+), centered on runtime row-group dynamic pruning (#22450), and introduces a set of new diagrams (SVGs + referenced PNGs) to support the narrative.

Changes:

Added a new blog post: Sort Pushdown in DataFusion: Skip Sorts, Skip Decode, Skip I/O (scheduled-dated 2026-07-05).
Added multiple new SVG diagrams under content/images/sort-pushdown/ illustrating sort pushdown phases, decision flow, runtime pipeline, buffering, and benchmarks.
Wired the post to those assets via /blog/images/sort-pushdown/... image references.

Reviewed changes

Copilot reviewed 1 out of 19 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
content/blog/2026-07-05-sort-pushdown.md	New blog post covering sort pushdown and runtime pruning, referencing the accompanying diagrams/assets.
content/images/sort-pushdown/reverse-scan.svg	New diagram contrasting row-group reverse vs page-level reverse scan.
content/images/sort-pushdown/pr21956-runtime-pipeline.svg	New diagram of the runtime reorder pipeline driven by ParquetSource flags.
content/images/sort-pushdown/pr21956-decision.svg	New decision-tree diagram for `try_pushdown_sort` outcomes.
content/images/sort-pushdown/plan-diff.svg	New before/after EXPLAIN diagram showing sort elimination.
content/images/sort-pushdown/phase2-stats-overlap.svg	New diagram explaining non-overlap vs overlap using min/max stats.
content/images/sort-pushdown/phase1-file-reorder.svg	New diagram showing file reorder by min/max stats to enable sort elimination.
content/images/sort-pushdown/buffer-exec.svg	New diagram explaining `BufferExec` as a bounded per-partition buffer.
content/images/sort-pushdown/buffer-exec-stall.svg	New diagram illustrating SPM stalls without buffering vs behavior with `BufferExec`.
content/images/sort-pushdown/benchmark.svg	New benchmark bar-chart diagram for the `sort_pushdown` suite results.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

alamb · 2026-06-29T15:03:43Z

@zhuqi-lucas is this blog ready for review?

zhuqi-lucas · 2026-06-29T15:08:44Z

sort-pushdown-community-call.pdf

@zhuqi-lucas is this blog ready for review?

Thanks @alamb , it should be ready, and mostly i copied content from my recent China Meetup slide above.

alamb · 2026-07-03T18:08:34Z

Starting to check this one out

alamb

Thanks for this blog @zhuqi-lucas - I got about half way thought it in detail and I think it could be published as is so I am approving it.

However, I think we could potentially make it a much stronger narrative so I spent quite a while providing a bunch of stylistic feedback for your review

Suggestion: Make the writing more concise

There are several places in this blog (I highlighted some of them) where I think you can get across the same idea with about 1/3 of the text -- this will make the blog much easier to read and understand in my opinion.

LLMs are pretty good at helping reduce the content length if you give them that as an explicit goal and help identify unnecessary content (I left several inline suggestions)

Suggestion: reframe as a general Database technology

My biggest suggestion is to change how the blog presents the information (the information is great).

I think this post is currently structured as a "what did we do" narrative arc and present the way the work unfolded over time. However, I think that may not be as interesting to people as a post that teaches them some general principles that may be applicable to their own systems (using our work on DataFusion as an example)

So concretely that would mean something like

Explain the problem (data is often partly sorted and the query engine doesn't know about the existing sortedness )
Explain techniques to take advantage of pre-existing sort orders
present results about how well it works

Suggestion: any time you refer to the internals of DataFUsion, link to the docs

I think most readers will not know what a SortPreservingMergeExec is. So I recommend either

refer to this in more general terms ("sorted merge opertator")
link to the docs.rs or source code so it is clear what you are talking about

Some links not working

for some reason links with ``` are not working. I think this is due to the asf pelican rendering system

alamb · 2026-07-03T18:25:27Z

+[Apache Parquet]: https://parquet.apache.org/
+
+[Apache DataFusion] has always been able to skip the sort in the
+*trivial* case: when the catalog declares an ordering (`WITH ORDER`


I am not sure I would call the code for the existing sort orderings "trivial" 😆 -- I would probably use the term "exact" -- so like "been able to skip the sort in the exact case"

Maybe you could also link to this blog from @akurmustafa that explains what the existing function is : https://datafusion.apache.org/blog/2025/03/11/ordering-analysis/

I also recommend linking to the docs about WITH ORDER as it isn't SQL standard (it is a datafusion specific thing)

alamb · 2026-07-03T18:27:47Z

+or parquet `sorting_columns`) **and** the on-disk file listing
+already matches that order, the `EnsureRequirements` rule sees that
+the scan's `output_ordering` already satisfies the request and


This is kind of a lot of detail -- I know what it means but many readers might not -- it might help to write this in non-datafusion technical terms. So instead of "when the catalog declares ..." something more like

"when the data on disk is perfectly sorted and table definition includes the ordering, DataFusion can avoid redundant sorts, as explained in the blog ...."

alamb · 2026-07-03T18:29:25Z

+or parquet `sorting_columns`) **and** the on-disk file listing
+already matches that order, the `EnsureRequirements` rule sees that
+the scan's `output_ordering` already satisfies the request and
+**removes the redundant `SortExec`**. The hard cases were


And then in this next sentence, rather than calling it "hard" maybe call it "another" case and describe in terms of user visible behavior: "This blog is about taking advantage of sorting even when the files may not be perfectly sorted and DataFusion is not told about any ordering up front"

alamb · 2026-07-03T18:32:41Z

+
+*Qi Zhu, [Massive](https://www.massive.com/)*
+
+Many [Apache Parquet] datasets are already sorted on disk. Time-series


I suggest you start this blog with a 2-3 sentence abstract / summary of its contents, so that a potential reader can quickly decide if they want to read it or not

Maybe something like

"DataFusion now automatically takes advantage of sortedess in the data, even when the data is only partially sorted and DataFusion does not know the sort order ahead of time. This blog explains why this is an important usecase and how Datafusion achieves this, through a combination of pushing down sort information, and adaptive runtime reordering and pruning"

alamb · 2026-07-03T18:44:16Z

+*Qi Zhu, [Massive](https://www.massive.com/)*
+
+Many [Apache Parquet] datasets are already sorted on disk. Time-series
+files are usually written in ingestion-time order. Event logs are sharded


Here are a few other motivating observations that might be worth working in

Data often has a partial sort order that is either not entirely sorted (e.g. ingestion order vs data order) or else the database doesn't know about the existing ordering

Modern data lakes based on Apache Icerberg and others often have to work with data as it is / was written and don't have the luxury of making a second copy, or resorting -- so being able to automatically exploit existing sortedness is important

alamb · 2026-07-03T20:22:30Z

+[#19064]: https://github.com/apache/datafusion/pull/19064
+[#21182]: https://github.com/apache/datafusion/pull/21182
+
+* Three files: `a.parquet`, `b.parquet`, `c.parquet`.


this is a good example

alamb · 2026-07-03T20:23:40Z

+**strips the ordering entirely**. From the optimizer's point of view
+the scan now has no declared ordering. Physical planning translates
+the user's `ORDER BY` into a `SortExec`, and the earlier pipeline
+rule [`EnsureRequirements`] (which consolidates the old


BTW this doesnt' seem to be a link

alamb · 2026-07-03T20:24:42Z

+  in the **wrong order** on disk (e.g. alphabetical by name, not by
+  time).
+
+`validated_output_ordering()` looks at this, sees that the


I think this is a lot of implementation detail that will not mean much to anyone else -- just saying "previously DataFusion could not recognize the sort order, but now it can" is probably enough, with a link to the code for anyone who wants more details

alamb · 2026-07-03T20:25:58Z

+
+[`EnsureRequirements`]: https://github.com/apache/datafusion/blob/main/datafusion/physical-optimizer/src/ensure_requirements/mod.rs
+
+Stats-based sort elimination fixes this in `PushdownSort`, which


This is also full of implementation details -- I suggest you focus on just explaining the conditions under which the transformation can be done (non overlapping min/max boundaries)

alamb · 2026-07-03T20:28:56Z

+sorts via monotonicity inference — but stats-based sort elimination
+still needs a plain column.)
+
+<img src="/blog/images/sort-pushdown/phase2-stats-overlap.svg" alt="Detecting non-overlapping ranges via min/max statistics" width="100%" class="img-fluid"/>


The image is great -- I recommend adding a figure label to explain it to someone who is just skimming (not reading hte article linerly)

zhuqi-lucas added 14 commits June 29, 2026 10:29

Push draft date to 2026-05-25

f137266

Mark #21956 as landed (merged via merge queue)

0fc60d4

zhuqi-lucas force-pushed the blog-sort-pushdown branch from 970b6d6 to 03d1e79 Compare June 29, 2026 02:29

zhuqi-lucas changed the title ~~Add blog: Sort Pushdown in DataFusion: Skip Sorts, Skip I/O~~ Add blog: Sort Pushdown in DataFusion: Skip Sorts, Skip Row Groups, Skip I/O Jun 29, 2026

zhuqi-lucas force-pushed the blog-sort-pushdown branch from 03d1e79 to fb489ff Compare June 29, 2026 02:31

zhuqi-lucas changed the title ~~Add blog: Sort Pushdown in DataFusion: Skip Sorts, Skip Row Groups, Skip I/O~~ Add blog: Sort Pushdown in DataFusion: Skip Sorts, Skip Decode, Skip I/O Jun 29, 2026

zhuqi-lucas marked this pull request as ready for review June 29, 2026 02:31

zhuqi-lucas requested review from adriangb, alamb and Copilot June 29, 2026 02:31

Copilot started reviewing on behalf of zhuqi-lucas June 29, 2026 02:32 View session

Copilot AI reviewed Jun 29, 2026

View reviewed changes

alamb reviewed Jul 3, 2026

View reviewed changes


		Qi Zhu, [Massive](https://www.massive.com/)

		Many [Apache Parquet] datasets are already sorted on disk. Time-series


		[`EnsureRequirements`]: https://github.com/apache/datafusion/blob/main/datafusion/physical-optimizer/src/ensure_requirements/mod.rs

		Stats-based sort elimination fixes this in `PushdownSort`, which

Uh oh!

Conversation

zhuqi-lucas commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Benchmarks documented

What changed since the original draft

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

alamb commented Jun 29, 2026

Uh oh!

zhuqi-lucas commented Jun 29, 2026

Uh oh!

alamb commented Jul 3, 2026

Uh oh!

alamb left a comment

Choose a reason for hiding this comment

Suggestion: Make the writing more concise

Suggestion: reframe as a general Database technology

Suggestion: any time you refer to the internals of DataFUsion, link to the docs

Some links not working

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

zhuqi-lucas commented May 14, 2026 •

edited

Loading