perf(bulker) snowflake: sort dedup CTAS by timestamp #1347
The dedup CTAS used to emit rows in window-function-output order
(roughly PK-grouped), so the later INSERT wrote new micro-partitions
into the target table T whose TO_DATE(ts) ranges spanned the whole batch.
T is clustered by TO_DATE(timestamp) (sfAlterClusteringKeyTemplate),
so those wide new micro-partitions force auto-clustering to re-sort
them later, which is billable warehouse work.
Add ORDER BY {ts} to the dedup template (gated on a new DedupOrderBy
QueryPayload field) so the dedup rows hit storage in timestamp order.
INSERT then writes new micro-partitions whose TO_DATE(ts) min/max are
already tight along the cluster key; auto-clustering has almost
nothing to do.
UPDATE is unaffected (rewriting micro-partitions preserves T's existing
clustering); the join cost is unaffected (PK-keyed, not ts-keyed).
ORDER BY adds an O(N log N) sort on the already-deduped row set —
sub-second for typical batch sizes.
Only enabled when targetTable.TimestampColumn != ""; templates that
don't set DedupOrderBy keep the old behaviour.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reviewed the Snowflake split-merge changes in snowflake.go plus the QueryPayload update. I found one correctness risk worth addressing: the new dedup CTAS ORDER BY can reference a timestamp column that is present on the destination table metadata but absent from the current source batch schema, which would fail the dedup stage at runtime.
    // strictly an INSERT-side optimisation; no benefit (and no harm)
    // when there is no timestamp column.
    var dedupOrderBy string
    if targetTable.TimestampColumn != "" {
Possible runtime regression: this uses targetTable.TimestampColumn unconditionally when the target has one, but the dedup CTAS reads from sourceTable columns for the current batch. If a batch doesn’t carry that timestamp field (while destination metadata still has it), the generated ORDER BY references a missing column and dedup fails with an invalid identifier. Should we gate this with sourceTable.Columns.Get(targetTable.TimestampColumn) before setting DedupOrderBy?
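A minimal sketch of the gate suggested above, written against hypothetical stand-in types rather than bulker's real ones: `Table`, `Columns`, and `dedupOrderByFor` are illustrative names, and the map lookup stands in for whatever `sourceTable.Columns.Get` actually returns. The idea is only that the sort column is set when the target declares a timestamp column and the current batch carries it.

```go
package main

import "fmt"

// Columns is a hypothetical stand-in for a table's column set
// (column name -> SQL type). It is not bulker's real type.
type Columns map[string]string

// Table is a hypothetical, trimmed-down table description.
type Table struct {
	Columns         Columns
	TimestampColumn string
}

// dedupOrderByFor returns the column the dedup CTAS should be sorted by,
// or "" when the sort should be skipped. It skips the sort both when the
// target has no timestamp column and when the source batch lacks it.
func dedupOrderByFor(source, target Table) string {
	ts := target.TimestampColumn
	if ts == "" {
		return "" // target is not timestamp-clustered; nothing to gain
	}
	if _, ok := source.Columns[ts]; !ok {
		return "" // batch has no such column; ORDER BY would be invalid
	}
	return ts
}

func main() {
	target := Table{TimestampColumn: "ts"}
	withTs := Table{Columns: Columns{"id": "NUMBER", "ts": "TIMESTAMP_NTZ"}}
	withoutTs := Table{Columns: Columns{"id": "NUMBER"}}

	fmt.Printf("%q\n", dedupOrderByFor(withTs, target))
	fmt.Printf("%q\n", dedupOrderByFor(withoutTs, target))
}
```

With a gate of this shape, a batch that lacks the timestamp field falls back to the old unsorted dedup instead of failing with an invalid identifier.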
Follow-up to #1346. Asked whether the dedup TEMPORARY table is already presorted by timestamp — it isn't (QUALIFY emits rows in window-function-output order, roughly PK-grouped) — and whether presorting would help. It does, but only for INSERT's write path.
Summary
Add an optional `ORDER BY {timestamp_col}` clause to the dedup CTAS template (gated on a new `QueryPayload.DedupOrderBy` field; set in `copyOrMergeSplit` only when `targetTable.TimestampColumn != ""`).
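As a rough illustration of how an optional clause like this can be gated on a payload field, the sketch below uses Go's `text/template`. It assumes the templates are `text/template`-style, which the PR text does not confirm. The `QueryPayload` struct here is a hypothetical, trimmed-down version with made-up fields other than `DedupOrderBy`, and the SQL is a simplified shape, not the real dedup template.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// QueryPayload is a hypothetical, reduced payload. Only DedupOrderBy is
// named in the PR; the other fields exist to make the example render.
type QueryPayload struct {
	DedupTable   string
	SourceTable  string
	PrimaryKey   string
	DedupOrderBy string
}

// dedupTemplate is an illustrative dedup CTAS. The trailing ORDER BY is
// emitted only when DedupOrderBy is non-empty.
const dedupTemplate = `CREATE TEMPORARY TABLE {{.DedupTable}} AS
SELECT * FROM {{.SourceTable}}
QUALIFY ROW_NUMBER() OVER (PARTITION BY {{.PrimaryKey}} ORDER BY {{.PrimaryKey}}) = 1{{if .DedupOrderBy}}
ORDER BY {{.DedupOrderBy}}{{end}}`

// renderDedup renders the illustrative template for the given payload.
func renderDedup(p QueryPayload) (string, error) {
	t, err := template.New("dedup").Parse(dedupTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	p := QueryPayload{DedupTable: "dedup_tmp", SourceTable: "batch_tmp", PrimaryKey: "id"}

	plain, _ := renderDedup(p) // no DedupOrderBy: old behaviour
	p.DedupOrderBy = "ts"
	sorted, _ := renderDedup(p) // DedupOrderBy set: sorted CTAS

	fmt.Println(plain)
	fmt.Println("---")
	fmt.Println(sorted)
}
```

Because the clause is wrapped in a conditional, any caller that leaves `DedupOrderBy` empty renders exactly the statement it rendered before.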
Why
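Target tables created by bulker use `CLUSTER BY (TO_DATE(timestamp))` (`sfAlterClusteringKeyTemplate`). When the INSERT stage writes new rows into T, Snowflake creates new micro-partitions whose `TO_DATE(ts)` min/max are determined by the input row order:
- Unsorted (window-function-output order, roughly PK-grouped): each new micro-partition's `TO_DATE(ts)` range spans the whole batch, so auto-clustering has to re-sort it later, which is billable warehouse work.
- Sorted by timestamp: each new micro-partition's `TO_DATE(ts)` min/max is already tight along the cluster key, so auto-clustering has almost nothing to do.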
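The effect of input order on per-partition date ranges can be seen in a toy model. The sketch below is not Snowflake's behaviour in detail: it assumes fixed-size chunks of three rows as stand-ins for micro-partitions, uses an integer day index in place of `TO_DATE(ts)`, and all names (`row`, `sampleRows`, `sortByDay`, `partitionSpans`) are made up for the illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a toy record: a primary key and a day index standing in for TO_DATE(ts).
type row struct {
	pk  int
	day int
}

// sampleRows returns 12 rows in PK order whose days interleave (pk % 4),
// mimicking a PK-grouped batch that covers four days.
func sampleRows() []row {
	rows := make([]row, 0, 12)
	for pk := 0; pk < 12; pk++ {
		rows = append(rows, row{pk: pk, day: pk % 4})
	}
	return rows
}

// sortByDay returns a copy of rows ordered by day (the ORDER BY ts case).
func sortByDay(rows []row) []row {
	out := append([]row(nil), rows...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].day < out[j].day })
	return out
}

// partitionSpans cuts rows into consecutive chunks of n rows (n > 0) and
// returns max(day) - min(day) for each chunk. A span of 0 means the chunk
// is perfectly tight along the clustering key.
func partitionSpans(rows []row, n int) []int {
	var spans []int
	for start := 0; start < len(rows); start += n {
		end := start + n
		if end > len(rows) {
			end = len(rows)
		}
		lo, hi := rows[start].day, rows[start].day
		for _, r := range rows[start:end] {
			if r.day < lo {
				lo = r.day
			}
			if r.day > hi {
				hi = r.day
			}
		}
		spans = append(spans, hi-lo)
	}
	return spans
}

func main() {
	rows := sampleRows()
	fmt.Println("pk order:", partitionSpans(rows, 3))            // prints "pk order: [2 3 3 2]"
	fmt.Println("ts order:", partitionSpans(sortByDay(rows), 3)) // prints "ts order: [0 0 0 0]"
}
```

In the PK-ordered case every chunk covers most of the batch's date range, while the timestamp-ordered chunks each cover a single day, which is the situation that leaves little for auto-clustering to fix.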
UPDATE is unaffected — rewriting a micro-partition preserves T's existing clustering layout regardless of source row order. The join itself is unaffected — hash-join doesn't care about input order on either side.
Cost: O(N log N) sort on the already-deduped row set during the CTAS. Sub-second for typical batch sizes; trivial compared to even one auto-clustering pass on T.
Test plan
🤖 Generated with Claude Code