Describe the bug
datafusion-sql raises recursive's minimum stack size (the "red zone") via StackGuard (datafusion/sql/src/stack.rs) at two production sites: query_to_plan (datafusion/sql/src/query.rs) and the public expr_to_sql (datafusion/sql/src/unparser/expr.rs). Both call StackGuard::new(256 * 1024), which sets the minimum on new() and restores the previous value on drop().
That minimum is a process-global (recursive's MINIMUM_STACK_SIZE, default 128 KiB). The save/restore is therefore not thread-safe. When two threads plan or unparse deep expressions at the same time, a guard that captured the 128 KiB default and drops while another thread is mid-recursion restores 128 KiB globally. The in-flight recursion then observes a red zone smaller than its per-level stack cost, stacker does not grow in time, and the OS stack overflows (the process aborts with SIGABRT).
The interleaving:
- global = 128 KiB (default, no guard held)
- thread B:
StackGuard::new(256K) captures previous = 128K, sets 256K
- thread A:
StackGuard::new(256K) captures previous = 256K, sets 256K, begins a deep recursion that relies on 256K
- thread B drops: restores
previous = 128K
- thread A's next
recursive checkpoint reads 128K; a single heavy hop (e.g. the unparser's scalar-function-override path) does not fit, and the stack overflows
Both guard sites use 256 * 1024, so any pair of concurrent deep plan/unparse calls can hit this.
To Reproduce
On x86_64 in a debug / ci profile (larger frames), with recursive_protection enabled (on by default via the datafusion crate):
// `recursive::set_minimum_stack_size` = the global the `StackGuard` mutates.
recursive::set_minimum_stack_size(128 * 1024); // what a concurrent StackGuard drop restores
std::thread::Builder::new()
.stack_size(2 * 1024 * 1024)
.spawn(|| {
// ~2000-deep array_has(...) chain, unparsed through the recursion entry that
// does not re-install a StackGuard (Unparser::expr_to_sql_with_nesting).
// Overflows at 128 KiB; with `256 * 1024` it returns cleanly.
})
.unwrap()
.join()
.unwrap();
In production the 128 KiB state in step 1 is not reached by an explicit call: it is what a concurrent StackGuard on another thread restores on drop (the interleaving above) while this recursion is in flight. So a realistic trigger is two threads, one repeatedly unparsing/planning shallow expressions (churning the guard) and one unparsing/planning a deep expression, on x86_64.
Expected behavior
Concurrent deep planning/unparsing should not overflow the stack. A red zone that one thread is relying on should not be lowerable by another thread's unrelated StackGuard drop.
Additional context
Describe the bug
datafusion-sqlraisesrecursive's minimum stack size (the "red zone") viaStackGuard(datafusion/sql/src/stack.rs) at two production sites:query_to_plan(datafusion/sql/src/query.rs) and the publicexpr_to_sql(datafusion/sql/src/unparser/expr.rs). Both callStackGuard::new(256 * 1024), which sets the minimum onnew()and restores the previous value ondrop().That minimum is a process-global (
recursive'sMINIMUM_STACK_SIZE, default 128 KiB). The save/restore is therefore not thread-safe. When two threads plan or unparse deep expressions at the same time, a guard that captured the 128 KiB default and drops while another thread is mid-recursion restores 128 KiB globally. The in-flight recursion then observes a red zone smaller than its per-level stack cost,stackerdoes not grow in time, and the OS stack overflows (the process aborts with SIGABRT).The interleaving:
StackGuard::new(256K)capturesprevious = 128K, sets256KStackGuard::new(256K)capturesprevious = 256K, sets256K, begins a deep recursion that relies on256Kprevious = 128Krecursivecheckpoint reads128K; a single heavy hop (e.g. the unparser's scalar-function-override path) does not fit, and the stack overflowsBoth guard sites use
256 * 1024, so any pair of concurrent deep plan/unparse calls can hit this.To Reproduce
On
x86_64in a debug /ciprofile (larger frames), withrecursive_protectionenabled (on by default via thedatafusioncrate):In production the 128 KiB state in step 1 is not reached by an explicit call: it is what a concurrent
StackGuardon another thread restores on drop (the interleaving above) while this recursion is in flight. So a realistic trigger is two threads, one repeatedly unparsing/planning shallow expressions (churning the guard) and one unparsing/planning a deep expression, onx86_64.Expected behavior
Concurrent deep planning/unparsing should not overflow the stack. A red zone that one thread is relying on should not be lowerable by another thread's unrelated
StackGuarddrop.Additional context
cargo test hash collisions (amd64)failure in fix: drop a deepBinaryExprchain iteratively to avoid a stack overflow #23198 (https://github.com/apache/datafusion/actions/runs/28334863782/job/83948663538?pr=23198): the deep-unparse regression test overflowed because sibling unparser tests in the same process churned the global red zone as their guards dropped. That PR isolates the test in its own binary so the suite is stable, but it does not fix the underlying race.StackGuardwas added for) and Add stacker and recursive #13310 (the plannerStackGuard).recursive'sMINIMUM_STACK_SIZE/STACK_ALLOC_SIZEare global atomics, so any scoped, restore-on-drop mutation of them is inherently racy across threads.stacker::maybe_growwith a local red zone), since the alternative of raising the global and never restoring it leaks the larger red zone into every otherrecursiveuser in the process.