Skip to content

StackGuard races recursive's process-global stack-size minimum across threads #23246

Description

@mdashti

Describe the bug

datafusion-sql raises recursive's minimum stack size (the "red zone") via StackGuard (datafusion/sql/src/stack.rs) at two production sites: query_to_plan (datafusion/sql/src/query.rs) and the public expr_to_sql (datafusion/sql/src/unparser/expr.rs). Both call StackGuard::new(256 * 1024), which sets the minimum on new() and restores the previous value on drop().

That minimum is a process-global (recursive's MINIMUM_STACK_SIZE, default 128 KiB). The save/restore is therefore not thread-safe. When two threads plan or unparse deep expressions at the same time, a guard that captured the 128 KiB default and drops while another thread is mid-recursion restores 128 KiB globally. The in-flight recursion then observes a red zone smaller than its per-level stack cost, stacker does not grow in time, and the OS stack overflows (the process aborts with SIGABRT).

The interleaving:

  1. global = 128 KiB (default, no guard held)
  2. thread B: StackGuard::new(256K) captures previous = 128K, sets 256K
  3. thread A: StackGuard::new(256K) captures previous = 256K, sets 256K, begins a deep recursion that relies on 256K
  4. thread B drops: restores previous = 128K
  5. thread A's next recursive checkpoint reads 128K; a single heavy hop (e.g. the unparser's scalar-function-override path) does not fit, and the stack overflows

Both guard sites use 256 * 1024, so any pair of concurrent deep plan/unparse calls can hit this.

To Reproduce

On x86_64 in a debug / ci profile (larger frames), with recursive_protection enabled (on by default via the datafusion crate):

// `recursive::set_minimum_stack_size` = the global the `StackGuard` mutates.
recursive::set_minimum_stack_size(128 * 1024); // what a concurrent StackGuard drop restores

std::thread::Builder::new()
    .stack_size(2 * 1024 * 1024)
    .spawn(|| {
        // ~2000-deep array_has(...) chain, unparsed through the recursion entry that
        // does not re-install a StackGuard (Unparser::expr_to_sql_with_nesting).
        // Overflows at 128 KiB; with `256 * 1024` it returns cleanly.
    })
    .unwrap()
    .join()
    .unwrap();

In production the 128 KiB state in step 1 is not reached by an explicit call: it is what a concurrent StackGuard on another thread restores on drop (the interleaving above) while this recursion is in flight. So a realistic trigger is two threads, one repeatedly unparsing/planning shallow expressions (churning the guard) and one unparsing/planning a deep expression, on x86_64.

Expected behavior

Concurrent deep planning/unparsing should not overflow the stack. A red zone that one thread is relying on should not be lowerable by another thread's unrelated StackGuard drop.

Additional context

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions