Skip to content

👌 Short-circuit simple filter expressions to avoid eval() overhead#1677

Merged
chrisjsewell merged 10 commits intomasterfrom
copilot/add-benchmark-for-issue-1676
Mar 31, 2026
Merged

👌 Short-circuit simple filter expressions to avoid eval() overhead#1677
chrisjsewell merged 10 commits intomasterfrom
copilot/add-benchmark-for-issue-1676

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 17, 2026

Short-circuit simple filter expressions to avoid eval() overhead

Closes #1676

Summary

This PR introduces an AST-based fast path (sphinx_needs.ubquery) that compiles simple filter expressions into native Python callables, bypassing eval() entirely for common patterns. The filter string is parsed once via ast.parse, and if the expression is a supported pattern (comparisons, membership tests, boolean and/or/not, search()), a lightweight predicate closure is returned and cached (LRU, 256 entries). Complex expressions that can't be short-circuited (those referencing needs, current_need, or c) fall through to the existing eval() path unchanged.

The fast path is applied at two levels:

  1. Batch filtering (filter_needs_and_parts) — used by needtable, needlist, needflow, and all fallback paths from filter_needs_view/filter_needs_parts. One predicate build, N native lambda applications.
  2. Single-need filtering (filter_single_need) — used by link condition evaluation, constraints, gantt, graphviz, plantuml, sequence diagrams, and dynamic functions. LRU-cached predicates avoid per-call overhead.

Both layers compose with the existing index-based pre-filtering in _analyze_and_apply_expr, which narrows the candidate set via NeedsView indexes before the predicate loop runs.

Benchmark results

Using tox -e py314-benchmark -- tests/benchmarks/test_querying.py --benchmark-columns=mean --benchmark-time-unit=ms -k 1000:

Before (AST fast path disabled):

Benchmark Mean (ms)
test_filter_single_need_simple[1000] 11.45
test_filter_needs_and_parts_simple[1000] 6.57
test_filter_needs_and_parts_compound[1000] 5.59
test_filter_needs_and_parts_complex_eval[1000] 15.36
test_resolve_links_constrained[1000] 33.02
test_resolve_links_unique_conditions[1000] 69.77

After, without LRU caching (AST fast path enabled, but predicate rebuilt each call):

Benchmark Mean (ms)
test_filter_single_need_simple[1000] 3.13
test_filter_needs_and_parts_simple[1000] 0.17
test_filter_needs_and_parts_compound[1000] 0.68
test_filter_needs_and_parts_complex_eval[1000] 16.84
test_resolve_links_constrained[1000] 13.48
test_resolve_links_unique_conditions[1000] 37.51

After (AST fast path enabled):

Benchmark Mean (ms)
test_filter_single_need_simple[1000] 0.34
test_filter_needs_and_parts_simple[1000] 0.18
test_filter_needs_and_parts_compound[1000] 0.65
test_filter_needs_and_parts_complex_eval[1000] 14.44
test_resolve_links_constrained[1000] 5.90
test_resolve_links_unique_conditions[1000] 50.80

Batch filtering with the fast path is ~37x faster for simple filters and ~9x faster for compound and expressions. Per-need filter_single_need is ~34x faster. End-to-end resolve_links with constrained links is ~5.6x faster. The complex_eval benchmark (which forces the eval() fallback via current_need) is unaffected, confirming the slow path is unchanged.

The "without LRU caching" column isolates the benefit of native closures vs eval() from the caching benefit. Batch functions (filter_needs_and_parts) build the predicate once regardless, so caching has no effect (0.18ms vs 0.17ms). Per-need functions (filter_single_need) pay the ast.parse cost on every call without caching (3.13ms vs 0.34ms), but are still ~3.7x faster than eval() (11.45ms). resolve_links with repeated conditions benefits most from caching (5.90ms vs 13.48ms).

Pathway to per-type fields

Beyond the speed win, this architecture opens a pathway to per-type fields (discussion #1646). Because field names are resolved lazily — only when the predicate actually reaches them at runtime — short-circuit evaluation in and/or expressions means a filter like:

type == "spec" and spec_field == "x"

will never access spec_field for needs whose type is not "spec". This is not possible with eval(), which eagerly populates the entire namespace with all field names before evaluation begins, requiring every field to exist on every need type.

What changed

  • New module sphinx_needs/ubquery.py — AST-based filter compiler with try_build_simple_predicate() as the public entry point. Supports ==, !=, <, <=, >, >=, in, not in, search(), bare names, not, and, or. Bails out (returns None) for expressions referencing context-only names (needs, current_need, c) or unsupported AST patterns.
  • sphinx_needs/filter_common.py:
    • filter_needs_and_parts() — new fast path: tries try_build_simple_predicate() before the per-item compile()+eval() loop. Falls through to eval() only for complex expressions.
    • filter_single_need() — always attempts the AST-based fast path before falling back to eval(). All callers (constraints, gantt, graphviz, plantuml, sequence, dynamic functions, link conditions) benefit automatically.
  • sphinx_needs/directives/need.pyresolve_links link condition evaluation now benefits from the always-on fast path (removed the previous simple_filter=True parameter which is no longer needed).
  • New test module tests/test_ubquery.py — Comprehensive tests for predicate building, field resolution, context-only name blocklist, filter_data fallback, and filter_single_need integration.
  • New benchmark tests/benchmarks/test_constrained_links_benchmark.py — Benchmarks for filter_single_need, filter_needs_and_parts (simple, compound, and eval-fallback), and resolve_links with constrained links.

Copilot AI and others added 2 commits March 17, 2026 06:47
Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
Copilot AI changed the title [WIP] Add benchmark tests for issue 1676 Short-circuit simple expressions in filter_single_need to avoid eval() overhead Mar 17, 2026
Copilot AI requested a review from chrisjsewell March 17, 2026 06:49
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 94.40000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.30%. Comparing base (4e10030) to head (73e7220).
⚠️ Report is 277 commits behind head on master.

Files with missing lines Patch % Lines
sphinx_needs/filter_common.py 86.48% 5 Missing ⚠️
sphinx_needs/ubquery.py 97.72% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1677      +/-   ##
==========================================
+ Coverage   86.87%   89.30%   +2.42%     
==========================================
  Files          56       72      +16     
  Lines        6532    10324    +3792     
==========================================
+ Hits         5675     9220    +3545     
- Misses        857     1104     +247     
Flag Coverage Δ
pytests 89.30% <94.40%> (+2.42%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@chrisjsewell chrisjsewell marked this pull request as ready for review March 31, 2026 08:54
@chrisjsewell chrisjsewell changed the title Short-circuit simple expressions in filter_single_need to avoid eval() overhead 👌 Short-circuit simple filter expressions to avoid eval() overhead Mar 31, 2026
Copy link
Copy Markdown
Member

@ubmarco ubmarco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Performance is always important.

@chrisjsewell chrisjsewell merged commit df81a5c into master Mar 31, 2026
24 checks passed
@chrisjsewell chrisjsewell deleted the copilot/add-benchmark-for-issue-1676 branch March 31, 2026 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Short-circuit simple expressions in filter_single_need to avoid eval() overhead

3 participants