Skip to content

perf: reduce TreeNodeFilter allocations via IReadOnlyList and ReadOnlySpan#8238

Open
abdelghani-moussaid wants to merge 6 commits into
microsoft:mainfrom
abdelghani-moussaid:optimize-tree-node-filter
Open

perf: reduce TreeNodeFilter allocations via IReadOnlyList and ReadOnlySpan#8238
abdelghani-moussaid wants to merge 6 commits into
microsoft:mainfrom
abdelghani-moussaid:optimize-tree-node-filter

Conversation

@abdelghani-moussaid
Copy link
Copy Markdown

@abdelghani-moussaid abdelghani-moussaid commented May 14, 2026

Performance optimization for TreeNodeFilter to eliminate allocations during test discovery.

Key Changes:

  • Widened SubExpressions to IReadOnlyList to avoid IEnumerator boxing.
  • Implemented ReadOnlySpan-based recursion for .NET 8.0+ to eliminate O(N) string fragment allocations.
  • Refactored Span path to use foreach loops to satisfy the CS9108 ref-struct capture constraint.

Verification Results:

  • BenchmarkDotNet: 160 B -> 0 B allocations on .NET 8.0. Execution time reduced by ~45%.

Unit Tests: All local TreeNodeFilter tests passed.


Related

…ySpan

- Widen SubExpressions to IReadOnlyList to enable optimized LINQ paths.
- Implement ReadOnlySpan-based matching for .NET 8.0+ to eliminate
  string fragment allocations.
- Use procedural loops in Span path to avoid ref-struct capture (CS9108).
Copilot AI review requested due to automatic review settings May 14, 2026 21:36
@Evangelink
Copy link
Copy Markdown
Member

Code Review

Thanks for the perf work! The intent is solid, but a few things in the implementation don't match the "zero allocations" claim, and the PR description has a couple of inaccuracies worth fixing. Detailed findings below.

🟠 Major

1. foreach (FilterExpression expr in subexprs) over IReadOnlyList<T> allocates an enumerator.

OperatorExpression.SubExpressions is statically typed as IReadOnlyList<FilterExpression> (even though the backing field is FilterExpression[]). IReadOnlyList<T> exposes no struct enumerator, so foreach binds to IEnumerable<T>.GetEnumerator() which returns a heap-allocated enumerator. This happens once per Or/And node, on a hot path — directly contradicting the "0 B" goal.

The existing string overload deliberately avoids this with an indexed for loop:

for (int i = 0; i < subexprs.Count; i++)
{
    if (MatchFilterPattern(subexprs[i], testNodeFragment, properties)) { ... }
}

Please use the same pattern in the Span overload. As a side note, the PR description says the foreach change was needed "to satisfy the CS9108 ref-struct capture constraint" — CS9108 is about lambda/local-function capture of ref struct parameters, and there is no lambda or local function here. The original for loop pattern compiles fine with ReadOnlySpan<char> arguments.

2. subexprs.Single() for the Not case.

case OperatorExpression { Op: FilterOperator.Not, SubExpressions: var subexprs }:
    return !MatchFilterPattern(subexprs.Single(), testNodeFragment, properties);

vs. the string overload:

case OperatorExpression { Op: FilterOperator.Not, SubExpressions: [var singleSubExpr] }:
    return !MatchFilterPattern(singleSubExpr, testNodeFragment, properties);

Issues:

  • Unnecessary LINQ call where [var singleSubExpr] (or subexprs[0]) is allocation-free and trivially intent-revealing.
  • Changes the exception contract: InvalidOperationException from LINQ vs. ApplicationStateGuard.Unreachable(). Upstream ParseFilter validation makes this practically unreachable, but having two overloads diverge in exception contract for the same logical case is a future trap.

Please restore the list pattern so the two overloads match.

🟡 Moderate

3. Code duplication. After fixing #1 and #2 the two overloads will be near-identical; consider either:

  • keeping them byte-identical except for the fragment type and adding a short comment cross-referencing each other, or
  • having the string overload simply call .AsSpan() and forward to the Span overload on NET8_0_OR_GREATER, falling back to the original impl only on netstandard2.0.

The second option eliminates the duplication on the TFMs where it actually matters.

4. Tests. The PR adds no tests. With both overloads shipped (string on netstandard2.0, Span on net8+), a parity regression would not be caught on one TFM. Suggest adding a small test that exercises Or (multi-sub), And (multi-sub), Not (1 sub), ValueAndPropertyExpression, and NopExpression so both code paths are validated against the same inputs. An allocation guard on the Span path (AllocatedBytes style) would also lock in the zero-allocation claim once #1 and #2 are fixed.

5. PR description nits.

  • "Widened SubExpressions to IReadOnlyList to avoid IEnumerator boxing" isn't in this diff — OperatorExpression.SubExpressions was already widened in [Efficiency Improver] perf: eliminate LINQ closure allocations in TreeNodeFilter #8035.
  • "Refactored Span path to use foreach loops to satisfy the CS9108 ref-struct capture constraint" — as noted above, CS9108 doesn't apply here.
  • The "~45% time / 160 B → 0 B" benchmark numbers are plausible for a single-fragment ValueExpression input but won't generalize to Or/And/Not trees (which currently regress on allocation, per Initial commit! 🎉 #1). Worth scoping that statement.

✅ Looks good

  • TFM gating with #if NET8_0_OR_GREATER is correct (Regex.IsMatch(ROS<char>) is available on .NET 7+; the only other TFM is netstandard2.0).
  • No public API changes; method stays private static.
  • Regex is thread-safe; no shared mutable state introduced.
  • MatchProperties correctly left out of scope — it operates on PropertyBag and has its own struct-enumerator optimization.

Once #1 and #2 are addressed, this becomes a solid micro-optimization that genuinely lands at zero allocations across all expression shapes.

@Evangelink Evangelink added the needs/author-feedback Waiting on the original author. label May 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs/author-feedback Waiting on the original author.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants