FIX: Make RenderContext parent-queries answer is_a? semantics#42
Merged
Conversation
gschlager
added a commit
that referenced
this pull request
May 12, 2026
The cache existed to make has_parent? / find_parent / count_parents O(1) instead of O(depth). After PR #42 it was only earning its keep on has_parent?'s exact-class fast-path; find_parent and count_parents already scan @Parents to answer is_a? semantics. Benchmarking shows the cache cost more than it saved at realistic depths: End-to-end (bench/bench.rb --isolated, YJIT, Ruby 3.4.8): report before after delta simple 36.8k/s 39.3k/s +6.7% nested 38.3k/s 41.4k/s +8.2% list 35.3k/s 38.4k/s +8.6% table 11.3k/s 12.1k/s +6.9% quote_nested 26.0k/s 27.7k/s +6.4% code 83.7k/s 88.9k/s +6.1% url 44.9k/s 47.0k/s +4.8% mixed 13.1k/s 14.0k/s +6.7% large_doc 744/s 797/s +7.2% (escape_plain / escape_mixed unchanged — they don't touch RenderContext.) Micro (RenderContext push + query, depth 3 / 8): - with_parent push: 2.1× faster (no hash dup, no array alloc) - has_parent? exact hit: -15% (43 ns vs 36 ns) - has_parent? miss: +9–12% (the key? probe was strictly extra work before the scan ran anyway) - find_parent / count_parents: unchanged (already scanned) Real workloads hit the miss path more than the hit path (has_parent?(Code) / (Url) / (Email) on every Text node — most text isn't inside those), so the miss-side win dominates. with_parent's allocation savings compound across every nested element. The two effects together produce the 5–8% E2E gain. The class is also significantly simpler: no parent_cache: kwarg, no build_cache, no incremental cache update in with_parent, no documentation of the cache's fast-path semantics.
07da9f0 to
a8945bc
Compare
`#has_parent?`, `#find_parent`, and `#count_parents` were keyed on
exact class via `parent_cache`, so a user-defined subclass like
`class CustomUrl < AST::Url` slipped past `has_parent?(AST::Url)`.
That's surprising — callers reasonably expect "do I have a Url
ancestor" to include subclasses — and it makes built-in checks
like Renderer's in_link_label? (which drives label-aware ]
escaping) silently miss custom AST nodes.
Fix: the three predicates now use `is_a?` semantics.
While verifying that the cache could still earn its keep on the
has_parent? fast-path, benchmarking showed it didn't: at realistic
depths (3–8) the linear scan is faster than maintaining the hash.
Dropped the cache entirely. The class loses parent_cache:,
build_cache, and the incremental cache-update logic in with_parent.
End-to-end (bench/bench.rb --isolated, YJIT, Ruby 3.4.8):
report before after delta
simple 36.8k/s 39.3k/s +6.7%
nested 38.3k/s 41.4k/s +8.2%
list 35.3k/s 38.4k/s +8.6%
table 11.3k/s 12.1k/s +6.9%
quote_nested 26.0k/s 27.7k/s +6.4%
code 83.7k/s 88.9k/s +6.1%
url 44.9k/s 47.0k/s +4.8%
mixed 13.1k/s 14.0k/s +6.7%
large_doc 744/s 797/s +7.2%
(escape_plain / escape_mixed unchanged — they don't touch
RenderContext.)
Micro (RenderContext push + query, depth 3 / 8):
- with_parent push: 2.1× faster (no hash dup, no array alloc)
- has_parent? exact hit: -15% (43 ns vs 36 ns)
- has_parent? miss: +9–12% (the key? probe was strictly extra
work before the scan ran anyway)
- find_parent / count_parents: unchanged (already scanned)
Real workloads hit the miss path more than the hit path
(has_parent?(Code) / (Url) / (Email) on every Text node — most
text isn't inside those), so the miss-side win dominates.
with_parent's allocation savings compound across every nested
element. The two effects together produce the 5–8% E2E gain.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two RenderContext changes, second motivated by benchmarking the first:
1. Make parent-queries answer
is_a?semantics.RenderContext#has_parent?,#find_parent, and#count_parentswere keyed on exact class viaparent_cache, so a user-defined subclass likeclass CustomUrl < AST::Urlslipped pasthas_parent?(AST::Url). That's surprising — callers reasonably expect "do I have a Url ancestor" to include subclasses — and it makes built-in checks like the renderer'sin_link_label?(which drives label-aware]escaping) silently miss custom AST nodes. The three predicates now useis_a?semantics.2. Drop the
parent_cache. After (1), the cache was only earning its keep onhas_parent?'s exact-class fast-path;find_parentandcount_parentsalready had to scan@parentsto answer polymorphic queries. Benchmarking showed the cache cost more than it saved — the linear scan over realistic parent depths (3–8) is faster than maintaining the hash. The class loses theparent_cache:kwarg,build_cache, and the incremental cache-update logic inwith_parent.Benchmarks
End-to-end (
bench/bench.rb --isolated, YJIT, Ruby 3.4.8):simplenestedlisttablequote_nestedcodeurlmixedlarge_docescape_plainescape_mixedMicro-bench (RenderContext push + query at depth 3 / 8):
with_parentpush: 2.1× faster — no hash dup, no array allocationhas_parent?exact hit: -15% (43 ns vs 36 ns)has_parent?miss: +9–12% faster — thekey?probe was strictly extra work before the scan ran anywayfind_parent/count_parents: unchanged (already scanned)Real workloads hit the miss path more than the hit path (
has_parent?(Code)/(Url)/(Email)on every Text node — most text isn't inside those), so the miss-side win dominates.with_parent's allocation savings compound across every nested element.Test plan
bundle exec rspec— 3122 examples, 0 failuresbin/mutant runon all RenderContext methods plus the now-dependent renderer/escape subjects — 100% mutation coverage (93/93 + 391/391 kills)bin/lint— no offensesRelated
]-escape fix)MarkdownEscaper; needs this PR to correctly trigger forUrlsubclasses)