feat: Subqueries in SELECT for hierarchical data (includes)#1294
feat: Subqueries in SELECT for hierarchical data (includes)#1294
Conversation
🦋 Changeset detectedLatest commit: 6343522 The changes in this PR will be included in the next version bump. This PR includes changesets to release 14 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
More templates
@tanstack/angular-db
@tanstack/db
@tanstack/db-ivm
@tanstack/electric-db-collection
@tanstack/offline-transactions
@tanstack/powersync-db-collection
@tanstack/query-db-collection
@tanstack/react-db
@tanstack/rxdb-db-collection
@tanstack/solid-db
@tanstack/svelte-db
@tanstack/trailbase-db-collection
@tanstack/vue-db
commit: |
|
Size Change: +5.38 kB (+5.78%) 🔍 Total Size: 98.6 kB
ℹ️ View Unchanged
|
|
Size Change: 0 B Total Size: 3.85 kB ℹ️ View Unchanged
|
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
260157e to
d117fea
Compare
Replace O(n) parent collection scans with a reverse index (correlationKey → Set<parentKey>) for attaching child Collections to parent rows. The index is populated during parent INSERTs and cleaned up on parent DELETEs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
samwillis
left a comment
There was a problem hiding this comment.
This is looking really great. Awesome work!
Depending on if we are going to release before or after followup PRs it may make sense to add some defensive errors for unsupported queries (groupBy, referencing multiple fields on the parent)
| ? where.expression | ||
| : where | ||
|
|
||
| // Look for eq(a, b) where one side references parent and other references child |
There was a problem hiding this comment.
I believe this is finding the first expression that references both sides, this is correct. We should consider what something like this does:
q.from({ p: projects }).select(({ p }) => ({
id: p.id,
name: p.name,
issues: q
.from({ i: issues })
.where(({ i }) => and(eq(i.projectId, p.id)), eq(i.createdBy, p.createdBy))
.select(({ i }) => ({
id: i.id,
title: i.title,
})),
})),
)I suspect it breaks at the moment, and so we may want to throw if there is more than one expression matching both sources.
I think it's possible to make this work though by pulling the parent project value temporarily into the child issue pipeline.
There was a problem hiding this comment.
Indeed, it breaks right now because the parent row is not in the child pipeline. I added support for this in this PR: #1307
| // Re-add project Alpha — should get a fresh child collection | ||
| projects.utils.begin() | ||
| projects.utils.write({ | ||
| type: `insert`, | ||
| value: { id: 1, name: `Alpha Reborn` }, | ||
| }) | ||
| projects.utils.commit() | ||
|
|
||
| const alpha = collection.get(1) as any | ||
| expect(alpha).toMatchObject({ id: 1, name: `Alpha Reborn` }) | ||
| expect(childItems(alpha.issues)).toEqual([ | ||
| { id: 10, title: `Bug in Alpha` }, | ||
| { id: 11, title: `Feature for Alpha` }, | ||
| ]) |
|
ChatGPT review: Here’s my review of TanStack/db PR #1294 (adds “includes” subqueries / nested child collections). ([GitHub]1) What this PR is doing (as I understand it)
Overall: the API is very ergonomic, and the nested routing solution is clever. Things I like
Key concerns / suggested changes1) Alias collisions between parent and child queries (correctness bug risk)
Suggestion
2) Correlation extraction only matches top-level
|
|
@samwillis response to codex' review:
Valid concern but I don't think it's a real risk in practice. The child query is built via the builder API where the user explicitly declares aliases in
Fixed in #1307
This is a fair observation but the compiler already runs on the output of Still a valid code hygiene concern, but it's out of scope for this PR. If it were to be fixed, it should be its own cleanup.
The correlation field is almost always the parent's primary key (e.g., Could a user correlate on a non-PK field? Technically yes, but it would be semantically wrong — the correlation field determines how children are grouped to parents. If it's not stable, the entire grouping model is broken, not just the cleanup logic. I'd lean toward the first suggestion: document/validate that the correlation field should be a stable key (which it naturally is in every real use case). The "move" case handling would add complexity for a scenario that doesn't really make sense to support.
True — the correlation field doesn't have to be the PK (which are restricted to Using
This is fine. We have a couple of these reserved properties around the code. It's unlikely someone would use this name. Regarding the additional tests:
|
…hIncludesState reads from that stamp. The stamp is cleaned up at the end of flush so it never leaks to the user
Code ReviewBugs1. Fix: strip 2. const selectCopy = { ...query.select }
query = { ...query, select: selectCopy }3. Nested Test gaps4. No test for updating an existing child row — Insert and delete are covered, but the update branch in 5. No test for the error on missing/invalid correlation — 6. No test for multiple sibling includes on the same parent — Every test uses one includes field. A parent with both Minor7. Child collections not explicitly cleaned up on teardown — When the parent sync ends (collection-config-builder.ts:636), 8. Type system has no awareness of includes — |
* feat: add toArray() for includes subqueries toArray() wraps an includes subquery so the parent row contains Array<T> instead of Collection<T>. When children change, the parent row is re-emitted with a fresh array snapshot. - Add ToArrayWrapper class and toArray() function - Add materializeAsArray flag to IncludesSubquery IR node - Detect ToArrayWrapper in builder, pass flag through compiler - Re-emit parent rows on child changes for toArray entries - Add SelectValue type support for ToArrayWrapper - Add tests for basic toArray, reactivity, ordering, and limits Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Removed obsolete test * Small fix * Tests for changes to deeply nested queries * Fix changes being emitted on deeply nested collections * ci: apply automated fixes * Changeset * Add type-level tests for toArray() includes subqueries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: apply automated fixes * Rename Expected types in includes type tests to descriptive names Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix toArray() type inference in includes subqueries Make ToArrayWrapper generic so it carries the child query result type, and add a ToArrayWrapper branch in ResultTypeFromSelect to unwrap it to Array<T>. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix toArray re-emit to emit change events for subscribers The toArray re-emit in flushIncludesState mutated parent items in-place before writing them through parentSyncMethods.begin/write/commit. Since commitPendingTransactions captures "previous visible state" by reading syncedData.get(key) — which returns the already-mutated object — deepEquals always returned true and suppressed the change event. Replace the sync methods pattern with direct event emission: capture a shallow copy before mutation (for previousValue), mutate in-place (so collection.get() works), and emit UPDATE events directly via the parent collection's changes manager. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add change propagation tests for includes subqueries Test the reactive model difference between Collection and toArray includes: - Collection includes: child change does NOT re-emit the parent row (the child Collection updates in place) - toArray includes: child change DOES re-emit the parent row (the parent row is re-emitted with the updated array snapshot) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: apply automated fixes --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Resolve conflict in ResultTypeFromSelect by combining toArray branch's ToArrayWrapper handling with main's improved nullable ref typing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a child includes query has no explicit .select(), the raw row (including the internal __correlationKey stamp) becomes the final result. Strip this internal property before returning so it doesn't leak to users. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
replaceIncludesInSelect mutates query.select in-place, but the optimizer copies select by reference, so rawQuery.select === query.select. This violates the immutable-IR convention. Shallow-clone select when includes entries are found so the original IR is preserved. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
extractIncludesFromSelect only checked top-level select entries. If a
user placed an includes subquery inside a nested select object (e.g.
select({ info: { issues: childQuery } })), the IncludesSubquery would
never be extracted and the child pipeline would never compile, silently
producing null. Now recursively checks nested objects and throws a clear
error when a nested includes is detected.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The update branch in flushIncludesState was untested. This test verifies that updating a child's title is reflected in the parent's child collection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Test updating an existing child row (exercises the update branch in flushIncludesState) - Test error on missing WHERE, non-eq WHERE, and self-referencing eq - Test multiple sibling includes (issues + milestones on same parent) verifying independent child collections and independent reactivity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Skip __SPREAD_SENTINEL__ entries when checking for nested includes to avoid infinite recursion on RefProxy objects. Add non-null assertion for query.select after shallow clone (TypeScript loses narrowing after reassignment). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 4 type tests that verify the expected types for non-toArray includes:
- includes with select → Collection<{id, title}>
- includes without select → Collection<Issue>
- multiple sibling includes → independent Collection types
- nested includes → Collection<{..., comments: Collection<{...}>}>
These tests currently fail because SelectValue doesn't include QueryBuilder
and ResultTypeFromSelect has no branch for it, so includes fields resolve
to `never`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add QueryBuilder<any> to the SelectValue union so TypeScript accepts bare QueryBuilder instances in select callbacks. Add a branch in ResultTypeFromSelect that maps QueryBuilder<TContext> to Collection<GetResult<TContext>>, giving users proper autocomplete and type safety on child collection fields. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Addressing review feedback from #1294 (comment) Bugs fixed1. 2. 3. Nested Test gaps filled4. No test for updating an existing child row — Added in 29c34f5. Exercises the update branch in 5. No test for the error on missing/invalid correlation — Added in 7f1066a. Tests: no WHERE at all, WHERE without 6. No test for multiple sibling includes on the same parent — Added in 7f1066a. Parent with both Minor items7. Child collections not explicitly cleaned up on teardown — Agreed this is fine via GC for now, something to watch. 8. Type system has no awareness of includes — Fixed in 5db9ae0 and 50abd99. Added |
…1307) * Unit tests for filtering on parent fields in child query * ci: apply automated fixes * Support parent-referencing WHERE filters in includes child queries Allow child queries to have additional WHERE clauses that reference parent fields (e.g., eq(i.createdBy, p.createdBy)) beyond the single correlation eq(). Parent-referencing WHEREs are detected in the builder, parent fields are projected into the key stream, and filters are re-injected into the child query where parent context is available. When no parent-referencing filters exist, behavior is unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: apply automated fixes * Extract correlation condition from inside and() WHERE clauses When users write a single .where() with and(eq(i.projectId, p.id), ...), the correlation eq() is now found and extracted from inside the and(). The remaining args stay as WHERE clauses. This means users don't need to know that the correlation must be a separate .where() call. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: apply automated fixes * Some more tests * changeset * Add failing test for shared correlation key with distinct parent filter values Two parents share the same correlation key (groupId) but have different values for a parent-referenced filter field (createdBy). The test verifies that each parent receives its own filtered child set rather than a shared union. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: apply automated fixes * Key child collections by composite routing key to fix shared correlation key collision When multiple parents share the same correlation key but have different parent-referenced filter values, child collections were incorrectly shared. Fix by keying child collections by (correlationKey, parentFilterValues) composite, and using composite child keys in the D2 stream to prevent collisions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add failing test for shared correlation key with orderBy + limit Reproduces the bug where grouped ordering for limit uses the raw correlation key instead of the composite routing key, causing parents that share a correlation key but differ on parent-referenced filters to have their children merged before the limit is applied. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: apply automated fixes * Use composite routing key for grouped ordering with limit/offset The includesGroupKeyFn for orderBy + limit/offset was grouping by raw correlationKey, causing parents sharing a correlation key but differing on parent-referenced filters to have their children merged before the limit was applied. Use the same composite key as the routing layer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add failing test for nested includes with parent-referencing filters at both levels When both the child and grandchild includes use parent-referencing filters, the grandchild collection comes back empty because the nested routing index uses a different key than the nested buffer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Use composite routing key in nested routing index to match nested buffer keys The nested routing index was keyed by raw correlationKey while nested buffers use computeRoutingKey(correlationKey, parentContext). This mismatch caused drainNestedBuffers lookups to fail, leaving grandchild collections empty when parent-referencing filters exist at both levels. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add test for three levels of nested includes with parent-referencing filters Verifies that composite routing keys work at arbitrary nesting depth, not just the first two levels. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: apply automated fixes * Add test for deleting one parent preserving sibling parent's child collection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix shared correlation key: deduplicate parentKeyStream and defer child cleanup Two fixes for when multiple parents share the same correlation key: 1. Add reduce operator on parentKeyStream to clamp multiplicities to 1, preventing the inner join from producing duplicate child entries that cause incorrect deletions when one parent is removed. 2. In Phase 5, only delete child registry entry when the last parent referencing it is removed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add test for spread select on child not leaking internal properties Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Strip internal __correlationKey and __parentContext from child results These routing properties leak into user-visible results when the child query uses a spread select (e.g. { ...i }). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: fix prettier formatting in compiler Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add toArray() for includes subqueries toArray() wraps an includes subquery so the parent row contains Array<T> instead of Collection<T>. When children change, the parent row is re-emitted with a fresh array snapshot. - Add ToArrayWrapper class and toArray() function - Add materializeAsArray flag to IncludesSubquery IR node - Detect ToArrayWrapper in builder, pass flag through compiler - Re-emit parent rows on child changes for toArray entries - Add SelectValue type support for ToArrayWrapper - Add tests for basic toArray, reactivity, ordering, and limits Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Removed obsolete test * Tests for changes to deeply nested queries * Add type-level tests for toArray() includes subqueries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add change propagation tests for includes subqueries Test the reactive model difference between Collection and toArray includes: - Collection includes: child change does NOT re-emit the parent row (the child Collection updates in place) - toArray includes: child change DOES re-emit the parent row (the parent row is re-emitted with the updated array snapshot) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Test aggregates inside subqueries * Take into account correlation key when aggregating in subqueries * changeset * ci: apply automated fixes --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
6aa4f8e to
6343522
Compare
Summary
.select()that produce hierarchical results — each parent row gets a child Collection (e.g., projects with nested issues, issues with nested comments)Closes #288
Test plan
🤖 Generated with Claude Code