-
Notifications
You must be signed in to change notification settings - Fork 113
feature: TableStrategy #5640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: TableStrategy #5640
Conversation
Codecov Report❌ Patch coverage is
☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
518a2e6 to
157ee48
Compare
gatesn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah really nice idea. Definitely should just immediately deprecate the StructStrategy.
| /// A set of leaf field overrides, e.g. to force one column to be compact-compressed. | ||
| leaf_writers: HashMap<FieldPath, Arc<dyn LayoutStrategy>>, | ||
| /// The writer for any validity arrays that may be present | ||
| validity: Arc<dyn LayoutStrategy>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only for non-leaf struct validity? Or all validity? Do we just need a special type of FieldPath that allows us to index into the nulls etc?
e.g. enum PathComponent { Field(FieldName), ListElements, Validity } maybe? Not sure... might be gross.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
structs are the only case right now where validity is extracted out and compressed into its own child layout, all of the other strategies just store it alongside the array data
| ) -> VortexResult<LayoutRef> { | ||
| let dtype = stream.dtype().clone(); | ||
|
|
||
| // Fallback: if the array is not a struct, fallback to writing a single array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think you actually want to check the leaf_writers here for a root FieldPath, that would allow overriding intermediate struct strategies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually to maintain compatibility with behavior in StructStrategy, which lets you serialize non-struct flat arrays. Not sure why we let you do that but it's the current behavior and we have unit tests for it 🤷
c6dc373 to
15b8c9e
Compare
| /// This is now deprecated, users are encouraged to instead use the | ||
| /// [`TableStrategy`][crate::layouts::table::TableStrategy]. | ||
| #[derive(Clone)] | ||
| #[deprecated(since = "0.57.0", note = "Use the `TableStrategy` instead.")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - 58 is the latest release
vortex-layout/src/layouts/table.rs
Outdated
| } | ||
|
|
||
| impl TableStrategy { | ||
| // Descend into a subfield for the writer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - rustdoc
AdamGS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, a big part of this is pulled from StructLayout
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
- add integration test for vortex-file - fix bug in PathStrategy descend - return Writer back from the write pathway Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
Signed-off-by: Andrew Duffy <[email protected]>
7114762 to
0fb7cf2
Compare
Signed-off-by: Andrew Duffy <[email protected]>
Extracting this out of my hack week project. This adds a new
TableStrategythat is like the StructStrategy, except that it allows users to override the write strategy at particular field paths in the schema.For example, if you have a schema like this:
Before you'd be stuck with choosing a single strategy for all fields of the Struct.
TableStrategy allows you to override particular field paths anywhere in the field tree. For example, if you wanted to allow uncompressed struct validity, default BtrBlocks compression but ZSTD compression just for the
request.body.bytesfield, you'd do:I've replaced StructStrategy with this in the default WriteBuilder.
If we like this, we might want to consider marking
StructStrategyas deprecated.