Is your feature request related to a problem or challenge? Please describe what you are trying to do.
People often want to keep Arrow data in memory, and in some cases are willing to pay some additional cost in exchange for reduced memory usage.
Currently various array types can have "bloated" memory footprints, for example:
- Arrays can be sliced, with potentially unreferenced data buffers and child arrays
- Dictionaries can contain duplicate entries
- View arrays can contain unreferenced data
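
To make the dictionary case concrete, here is a minimal sketch using a toy model rather than the real arrow-rs types. `ToyDictionary` and `compact` are hypothetical names introduced only for illustration; the intent is just to show how duplicate and unreferenced dictionary values could be dropped while the logical contents stay the same.

```rust
use std::collections::HashMap;

/// Toy stand-in for a dictionary-encoded array: `keys[i]` indexes into `values`.
/// This is NOT the arrow-rs `DictionaryArray` type, only an illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyDictionary {
    pub keys: Vec<usize>,
    pub values: Vec<String>,
}

impl ToyDictionary {
    /// Logical value stored at row `i`.
    pub fn value(&self, i: usize) -> &str {
        &self.values[self.keys[i]]
    }
}

/// Rebuilds the dictionary so that `values` holds only entries that are
/// referenced by some key, each exactly once. Keys are remapped accordingly.
pub fn compact(dict: &ToyDictionary) -> ToyDictionary {
    // Maps each distinct referenced value to its index in the new dictionary.
    let mut index_of: HashMap<&str, usize> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut keys = Vec::with_capacity(dict.keys.len());

    for &k in &dict.keys {
        let v = dict.values[k].as_str();
        let new_key = *index_of.entry(v).or_insert_with(|| {
            values.push(v.to_string());
            values.len() - 1
        });
        keys.push(new_key);
    }

    ToyDictionary { keys, values }
}
```

In this model, compacting trades one pass over the keys (plus hashing each referenced value) for a smaller dictionary, which is the kind of cost/benefit trade-off described above.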
Describe the solution you'd like
I would like to propose a `minify` kernel in `arrow-select` that performs this minification. It should take a non-exhaustive, builder-pattern `MinifyOptions` struct that controls how the minification is performed. Ideally, the logic that currently resides in the IPCWriter could be moved over to use this kernel.
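
As a rough sketch of what a non-exhaustive, builder-pattern options struct could look like: the field names, the `with_*` methods, and the all-enabled default below are assumptions made for illustration, not a proposed final API. Each flag corresponds to one of the bloat sources listed earlier.

```rust
/// Hypothetical options for a `minify` kernel. Field names and defaults are
/// illustrative assumptions only.
///
/// `#[non_exhaustive]` prevents downstream crates from constructing the struct
/// with a literal, so new options can be added later without a breaking change.
#[derive(Debug, Clone, PartialEq)]
#[non_exhaustive]
pub struct MinifyOptions {
    /// Drop buffer and child-array data not referenced by a sliced array.
    pub trim_sliced_buffers: bool,
    /// Remove duplicate entries from dictionaries.
    pub deduplicate_dictionaries: bool,
    /// Drop unreferenced data from view arrays.
    pub compact_views: bool,
}

impl Default for MinifyOptions {
    /// Assumed default for this sketch: every minification step is enabled.
    fn default() -> Self {
        Self {
            trim_sliced_buffers: true,
            deduplicate_dictionaries: true,
            compact_views: true,
        }
    }
}

impl MinifyOptions {
    /// Creates options with the default settings.
    pub fn new() -> Self {
        Self::default()
    }

    /// Enables or disables trimming of sliced buffers and child arrays.
    pub fn with_trim_sliced_buffers(mut self, enabled: bool) -> Self {
        self.trim_sliced_buffers = enabled;
        self
    }

    /// Enables or disables dictionary deduplication.
    pub fn with_deduplicate_dictionaries(mut self, enabled: bool) -> Self {
        self.deduplicate_dictionaries = enabled;
        self
    }

    /// Enables or disables compaction of view arrays.
    pub fn with_compact_views(mut self, enabled: bool) -> Self {
        self.compact_views = enabled;
        self
    }
}
```

A caller that wants to skip the potentially expensive dictionary pass could then write `MinifyOptions::new().with_deduplicate_dictionaries(false)` and leave the other settings at their defaults.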
Describe alternatives you've considered
Additional context