fn gc_string_view_batch(batch: &RecordBatch) -> RecordBatch
Heuristically compact StringViewArrays to reduce memory usage, if needed.
Decides when to consolidate the StringView into a new buffer to reduce memory usage and improve string locality for better performance.
This differs from StringViewArray::gc because:
- It may not compact the array depending on a heuristic.
- It uses a precise block size to reduce the number of buffers to track.
Heuristic
If the average size of each view is larger than 32 bytes, we compact the array.
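As a rough illustration, the check above could be modeled as below. This is a minimal sketch using only the standard library, not the actual implementation: `should_compact`, `AVG_BYTES_PER_VIEW_THRESHOLD`, and both parameters are hypothetical names, and it assumes that "average size of each view" means the buffer bytes retained by the array divided by the number of views.

```rust
/// Hypothetical constant: the 32-byte threshold named in the heuristic.
const AVG_BYTES_PER_VIEW_THRESHOLD: usize = 32;

/// Returns true when the retained buffers average more than 32 bytes per view.
///
/// Assumption: `buffer_bytes` is the total size of the data buffers the array
/// keeps alive and `num_views` is the number of views (rows) in the array.
fn should_compact(buffer_bytes: usize, num_views: usize) -> bool {
    // Multiply instead of dividing so integer rounding cannot hide a
    // fractional excess; saturate so a huge view count cannot overflow.
    buffer_bytes > AVG_BYTES_PER_VIEW_THRESHOLD.saturating_mul(num_views)
}
```

Under these assumptions, an array of 10 views that still holds a 1 MB buffer would be compacted, while one holding exactly 320 bytes would not, because the average must be strictly larger than 32 bytes.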
A StringViewArray includes pointers to buffers that hold the underlying data.
One of the great benefits of StringViewArray is that many operations (e.g., filter) can be done without copying the underlying data.
However, after a while (e.g., after FilterExec or HashJoinExec), the StringViewArray may only refer to a small portion of the buffer, significantly increasing memory usage.
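The effect described above can be reproduced with a toy model. The sketch below is not the Arrow data structure: `ToyViews` and its methods are hypothetical, each view is simplified to an (offset, length) pair into a single shared buffer, and the inlining of short strings that real string views perform is ignored. It only shows why a zero-copy filter keeps the whole buffer alive and why copying the referenced bytes into a new buffer releases it.

```rust
use std::rc::Rc;

/// Toy stand-in for a string view array (hypothetical type): every view is
/// an (offset, length) pair into one shared, reference-counted buffer.
struct ToyViews {
    buffer: Rc<Vec<u8>>,
    views: Vec<(usize, usize)>,
}

impl ToyViews {
    /// Builds the array by appending every value to a single buffer.
    fn from_strs(values: &[&str]) -> Self {
        let mut buffer = Vec::new();
        let mut views = Vec::new();
        for v in values {
            views.push((buffer.len(), v.len()));
            buffer.extend_from_slice(v.as_bytes());
        }
        Self { buffer: Rc::new(buffer), views }
    }

    /// Zero-copy filter: keeps only the selected views but still shares
    /// (and therefore keeps alive) the entire original buffer.
    fn filter(&self, keep: &[bool]) -> Self {
        let views: Vec<(usize, usize)> = self
            .views
            .iter()
            .zip(keep)
            .filter(|(_, k)| **k)
            .map(|(v, _)| *v)
            .collect();
        Self { buffer: Rc::clone(&self.buffer), views }
    }

    /// Compaction: copies only the referenced bytes into a fresh buffer,
    /// so the old buffer can be freed once nothing else refers to it.
    fn compact(&self) -> Self {
        let mut buffer = Vec::new();
        let mut views = Vec::new();
        for &(offset, len) in &self.views {
            views.push((buffer.len(), len));
            buffer.extend_from_slice(&self.buffer[offset..offset + len]);
        }
        Self { buffer: Rc::new(buffer), views }
    }

    /// Size in bytes of the buffer this array keeps alive.
    fn retained_bytes(&self) -> usize {
        self.buffer.len()
    }

    /// Reads the string at position `i` through its view.
    fn get(&self, i: usize) -> &str {
        let (offset, len) = self.views[i];
        std::str::from_utf8(&self.buffer[offset..offset + len]).unwrap()
    }
}
```

In this model, filtering three values down to one leaves `retained_bytes()` unchanged, whereas calling `compact()` on the filtered result shrinks it to the length of the single surviving string.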