Dictionary utilities for Arrow arrays
Macros
Structs
- Interner 🔒 - A best-effort interner that maintains a fixed number of buckets and interns keys based on their hash value.
- MergedDictionaries 🔒
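
To illustrate what "best effort" with a fixed number of buckets can mean, here is a minimal standard-library sketch. It is not the crate's implementation: `SketchInterner`, its single-slot buckets, and the use of `DefaultHasher` are all assumptions made for illustration. The idea shown is that a key hashes to one bucket, and a collision simply evicts the previous occupant, so the same key may occasionally be interned more than once.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical best-effort interner: a fixed number of buckets, each holding
/// at most one (key, value) pair. A hash collision evicts the previous entry,
/// so a key is not guaranteed to map to the same value forever.
pub struct SketchInterner<V> {
    buckets: Vec<Option<(Vec<u8>, V)>>,
}

impl<V: Copy> SketchInterner<V> {
    pub fn new(num_buckets: usize) -> Self {
        assert!(num_buckets > 0, "at least one bucket is required");
        Self {
            buckets: (0..num_buckets).map(|_| None).collect(),
        }
    }

    /// Returns the value interned for `key`. `make` is called to create a new
    /// value when the bucket is empty or currently holds a different key.
    pub fn intern(&mut self, key: &[u8], make: impl FnOnce() -> V) -> V {
        // Pick the bucket from the key's hash.
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() as usize) % self.buckets.len();

        // Hit: the bucket already holds exactly this key.
        if let Some((existing, value)) = &self.buckets[idx] {
            if existing.as_slice() == key {
                return *value;
            }
        }

        // Miss or collision: overwrite the bucket with the new entry.
        let value = make();
        self.buckets[idx] = Some((key.to_vec(), value));
        value
    }
}
```

Interning the same key twice in a row returns the cached value without calling `make` again. With a single bucket, interning `a`, then `b`, then `a` again produces three distinct values, which is the trade-off a bounded, best-effort design accepts in exchange for fixed memory use.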
Functions
- bytes_ptr_eq 🔒 - Performs a cheap, pointer-based comparison of two byte arrays.
- compute_values_mask 🔒 - Return a mask identifying the values that are referenced by keys in `dictionary` at the positions indicated by `selection`.
- garbage_collect_any_dictionary - Equivalent to `garbage_collect_dictionary` but without requiring casting to a specific key type.
- garbage_collect_dictionary - Garbage collects a [DictionaryArray] by removing unreferenced values.
- get_masked_values 🔒 - Return a Vec containing, for each set index in `mask`, the index and byte value of that index.
- masked_bytes 🔒 - Compute `get_masked_values` for a [GenericByteArray].
- masked_primitives_to_bytes 🔒 - Process primitive array values to bytes.
- merge_dictionary_values 🔒 - Given an array of dictionaries and an optional key mask, compute a values array containing referenced values, along with mappings from the [DictionaryArray] keys to the new keys within this values array. Best-effort will be made to ensure that the dictionary values are unique.
- should_merge_dictionary_values 🔒 - A weak heuristic of whether to merge dictionary values that aims to only perform the expensive merge computation when it is likely to yield at least some return over the naive approach used by MutableArrayData.
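
The garbage-collection and values-mask entries above describe a two-step idea: mark which dictionary values are referenced, then drop the rest and remap the keys. The following is a minimal sketch of that idea using plain vectors rather than Arrow arrays. The names `values_mask` and `garbage_collect` and their signatures are hypothetical, nulls are modelled as `None`, and the selection argument mentioned above is left out for brevity.

```rust
/// Marks which dictionary values are referenced by at least one key.
/// `None` keys model nulls and reference nothing.
/// Panics if a key is out of range for `num_values`.
pub fn values_mask(keys: &[Option<usize>], num_values: usize) -> Vec<bool> {
    let mut mask = vec![false; num_values];
    for key in keys.iter().flatten() {
        mask[*key] = true;
    }
    mask
}

/// Drops unreferenced values and rewrites the keys so they point into the
/// compacted values vector. The relative order of kept values is preserved.
pub fn garbage_collect<T: Clone>(
    keys: &[Option<usize>],
    values: &[T],
) -> (Vec<Option<usize>>, Vec<T>) {
    let mask = values_mask(keys, values.len());

    // remap[old] is the new index of a kept value; entries for dropped
    // values keep the usize::MAX placeholder and are never read.
    let mut remap = vec![usize::MAX; values.len()];
    let mut kept = Vec::new();
    for (old, value) in values.iter().enumerate() {
        if mask[old] {
            remap[old] = kept.len();
            kept.push(value.clone());
        }
    }

    let new_keys = keys.iter().map(|k| k.map(|old| remap[old])).collect();
    (new_keys, kept)
}
```

For keys `[2, null, 0, 2]` over values `["a", "b", "c"]`, the mask is `[true, false, true]`, the compacted values are `["a", "c"]`, and the keys become `[1, null, 0, 1]`.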
Type Aliases
- InternerBucket 🔒 - A single bucket in `Interner`.
- PtrEq 🔒 - A type-erased function that compares two arrays for pointer equality.
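
As a rough illustration of a type-erased pointer comparison, the sketch below uses only the standard library. `Bytes`, `PtrEqFn`, and `sketch_bytes_ptr_eq` are hypothetical stand-ins, and the crate's real alias and array types may look quite different. The point shown is that the comparison checks whether two values share one allocation and never inspects the contents, which makes it cheap but conservative.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical stand-in for a byte array whose data lives in a shared buffer.
pub struct Bytes {
    pub data: Arc<[u8]>,
}

/// Type-erased comparison signature: both sides arrive as `&dyn Any`.
pub type PtrEqFn = fn(&dyn Any, &dyn Any) -> bool;

/// Returns true only when both arguments are `Bytes` sharing one allocation.
/// Contents are never read, so equal data stored in different allocations
/// compares as false.
pub fn sketch_bytes_ptr_eq(a: &dyn Any, b: &dyn Any) -> bool {
    match (a.downcast_ref::<Bytes>(), b.downcast_ref::<Bytes>()) {
        (Some(a), Some(b)) => Arc::ptr_eq(&a.data, &b.data),
        // Any other type, or a type mismatch, is treated as "not equal".
        _ => false,
    }
}
```

A `false` result here only means "not known to be identical", so a caller would fall back to a slower path, such as comparing or merging values, whenever the pointer check fails.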