pub enum DictionaryHandling {
Hydrate,
Resend,
}
Defines how a FlightDataEncoder encodes DictionaryArrays.
In the Arrow Flight protocol, dictionary values and keys are sent as two separate messages. When a sender encodes a RecordBatch containing DictionaryArray columns, it first sends a dictionary batch (a batch with header MessageHeader::DictionaryBatch) containing the dictionary values. The receiver is responsible for reading this batch and maintaining state that associates those dictionary values with the corresponding array, using the dict_id as a key.
After sending the dictionary batch, the sender sends the array data in a batch with header MessageHeader::RecordBatch. For any dictionary arrays in this batch, the encoded flight message contains only the dictionary keys. The receiver is then responsible for rebuilding the DictionaryArray on the client side using the dictionary values from the DictionaryBatch message and the keys from the RecordBatch message.
For example, if we have a batch with a TypedDictionaryArray<'_, UInt32Type, Utf8Type> (a dictionary array where the keys are u32 and the values are String), then the DictionaryBatch will contain a StringArray and the RecordBatch will contain a UInt32Array.
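The receiver-side bookkeeping described above can be sketched with a small std-only model. Everything in it (`Message`, `Receiver`, `replay`, and the use of `Vec<String>` for dictionary values) is a hypothetical stand-in chosen for illustration; it is not the arrow-flight API or wire format, and it assumes a single string dictionary column per record batch.

```rust
use std::collections::HashMap;

/// Simplified stand-ins for the two message kinds described above.
/// These are illustrative types, not the arrow-flight wire format.
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    /// Dictionary values for the column registered under `dict_id`.
    DictionaryBatch { dict_id: i64, values: Vec<String> },
    /// Dictionary keys for the column registered under `dict_id`;
    /// `None` models a null slot.
    RecordBatch { dict_id: i64, keys: Vec<Option<u32>> },
}

/// Receiver-side state: the most recent dictionary values per `dict_id`.
#[derive(Debug, Default)]
pub struct Receiver {
    dictionaries: HashMap<i64, Vec<String>>,
}

impl Receiver {
    /// Processes one message. A dictionary batch only updates state and
    /// yields `Ok(None)`; a record batch yields the rebuilt column.
    pub fn handle(&mut self, msg: Message) -> Result<Option<Vec<Option<String>>>, String> {
        match msg {
            Message::DictionaryBatch { dict_id, values } => {
                // Replace whatever was stored for this dict_id (no deltas).
                self.dictionaries.insert(dict_id, values);
                Ok(None)
            }
            Message::RecordBatch { dict_id, keys } => {
                let values = self
                    .dictionaries
                    .get(&dict_id)
                    .ok_or_else(|| format!("no dictionary received for dict_id {dict_id}"))?;
                // Look up each key in the stored values to rebuild the column.
                let column = keys
                    .into_iter()
                    .map(|key| match key {
                        None => Ok(None),
                        Some(k) => values
                            .get(k as usize)
                            .cloned()
                            .map(Some)
                            .ok_or_else(|| format!("key {k} out of range for dict_id {dict_id}")),
                    })
                    .collect::<Result<Vec<_>, String>>()?;
                Ok(Some(column))
            }
        }
    }
}

/// Replays a stream of messages and returns every rebuilt column in order.
pub fn replay(messages: Vec<Message>) -> Result<Vec<Vec<Option<String>>>, String> {
    let mut receiver = Receiver::default();
    let mut columns = Vec::new();
    for msg in messages {
        if let Some(column) = receiver.handle(msg)? {
            columns.push(column);
        }
    }
    Ok(columns)
}
```

In this model, a record batch that arrives before its dictionary batch is an error, which mirrors why the sender emits the dictionary batch first.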
Note that since the dict_id defined in the Schema is used as a key to associate dictionary values with their arrays, each DictionaryArray in a RecordBatch must have a unique dict_id.
The current implementation does not support “delta” dictionaries, so a new dictionary batch is sent each time the encoder sees a dictionary that is not pointer-equal to the previously observed dictionary for a given dict_id.
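The pointer-equality rule can be illustrated with a minimal std-only sketch. `DictionaryTracker`, `needs_send`, and `send_plan` are hypothetical names, and `Arc<Vec<String>>` stands in for a reference-counted dictionary values array; the encoder's actual internals may differ.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical tracker that remembers the last dictionary seen per `dict_id`.
#[derive(Default)]
pub struct DictionaryTracker {
    last_seen: HashMap<i64, Arc<Vec<String>>>,
}

impl DictionaryTracker {
    /// Returns `true` when a dictionary batch must be sent for `dict_id`:
    /// either none was seen before, or `values` is a different allocation
    /// from the previously observed dictionary. The contents are never
    /// compared, so logically equal dictionaries are still resent.
    pub fn needs_send(&mut self, dict_id: i64, values: &Arc<Vec<String>>) -> bool {
        let unchanged = self
            .last_seen
            .get(&dict_id)
            .map_or(false, |previous| Arc::ptr_eq(previous, values));
        if !unchanged {
            self.last_seen.insert(dict_id, Arc::clone(values));
        }
        !unchanged
    }
}

/// For a sequence of (dict_id, dictionary) observations, reports which
/// ones would trigger a new dictionary batch.
pub fn send_plan(observations: &[(i64, Arc<Vec<String>>)]) -> Vec<bool> {
    let mut tracker = DictionaryTracker::default();
    observations
        .iter()
        .map(|(dict_id, values)| tracker.needs_send(*dict_id, values))
        .collect()
}
```

Comparing pointers is cheap but conservative: two separately allocated dictionaries with identical contents both trigger a dictionary batch in this sketch, while reusing the same allocation for the same dict_id does not.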
For clients that may not support DictionaryEncoding, the DictionaryHandling::Hydrate method bypasses the process described above and “hydrates” every DictionaryArray in the batch to its underlying value type (e.g. a TypedDictionaryArray<'_, UInt32Type, Utf8Type> is sent as a StringArray). With this method, all data is sent in MessageHeader::RecordBatch messages, and the batch schema is adjusted so that all dictionary-encoded fields become fields of the dictionary value type.
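As a rough, std-only illustration of what hydration amounts to, the sketch below rewrites a field type and expands a keys-plus-values pair into a plain column. `FieldType`, `hydrated_type`, and `hydrate_column` are hypothetical names for this example and are not part of arrow-flight.

```rust
/// Toy model of a field's data type (illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub enum FieldType {
    UInt32,
    Utf8,
    /// Dictionary(key type, value type).
    Dictionary(Box<FieldType>, Box<FieldType>),
}

/// Schema adjustment: a dictionary-encoded field becomes a field of the
/// dictionary value type; any other field type is returned unchanged.
pub fn hydrated_type(ty: &FieldType) -> FieldType {
    match ty {
        FieldType::Dictionary(_, value) => hydrated_type(value),
        other => other.clone(),
    }
}

/// Data adjustment: replaces every key with the value it points to.
/// `None` keys model null slots and stay null. Returns `None` overall if
/// any key is out of range for `values`.
pub fn hydrate_column(keys: &[Option<u32>], values: &[&str]) -> Option<Vec<Option<String>>> {
    keys.iter()
        .map(|key| match key {
            None => Some(None),
            Some(k) => values.get(*k as usize).map(|v| Some(v.to_string())),
        })
        .collect()
}
```

The hydrated column repeats each value once per row that references it, which is where the extra network volume of this mode comes from; in exchange, the receiver needs no per-dict_id state.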
Variants§
Hydrate
Expands to the underlying type (default). This likely sends more data over the network, but it requires less memory (dictionaries are not tracked) and is more compatible with other Arrow Flight client implementations that may not support DictionaryEncoding.
See also:
Resend
Send dictionary FlightData with every RecordBatch that contains a DictionaryArray. See Self::Hydrate for more tradeoffs. No attempt is made to skip sending the same (logical) dictionary values twice.
This requires identifying the different dictionaries in use and assigning
Trait Implementations§
impl Debug for DictionaryHandling
impl PartialEq for DictionaryHandling
impl StructuralPartialEq for DictionaryHandling
Auto Trait Implementations§
impl Freeze for DictionaryHandling
impl RefUnwindSafe for DictionaryHandling
impl Send for DictionaryHandling
impl Sync for DictionaryHandling
impl Unpin for DictionaryHandling
impl UnwindSafe for DictionaryHandling
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wraps the input message T in a tonic::Request