pub enum DictionaryHandling {
    Hydrate,
    Resend,
}
Defines how a FlightDataEncoder encodes DictionaryArrays.
In the Arrow Flight protocol, dictionary values and keys are sent as two separate messages.
When a sender is encoding a RecordBatch containing DictionaryArray columns, it will
first send a dictionary batch (a batch with header MessageHeader::DictionaryBatch) containing
the dictionary values. The receiver is responsible for reading this batch and maintaining state that associates
those dictionary values with the corresponding array using the dict_id as a key.
After sending the dictionary batch, the sender will send the array data in a batch with header MessageHeader::RecordBatch.
For any dictionary array columns in this message, the encoded flight message will contain only the dictionary keys. The receiver
is then responsible for rebuilding the DictionaryArray on the client side using the dictionary values from the DictionaryBatch message
and the keys from the RecordBatch message.
For example, if we have a batch with a TypedDictionaryArray<'_, UInt32Type, Utf8Type> (a dictionary array where the keys are u32 and the
values are String), then the DictionaryBatch will contain a StringArray and the RecordBatch will contain a UInt32Array.
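The exchange described above can be modeled with plain standard-library types. This is a minimal sketch, not the arrow-flight implementation: `Message`, `encode`, and `Receiver` are hypothetical names, dictionary values are modeled as `Vec<String>`, keys as `Vec<u32>`, and nulls are ignored.

```rust
use std::collections::HashMap;

/// Simplified stand-ins for the two message kinds (hypothetical types).
#[derive(Debug, Clone, PartialEq)]
enum Message {
    /// Carries the dictionary values for one `dict_id`.
    DictionaryBatch { dict_id: i64, values: Vec<String> },
    /// Carries only the keys of the dictionary-encoded column.
    RecordBatch { dict_id: i64, keys: Vec<u32> },
}

/// Sender side: emit the dictionary values first, then the keys.
fn encode(dict_id: i64, values: &[String], keys: &[u32]) -> Vec<Message> {
    vec![
        Message::DictionaryBatch { dict_id, values: values.to_vec() },
        Message::RecordBatch { dict_id, keys: keys.to_vec() },
    ]
}

/// Receiver side: remembers the latest values seen for each `dict_id`.
#[derive(Default)]
struct Receiver {
    dictionaries: HashMap<i64, Vec<String>>,
}

impl Receiver {
    /// A `DictionaryBatch` only updates state and yields `None`.
    /// A `RecordBatch` yields the rebuilt column, or `None` if its
    /// dictionary is unknown or a key is out of range.
    fn receive(&mut self, message: Message) -> Option<Vec<String>> {
        match message {
            Message::DictionaryBatch { dict_id, values } => {
                self.dictionaries.insert(dict_id, values);
                None
            }
            Message::RecordBatch { dict_id, keys } => {
                let values = self.dictionaries.get(&dict_id)?;
                keys.iter()
                    .map(|&k| values.get(k as usize).cloned())
                    .collect()
            }
        }
    }
}
```

Feeding the two messages produced by `encode` into a `Receiver` in order rebuilds the original column; sending the `RecordBatch` first would fail because the receiver has no values for that `dict_id` yet.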
Note that because the dict_id defined in the Schema is used as the key that associates dictionary values with their arrays, each
DictionaryArray in a RecordBatch must have a unique dict_id.
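One way to check the uniqueness requirement before encoding is sketched below. It assumes the caller has already collected the dict_id of every dictionary column into a slice; `first_duplicate_dict_id` is a hypothetical helper, not part of arrow-flight.

```rust
use std::collections::HashSet;

/// Returns the first `dict_id` that appears more than once, if any.
/// (Hypothetical helper for illustration.)
fn first_duplicate_dict_id(dict_ids: &[i64]) -> Option<i64> {
    let mut seen = HashSet::new();
    // `insert` returns false when the id was already present.
    dict_ids.iter().copied().find(|id| !seen.insert(*id))
}
```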
The current implementation does not support “delta” dictionaries, so a new dictionary batch will be sent each time the encoder sees a
dictionary which is not pointer-equal to the previously observed dictionary for a given dict_id.
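The pointer-equality rule can be illustrated with `Arc::ptr_eq` from the standard library. This is a sketch under stated assumptions: `DictionaryTracker` and `needs_send` are hypothetical names, and `Arc<Vec<String>>` stands in for the shared dictionary values buffer.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical tracker: remembers the last dictionary seen per `dict_id`.
#[derive(Default)]
struct DictionaryTracker {
    last_seen: HashMap<i64, Arc<Vec<String>>>,
}

impl DictionaryTracker {
    /// Returns `true` when a dictionary batch would have to be sent, i.e.
    /// when `values` is not the same allocation as the dictionary last
    /// recorded for `dict_id`. Contents are never compared.
    fn needs_send(&mut self, dict_id: i64, values: &Arc<Vec<String>>) -> bool {
        let unchanged = self
            .last_seen
            .get(&dict_id)
            .map_or(false, |previous| Arc::ptr_eq(previous, values));
        if !unchanged {
            self.last_seen.insert(dict_id, Arc::clone(values));
        }
        !unchanged
    }
}
```

Because only pointers are compared, two dictionaries with identical contents but separate allocations both trigger a send in this model.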
For clients which may not support DictionaryEncoding, the DictionaryHandling::Hydrate method will bypass the process defined above
and “hydrate” any DictionaryArray in the batch to its underlying value type (e.g. TypedDictionaryArray<'_, UInt32Type, Utf8Type> will
be sent as a StringArray). With this method, all data will be sent in MessageHeader::RecordBatch messages and the batch schema
will be adjusted so that all dictionary encoded fields are changed to fields of the dictionary value type.
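Conceptually, hydration replaces each key with the value it points to. The sketch below shows that expansion on plain standard-library types; `hydrate` is a hypothetical function, `None` keys model null slots, and the real encoder operates on Arrow arrays rather than slices.

```rust
/// Expands a dictionary-encoded column into plain values.
/// A `None` key yields a `None` value; in this simplified model an
/// out-of-range key also yields `None`.
fn hydrate(values: &[String], keys: &[Option<u32>]) -> Vec<Option<String>> {
    keys.iter()
        .map(|&key| key.and_then(|k| values.get(k as usize).cloned()))
        .collect()
}
```

Every repeated key produces a full copy of its value, which is why hydrated batches tend to be larger on the wire than the keys-plus-dictionary form.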
Variants§
Hydrate
Expands to the underlying type (default). This likely sends more data
over the network, but it requires less memory (dictionaries are not tracked)
and is more compatible with other Arrow Flight client implementations
that may not support DictionaryEncoding.
See also:
Resend
Send dictionary FlightData with every RecordBatch that contains a
DictionaryArray. See Self::Hydrate for more tradeoffs. No
attempt is made to skip sending the same (logical) dictionary values
twice.
This requires identifying the different dictionaries in use and assigning
Trait Implementations§
impl Debug for DictionaryHandling
impl PartialEq for DictionaryHandling
impl StructuralPartialEq for DictionaryHandling
Auto Trait Implementations§
impl Freeze for DictionaryHandling
impl RefUnwindSafe for DictionaryHandling
impl Send for DictionaryHandling
impl Sync for DictionaryHandling
impl Unpin for DictionaryHandling
impl UnwindSafe for DictionaryHandling
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wrap the input message T in a tonic::Request
impl<L> LayerExt<L> for L
fn named_layer<S>(&self, service: S) -> Layered<<L as Layer<S>>::Service, S>
where
    L: Layer<S>,
Applies the layer to a service and wraps it in Layered.