arrow_flight::encode

Enum DictionaryHandling

pub enum DictionaryHandling {
    Hydrate,
    Resend,
}

Defines how a FlightDataEncoder encodes DictionaryArrays

In the Arrow Flight protocol, dictionary values and keys are sent as two separate messages. When a sender is encoding a RecordBatch containing DictionaryArray columns, it will first send a dictionary batch (a batch with header MessageHeader::DictionaryBatch) containing the dictionary values. The receiver is responsible for reading this batch and maintaining state that associates those dictionary values with the corresponding array, using the dict_id as a key.

After sending the dictionary batch, the sender will send the array data in a batch with header MessageHeader::RecordBatch. For any dictionary arrays in this message, the encoded flight message will only contain the dictionary keys. The receiver is then responsible for rebuilding the DictionaryArray on the client side using the dictionary values from the DictionaryBatch message and the keys from the RecordBatch message.

For example, if we have a batch with a TypedDictionaryArray<'_, UInt32Type, Utf8Type> (a dictionary array where the keys are u32 and the values are String), then the DictionaryBatch will contain a StringArray and the RecordBatch will contain a UInt32Array.
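
The exchange described above can be sketched with a simplified model that uses only the standard library. The `Message` enum and `Receiver` struct below are hypothetical stand-ins, not arrow-rs types; they only illustrate how a receiver might store dictionary values under their dict_id and combine them with the keys that arrive in a later message.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the two message kinds.
#[derive(Debug)]
enum Message {
    /// Carries the dictionary values (a StringArray in the example above).
    DictionaryBatch { dict_id: i64, values: Vec<String> },
    /// Carries only the keys (a UInt32Array in the example above).
    RecordBatch { dict_id: i64, keys: Vec<u32> },
}

/// Receiver-side state: dictionary values keyed by dict_id.
#[derive(Default)]
struct Receiver {
    dictionaries: HashMap<i64, Vec<String>>,
}

impl Receiver {
    /// Stores the values of a dictionary batch and returns `None`.
    /// For a record batch, returns the rebuilt column, or `None` when no
    /// dictionary is known for `dict_id` or a key is out of range.
    fn handle(&mut self, msg: Message) -> Option<Vec<String>> {
        match msg {
            Message::DictionaryBatch { dict_id, values } => {
                self.dictionaries.insert(dict_id, values);
                None
            }
            Message::RecordBatch { dict_id, keys } => {
                let values = self.dictionaries.get(&dict_id)?;
                keys.iter()
                    .map(|&k| values.get(k as usize).cloned())
                    .collect()
            }
        }
    }
}
```

In this model a record batch that arrives before its dictionary batch cannot be decoded, which is why the sender emits the dictionary batch first.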

Note that since the dict_id defined in the Schema is used as a key to associate dictionary values with their arrays, each DictionaryArray in a RecordBatch must have a unique dict_id.
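
A sender could check the uniqueness requirement before encoding. The sketch below assumes the dict_id values of the dictionary fields have already been collected into a slice; `first_duplicate_dict_id` is a hypothetical helper and not part of arrow-flight.

```rust
use std::collections::HashSet;

/// Returns the first dict_id that appears more than once, if any.
/// Hypothetical helper: the caller is assumed to have gathered the
/// dict_id of every dictionary-encoded field in the schema.
fn first_duplicate_dict_id(dict_ids: &[i64]) -> Option<i64> {
    let mut seen = HashSet::new();
    // `insert` returns false when the value was already present.
    dict_ids.iter().copied().find(|id| !seen.insert(*id))
}
```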

The current implementation does not support “delta” dictionaries, so a new dictionary batch will be sent each time the encoder sees a dictionary which is not pointer-equal to the previously observed dictionary for a given dict_id.
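
The pointer-equality rule can be illustrated with `Arc::ptr_eq` from the standard library. `SentDictionaries` below is a hypothetical, simplified tracker and not the encoder's actual bookkeeping; it assumes dictionary values are shared through an `Arc`.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical tracker: remembers the last dictionary seen per dict_id.
#[derive(Default)]
struct SentDictionaries {
    last_sent: HashMap<i64, Arc<Vec<String>>>,
}

impl SentDictionaries {
    /// Returns true when a new dictionary batch has to be sent, that is
    /// when `values` is not the same allocation as the dictionary last
    /// recorded for `dict_id`. Contents are never compared.
    fn needs_send(&mut self, dict_id: i64, values: &Arc<Vec<String>>) -> bool {
        let unchanged = self
            .last_sent
            .get(&dict_id)
            .map_or(false, |prev| Arc::ptr_eq(prev, values));
        if !unchanged {
            self.last_sent.insert(dict_id, Arc::clone(values));
        }
        !unchanged
    }
}
```

Because only pointers are compared, two dictionaries with identical contents but separate allocations both trigger a send in this sketch.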

For clients which may not support DictionaryEncoding, the DictionaryHandling::Hydrate method will bypass the process defined above and “hydrate” any DictionaryArray in the batch to its underlying value type (e.g. TypedDictionaryArray<'_, UInt32Type, Utf8Type> will be sent as a StringArray). With this method, all data will be sent in MessageHeader::RecordBatch messages and the batch schema will be adjusted so that all dictionary encoded fields are changed to fields of the dictionary value type.
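
Conceptually, hydrating replaces every key with the value it refers to. The function below is a minimal sketch over plain vectors, assuming string values and optional (nullable) u32 keys; `hydrate` is a hypothetical name, not the arrow-rs implementation.

```rust
/// Expands dictionary keys into the values they point to.
/// `None` keys model null entries and stay null in the output.
/// Panics if a key is out of range for `values`.
fn hydrate(values: &[String], keys: &[Option<u32>]) -> Vec<Option<String>> {
    keys.iter()
        .map(|&key| key.map(|k| values[k as usize].clone()))
        .collect()
}
```

The output holds one full value per row, which is why hydrating tends to send more data than sending keys plus a single dictionary.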

Variants

Hydrate

Expands to the underlying type (default). This likely sends more data over the network but requires less memory (dictionaries are not tracked) and is more compatible with other Arrow Flight client implementations that may not support DictionaryEncoding.

Resend

Send dictionary FlightData with every RecordBatch that contains a DictionaryArray. See Self::Hydrate for more tradeoffs. No attempt is made to skip sending the same (logical) dictionary values twice.

This requires identifying the different dictionaries in use and assigning

Trait Implementations

impl Debug for DictionaryHandling

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl PartialEq for DictionaryHandling

fn eq(&self, other: &DictionaryHandling) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl StructuralPartialEq for DictionaryHandling

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoRequest<T> for T

fn into_request(self) -> Request<T>

Wrap the input message T in a tonic::Request

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,