Struct RecordBatchDecoder

Source
struct RecordBatchDecoder<'a> {
    batch: RecordBatch<'a>,
    schema: SchemaRef,
    dictionaries_by_id: &'a HashMap<i64, ArrayRef>,
    compression: Option<CompressionCodec>,
    version: MetadataVersion,
    data: &'a Buffer,
    nodes: VectorIter<'a, FieldNode>,
    buffers: VectorIter<'a, Buffer>,
    projection: Option<&'a [usize]>,
    require_alignment: bool,
    skip_validation: UnsafeFlag,
}

State for decoding Arrow arrays from an IPC RecordBatch structure (the flatbuffers-encoded message) into an Arrow [RecordBatch].

Fields§

§batch: RecordBatch<'a>

The flatbuffers encoded record batch

§schema: SchemaRef

The output schema

§dictionaries_by_id: &'a HashMap<i64, ArrayRef>

Decoded dictionaries indexed by dictionary id

§compression: Option<CompressionCodec>

Optional compression codec

§version: MetadataVersion

The format version

§data: &'a Buffer

The raw data buffer

§nodes: VectorIter<'a, FieldNode>

The fields comprising this array

§buffers: VectorIter<'a, Buffer>

The buffers comprising this array

§projection: Option<&'a [usize]>

Projection (subset of columns) to read, if any. See RecordBatchDecoder::with_projection for details.

§require_alignment: bool

Are buffers required to already be aligned? See RecordBatchDecoder::with_require_alignment for details.

§skip_validation: UnsafeFlag

Should validation be skipped when reading data? Defaults to false.

See FileDecoder::with_skip_validation for details.

Implementations§

Source§

impl RecordBatchDecoder<'_>

Source

fn create_array( &mut self, field: &Field, variadic_counts: &mut VecDeque<i64>, ) -> Result<ArrayRef, ArrowError>

Coordinates reading arrays based on data types.

variadic_counts encodes the number of buffers to read for variadic types (e.g., Utf8View, BinaryView). When we encounter such a type, we pop from the front of the queue to get the number of buffers to read.
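The queue handling described above can be sketched with the standard library alone. `Kind` and `buffers_to_read` are hypothetical names introduced for illustration, and the fixed count of two buffers (validity plus values or views) is an assumption of this sketch rather than a statement about the decoder's internals.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the data types the decoder walks over.
enum Kind {
    /// Fixed layout: a validity buffer plus a values buffer.
    Primitive,
    /// View layout: validity buffer, views buffer, then N variadic data buffers.
    View,
}

/// Returns how many buffers a field of the given kind consumes.
/// For view types the variadic count is popped from the front of the queue.
fn buffers_to_read(kind: &Kind, variadic_counts: &mut VecDeque<i64>) -> Result<usize, String> {
    match kind {
        Kind::Primitive => Ok(2),
        Kind::View => {
            let count = variadic_counts
                .pop_front()
                .ok_or_else(|| "missing variadic count for view type".to_string())?;
            if count < 0 {
                return Err(format!("invalid variadic count: {}", count));
            }
            // Two fixed buffers plus the variadic data buffers.
            Ok(2 + count as usize)
        }
    }
}
```

Because counts are consumed front to back, the queue must list them in the same order in which the view-typed fields appear during the walk.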

Notes:

  • In the IPC format, null buffers are always set, but may be empty. We discard them if an array has 0 nulls.
  • Numeric values inside list arrays are often stored as 64-bit values regardless of their data type size. We thus:
    • check if the bit width of non-64-bit numbers is 64, and
    • read the buffer as 64-bit (signed integer or float), and
    • cast the 64-bit array to the appropriate data type
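
The first note, on discarding null buffers, can be illustrated with a small standalone function. `validity_to_keep` is a hypothetical helper, and plain `Vec<u8>` stands in for an Arrow buffer; this is a sketch of the idea, not the decoder's code.

```rust
/// Keeps the validity bitmap only when the node reports at least one null.
/// `null_count` mirrors the count carried by an IPC field node.
fn validity_to_keep(validity: Vec<u8>, null_count: i64) -> Option<Vec<u8>> {
    if null_count > 0 {
        Some(validity)
    } else {
        // The buffer is present in the message but carries no information.
        None
    }
}
```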
Source

fn create_primitive_array( &self, field_node: &FieldNode, data_type: &DataType, buffers: &[Buffer], ) -> Result<ArrayRef, ArrowError>

Reads the correct number of buffers based on data type and null_count, and creates a primitive array ref

Source

fn create_array_from_builder( &self, builder: ArrayDataBuilder, ) -> Result<ArrayRef, ArrowError>

Update the ArrayDataBuilder based on settings in this decoder

Source

fn create_list_array( &self, field_node: &FieldNode, data_type: &DataType, buffers: &[Buffer], child_array: ArrayRef, ) -> Result<ArrayRef, ArrowError>

Reads the correct number of buffers based on list type and null_count, and creates a list array ref

Source

fn create_struct_array( &self, struct_node: &FieldNode, null_buffer: Buffer, struct_fields: &Fields, struct_arrays: Vec<ArrayRef>, ) -> Result<ArrayRef, ArrowError>

Source

fn create_dictionary_array( &self, field_node: &FieldNode, data_type: &DataType, buffers: &[Buffer], value_array: ArrayRef, ) -> Result<ArrayRef, ArrowError>

Reads the correct number of buffers based on the dictionary type and null_count, and creates a dictionary array ref

Source§

impl<'a> RecordBatchDecoder<'a>

Source

fn try_new( buf: &'a Buffer, batch: RecordBatch<'a>, schema: SchemaRef, dictionaries_by_id: &'a HashMap<i64, ArrayRef>, metadata: &'a MetadataVersion, ) -> Result<Self, ArrowError>

Create a reader for decoding arrays from an encoded [RecordBatch]

Source

pub fn with_projection(self, projection: Option<&'a [usize]>) -> Self

Set the projection (default: None)

If set, the projection is the list of column indices that will be read
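
As a rough illustration of what a projection means for the column walk, the sketch below decodes only the selected indices. `read_projected` and its `decode` callback are hypothetical, and returning columns in projection-list order is an assumption of this sketch.

```rust
/// Walks every column in schema order, decoding the selected ones and
/// skipping the rest. `decode` stands in for the per-column decode step.
fn read_projected<T, F>(num_columns: usize, projection: Option<&[usize]>, mut decode: F) -> Vec<T>
where
    F: FnMut(usize) -> T,
{
    match projection {
        // No projection: every column is decoded.
        None => (0..num_columns).map(|i| decode(i)).collect(),
        Some(indices) => {
            let mut selected: Vec<(usize, T)> = Vec::new();
            for column in 0..num_columns {
                // Position of this column within the projection, if selected.
                if let Some(pos) = indices.iter().position(|&p| p == column) {
                    selected.push((pos, decode(column)));
                }
                // Otherwise the column's nodes and buffers would be skipped.
            }
            // Sketch assumption: output follows the order of the projection list.
            selected.sort_by_key(|entry| entry.0);
            selected.into_iter().map(|entry| entry.1).collect()
        }
    }
}
```

Even unselected columns have to be walked, because their field nodes and buffers sit in the same sequential streams as the selected ones.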

Source

pub fn with_require_alignment(self, require_alignment: bool) -> Self

Set require_alignment (default: false)

If true, buffers must be aligned appropriately or an error will result. If false, buffers will be copied to aligned buffers if necessary.
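
The two behaviours can be sketched with standard-library types. `as_i32s` is a hypothetical function that treats a byte slice as `i32` values; it only illustrates the "error or copy" choice and does not reflect how the decoder handles Arrow buffers internally.

```rust
use std::borrow::Cow;

/// Interprets `bytes` as native-endian `i32` values.
///
/// If the bytes are suitably aligned they are borrowed without copying.
/// Otherwise the call either fails (`require_alignment == true`) or copies
/// the values into a freshly allocated, aligned vector.
fn as_i32s<'a>(bytes: &'a [u8], require_alignment: bool) -> Result<Cow<'a, [i32]>, String> {
    if bytes.len() % 4 != 0 {
        return Err("length is not a multiple of 4".to_string());
    }
    // SAFETY: every bit pattern is a valid `i32`, so reinterpreting the
    // aligned middle part of a byte slice is sound.
    let (prefix, aligned, suffix) = unsafe { bytes.align_to::<i32>() };
    if prefix.is_empty() && suffix.is_empty() {
        return Ok(Cow::Borrowed(aligned));
    }
    if require_alignment {
        return Err("buffer is not aligned for i32".to_string());
    }
    // Fallback: copy the values out one by one into an aligned allocation.
    let copied: Vec<i32> = bytes
        .chunks_exact(4)
        .map(|c| i32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    Ok(Cow::Owned(copied))
}
```

Requiring alignment trades convenience for predictability: a caller that sets it to true never pays for a hidden copy, but must supply aligned input.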

Source

pub(crate) fn with_skip_validation(self, skip_validation: UnsafeFlag) -> Self

Specifies if validation should be skipped when reading data (defaults to false)

Note this API is somewhat “funky” as it allows the caller to skip validation without having to use unsafe code. If this is ever made public, it should be made clearer that it is potentially unsafe, for example by using an unsafe function that takes a boolean flag.

§Safety

Relies on the caller only passing a flag with a true value if they are certain that the data is valid.
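
The idea of a flag that is safe to read and pass around but needs `unsafe` to enable can be sketched as follows. `SkipFlag` and `check_offsets` are hypothetical names for illustration only; they are not the crate's `UnsafeFlag` type or its validation logic.

```rust
/// Hypothetical flag type: reading is safe, but setting it requires `unsafe`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SkipFlag(bool);

impl SkipFlag {
    /// Creates a flag in the safe (validating) state.
    pub fn new() -> Self {
        SkipFlag(false)
    }

    /// # Safety
    /// Passing `true` asserts that the data is already known to be valid.
    pub unsafe fn set(&mut self, value: bool) {
        self.0 = value;
    }

    /// Returns whether validation should be skipped.
    pub fn get(&self) -> bool {
        self.0
    }
}

/// Hypothetical decoder step: validates `offsets` unless the flag says otherwise.
fn check_offsets(offsets: &[i32], skip_validation: SkipFlag) -> Result<(), String> {
    if skip_validation.get() {
        return Ok(());
    }
    if offsets.windows(2).all(|w| w[0] <= w[1]) {
        Ok(())
    } else {
        Err("offsets are not monotonically non-decreasing".to_string())
    }
}
```

With this shape, a function that accepts the flag stays safe to call: the `unsafe` obligation sits at the single place where the flag is switched on.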

Source

fn read_record_batch(self) -> Result<RecordBatch, ArrowError>

Read the record batch, consuming the reader

Source

fn next_buffer(&mut self) -> Result<Buffer, ArrowError>

Source

fn skip_buffer(&mut self)

Source

fn next_node(&mut self, field: &Field) -> Result<&'a FieldNode, ArrowError>

Source

fn skip_field( &mut self, field: &Field, variadic_count: &mut VecDeque<i64>, ) -> Result<(), ArrowError>

Auto Trait Implementations§

§

impl<'a> Freeze for RecordBatchDecoder<'a>

§

impl<'a> !RefUnwindSafe for RecordBatchDecoder<'a>

§

impl<'a> Send for RecordBatchDecoder<'a>

§

impl<'a> Sync for RecordBatchDecoder<'a>

§

impl<'a> Unpin for RecordBatchDecoder<'a>

§

impl<'a> !UnwindSafe for RecordBatchDecoder<'a>

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.