arrow_array::array::union_array

Struct UnionArray

pub struct UnionArray {
    data_type: DataType,
    type_ids: ScalarBuffer<i8>,
    offsets: Option<ScalarBuffer<i32>>,
    fields: Vec<Option<ArrayRef>>,
}

An array of values of varying types

Each slot in a UnionArray can have a value chosen from a number of types. Each of the possible types is named, like the fields of a StructArray. A UnionArray can have one of two memory layouts, “dense” or “sparse”. For more information on these layouts, please see the Arrow specification.
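To make the difference between the two layouts concrete, here is a minimal sketch that models a union of i32 and f64 children with plain slices. The `Slot` enum and both functions are hypothetical illustrations, not part of arrow_array, and the sketch assumes type id 0 selects the integer child and type id 1 the float child.

```rust
/// Plain stand-in for one union value (hypothetical type, not an arrow type).
#[derive(Debug, PartialEq)]
enum Slot {
    Int(i32),
    Float(f64),
}

/// Sparse layout: every child is as long as the union itself,
/// so slot `i` reads position `i` of the child selected by `type_ids[i]`.
fn sparse_value(type_ids: &[i8], ints: &[i32], floats: &[f64], i: usize) -> Slot {
    match type_ids[i] {
        0 => Slot::Int(ints[i]),
        1 => Slot::Float(floats[i]),
        other => panic!("unknown type id {}", other),
    }
}

/// Dense layout: each child stores only its own values,
/// and `offsets[i]` gives the position inside the selected child.
fn dense_value(
    type_ids: &[i8],
    offsets: &[i32],
    ints: &[i32],
    floats: &[f64],
    i: usize,
) -> Slot {
    let pos = offsets[i] as usize;
    match type_ids[i] {
        0 => Slot::Int(ints[pos]),
        1 => Slot::Float(floats[pos]),
        other => panic!("unknown type id {}", other),
    }
}
```

With type ids `[0, 1, 0]`, the dense form of `[1, 3.2, 34]` needs children `[1, 34]` and `[3.2]` plus offsets `[0, 0, 1]`, while the sparse form needs two children of length 3 with placeholder entries in the unused positions.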

UnionBuilder can be used to create UnionArrays of primitive types. UnionArrays of nested types are also supported, but not via UnionBuilder; see the tests for examples.

§Examples

§Create a dense UnionArray [1, 3.2, 34]

use arrow_buffer::ScalarBuffer;
use arrow_schema::*;
use std::sync::Arc;
use arrow_array::{Array, Int32Array, Float64Array, UnionArray};

let int_array = Int32Array::from(vec![1, 34]);
let float_array = Float64Array::from(vec![3.2]);
let type_ids = [0, 1, 0].into_iter().collect::<ScalarBuffer<i8>>();
let offsets = [0, 0, 1].into_iter().collect::<ScalarBuffer<i32>>();

let union_fields = [
    (0, Arc::new(Field::new("A", DataType::Int32, false))),
    (1, Arc::new(Field::new("B", DataType::Float64, false))),
].into_iter().collect::<UnionFields>();

let children = vec![
    Arc::new(int_array) as Arc<dyn Array>,
    Arc::new(float_array),
];

let array = UnionArray::try_new(
    union_fields,
    type_ids,
    Some(offsets),
    children,
).unwrap();

let value = array.value(0).as_any().downcast_ref::<Int32Array>().unwrap().value(0);
assert_eq!(1, value);

let value = array.value(1).as_any().downcast_ref::<Float64Array>().unwrap().value(0);
assert!((3.2 - value).abs() < f64::EPSILON);

let value = array.value(2).as_any().downcast_ref::<Int32Array>().unwrap().value(0);
assert_eq!(34, value);

§Create a sparse UnionArray [1, 3.2, 34]

use arrow_buffer::ScalarBuffer;
use arrow_schema::*;
use std::sync::Arc;
use arrow_array::{Array, Int32Array, Float64Array, UnionArray};

let int_array = Int32Array::from(vec![Some(1), None, Some(34)]);
let float_array = Float64Array::from(vec![None, Some(3.2), None]);
let type_ids = [0_i8, 1, 0].into_iter().collect::<ScalarBuffer<i8>>();

let union_fields = [
    (0, Arc::new(Field::new("A", DataType::Int32, false))),
    (1, Arc::new(Field::new("B", DataType::Float64, false))),
].into_iter().collect::<UnionFields>();

let children = vec![
    Arc::new(int_array) as Arc<dyn Array>,
    Arc::new(float_array),
];

let array = UnionArray::try_new(
    union_fields,
    type_ids,
    None,
    children,
).unwrap();

let value = array.value(0).as_any().downcast_ref::<Int32Array>().unwrap().value(0);
assert_eq!(1, value);

let value = array.value(1).as_any().downcast_ref::<Float64Array>().unwrap().value(0);
assert!((3.2 - value).abs() < f64::EPSILON);

let value = array.value(2).as_any().downcast_ref::<Int32Array>().unwrap().value(0);
assert_eq!(34, value);

Fields§

data_type: DataType
type_ids: ScalarBuffer<i8>
offsets: Option<ScalarBuffer<i32>>
fields: Vec<Option<ArrayRef>>

Implementations§

impl UnionArray

pub unsafe fn new_unchecked( fields: UnionFields, type_ids: ScalarBuffer<i8>, offsets: Option<ScalarBuffer<i32>>, children: Vec<ArrayRef>, ) -> Self

Creates a new UnionArray.

Accepts type ids, child arrays and optionally offsets (for dense unions) to create a new UnionArray. This method makes no attempt to validate the data provided by the caller and assumes that each of the components are correct and consistent with each other. See try_new for an alternative that validates the data provided.

§Safety

The type_ids values should be non-negative and must match one of the type ids of the fields provided in fields. These values are used to index into the children arrays.

The offsets are provided in the case of a dense union; sparse unions should use None. If provided, the offsets values should be non-negative and must be less than the length of the corresponding array.

In both cases above we use signed integer types to maintain compatibility with other Arrow implementations.
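The invariants above can be sketched as explicit checks over plain slices. This is an illustrative helper only: `check_union_parts` and its parameters are hypothetical, and it does not claim to reproduce the validation that try_new performs. It assumes `field_type_ids[k]` is the type id of child `k` and `child_lens[k]` is the length of that child.

```rust
/// Checks the safety invariants described above (hypothetical helper,
/// not the arrow-rs implementation).
fn check_union_parts(
    field_type_ids: &[i8],
    child_lens: &[usize],
    type_ids: &[i8],
    offsets: Option<&[i32]>,
) -> Result<(), String> {
    if field_type_ids.len() != child_lens.len() {
        return Err("one child array is required per field".to_string());
    }
    match offsets {
        // Dense: there must be exactly one offset per slot.
        Some(offsets) if offsets.len() != type_ids.len() => {
            return Err("offsets and type_ids must have the same length".to_string());
        }
        // Sparse: every child must cover every slot of the union.
        None if child_lens.iter().any(|&len| len != type_ids.len()) => {
            return Err("sparse children must match the union length".to_string());
        }
        _ => {}
    }
    for (slot, &id) in type_ids.iter().enumerate() {
        // Each type id must be non-negative and name one of the declared fields.
        let child = field_type_ids
            .iter()
            .position(|&f| id >= 0 && f == id)
            .ok_or_else(|| format!("slot {}: unknown type id {}", slot, id))?;
        if let Some(offsets) = offsets {
            // Each dense offset must index into the selected child.
            let off = offsets[slot];
            if off < 0 || off as usize >= child_lens[child] {
                return Err(format!("slot {}: offset {} out of range", slot, off));
            }
        }
    }
    Ok(())
}
```

A caller of new_unchecked is responsible for guaranteeing conditions of this kind itself, since that constructor performs no validation.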

pub fn try_new( fields: UnionFields, type_ids: ScalarBuffer<i8>, offsets: Option<ScalarBuffer<i32>>, children: Vec<ArrayRef>, ) -> Result<Self, ArrowError>

Attempts to create a new UnionArray, validating the inputs provided.

The order of the child arrays must match the order of the fields.

pub fn child(&self, type_id: i8) -> &ArrayRef

Accesses the child array for type_id.

§Panics

Panics if the provided type_id is not present in the array’s Union DataType.

pub fn type_id(&self, index: usize) -> i8

Returns the type_id for the array slot at index.

§Panics

Panics if index is greater than or equal to the length of the array.

pub fn type_ids(&self) -> &ScalarBuffer<i8>

Returns the type_ids buffer for this array.

pub fn offsets(&self) -> Option<&ScalarBuffer<i32>>

Returns the offsets buffer if this is a dense array.

pub fn value_offset(&self, index: usize) -> usize

Returns the offset into the underlying values array for the array slot at index.

§Panics

Panics if index is greater than or equal to the length of the array.

pub fn value(&self, i: usize) -> ArrayRef

Returns the array’s value at index i.

§Panics

Panics if index i is out of bounds.

pub fn type_names(&self) -> Vec<&str>

Returns the names of the types in the union.

fn is_dense(&self) -> bool

Returns whether the UnionArray is dense (or sparse if false).

pub fn slice(&self, offset: usize, length: usize) -> Self

Returns a zero-copy slice of this array with the indicated offset and length.

pub fn into_parts( self, ) -> (UnionFields, ScalarBuffer<i8>, Option<ScalarBuffer<i32>>, Vec<ArrayRef>)

Deconstructs this array into its constituent parts.

§Example

use arrow_array::builder::UnionBuilder;
use arrow_array::types::Int32Type;
use arrow_array::UnionArray;

let mut builder = UnionBuilder::new_dense();
builder.append::<Int32Type>("a", 1).unwrap();
let union_array = builder.build().unwrap();

// Deconstruct into parts
let (union_fields, type_ids, offsets, children) = union_array.into_parts();

// Reconstruct from parts
let union_array = UnionArray::try_new(
    union_fields,
    type_ids,
    offsets,
    children,
).unwrap();

fn mask_sparse_skip_without_nulls( &self, nulls: Vec<(i8, NullBuffer)>, ) -> BooleanBuffer

Computes the logical nulls for a sparse union, optimized for when many fields have no nulls.

fn mask_sparse_skip_fully_null( &self, nulls: Vec<(i8, NullBuffer)>, ) -> BooleanBuffer

Computes the logical nulls for a sparse union, optimized for when many fields are fully null.

fn mask_sparse_all_with_nulls_skip_one( &self, nulls: Vec<(i8, NullBuffer)>, ) -> BooleanBuffer

Computes the logical nulls for a sparse union, optimized for when all fields contain nulls.

fn mask_sparse_helper( &self, nulls: Vec<(i8, NullBuffer)>, mask_chunk: impl FnMut(&[i8; 64], &mut [(i8, BitChunkIterator<'_>)]) -> u64, mask_remainder: impl FnOnce(&[i8], &[(i8, BitChunks<'_>)]) -> u64, ) -> BooleanBuffer

Maps nulls to BitChunks and then to BitChunkIterators, then divides self.type_ids into exact chunks of 64 values, calling mask_chunk for every exact chunk and mask_remainder for the remainder, if any, and collecting the results in a BooleanBuffer.
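The chunking pattern described here (exact chunks of 64 values, then a remainder) can be illustrated with the standard library alone. The sketch below is a simplified stand-in: `mask_words` and its `is_valid` predicate are hypothetical, and the real helper works on null buffers rather than on a per-id predicate.

```rust
/// Builds one `u64` mask word per chunk of up to 64 type ids. Bit `j` of a
/// word is set when `is_valid` holds for the `j`-th id of that chunk.
/// (Hypothetical helper for illustration only.)
fn mask_words(type_ids: &[i8], is_valid: impl Fn(i8) -> bool) -> Vec<u64> {
    // Packs up to 64 ids into a single word, lowest bit first.
    let pack = |ids: &[i8]| {
        ids.iter()
            .enumerate()
            .fold(0u64, |word, (bit, &id)| word | ((is_valid(id) as u64) << bit))
    };
    let chunks = type_ids.chunks_exact(64);
    // The leftover ids (fewer than 64) that do not fill a whole chunk.
    let remainder = chunks.remainder();
    let mut words: Vec<u64> = chunks.map(|chunk| pack(chunk)).collect();
    if !remainder.is_empty() {
        words.push(pack(remainder));
    }
    words
}
```

Processing 64 slots per step means each step yields exactly one machine word of mask bits, which is the granularity a bit-packed boolean buffer is naturally assembled from.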

fn gather_nulls(&self, nulls: Vec<(i8, NullBuffer)>) -> BooleanBuffer

Computes the logical nulls for a sparse or dense union, by gathering individual bits from the null buffer of the selected field
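The gathering idea can be sketched with plain vectors of booleans in place of bit-packed null buffers. `gather_validity` is a hypothetical helper, and it makes two simplifying assumptions: each type id equals the index of its child, and every child carries an explicit validity vector (true meaning valid).

```rust
/// For each slot, reads the validity bit of the child selected by its type id.
/// `offsets` is `Some` for a dense union and `None` for a sparse one.
/// (Hypothetical helper; assumes type ids equal child indices.)
fn gather_validity(
    type_ids: &[i8],
    offsets: Option<&[i32]>,
    child_validity: &[Vec<bool>],
) -> Vec<bool> {
    type_ids
        .iter()
        .enumerate()
        .map(|(slot, &id)| {
            // Dense unions look up the position via offsets;
            // sparse unions use the slot index directly.
            let pos = match offsets {
                Some(offsets) => offsets[slot] as usize,
                None => slot,
            };
            child_validity[id as usize][pos]
        })
        .collect()
}
```

Because this touches one bit per slot, it works for either layout, at the cost of not processing slots in 64-wide chunks.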

Trait Implementations§

impl Array for UnionArray

fn as_any(&self) -> &dyn Any

Returns the array as Any so that it can be downcast to a specific implementation.

fn to_data(&self) -> ArrayData

Returns the underlying data of this array.

fn into_data(self) -> ArrayData

Returns the underlying data of this array.

fn data_type(&self) -> &DataType

Returns a reference to the DataType of this array.

fn slice(&self, offset: usize, length: usize) -> ArrayRef

Returns a zero-copy slice of this array with the indicated offset and length.

fn len(&self) -> usize

Returns the length (i.e., number of elements) of this array.

fn is_empty(&self) -> bool

Returns whether this array is empty.

fn shrink_to_fit(&mut self)

Shrinks the capacity of any exclusively owned buffer as much as possible.

fn offset(&self) -> usize

Returns the offset into the underlying data used by this array(-slice). Note that the underlying data can be shared by many arrays. This defaults to 0.

fn nulls(&self) -> Option<&NullBuffer>

Returns the null buffer of this array, if any.

fn logical_nulls(&self) -> Option<NullBuffer>

Returns a potentially computed NullBuffer that represents the logical null values of this array, if any.

fn is_nullable(&self) -> bool

Returns false if the array is guaranteed to not contain any logical nulls.

fn get_buffer_memory_size(&self) -> usize

Returns the total number of bytes of memory pointed to by this array. The buffers store bytes in the Arrow memory format, and include the data as well as the validity map. Note that this does not always correspond to the exact memory usage of an array, since multiple arrays can share the same buffers or slices thereof.

fn get_array_memory_size(&self) -> usize

Returns the total number of bytes of memory occupied physically by this array. This value will always be greater than the value returned by get_buffer_memory_size() and includes the overhead of the data structures that contain the pointers to the various buffers.

fn is_null(&self, index: usize) -> bool

Returns whether the element at index is null according to Array::nulls.

fn is_valid(&self, index: usize) -> bool

Returns whether the element at index is not null, the opposite of Self::is_null.

fn null_count(&self) -> usize

Returns the total number of physical null values in this array.

fn logical_null_count(&self) -> usize

Returns the total number of logical null values in this array.

impl Clone for UnionArray

fn clone(&self) -> UnionArray

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for UnionArray

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl From<ArrayData> for UnionArray

fn from(data: ArrayData) -> Self

Converts to this type from the input type.

impl From<UnionArray> for ArrayData

fn from(array: UnionArray) -> Self

Converts to this type from the input type.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut u8)

This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.

impl<T> Datum for T
where T: Array,

fn get(&self) -> (&dyn Array, bool)

Returns the value for this Datum and a boolean indicating if the value is scalar.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.