pub struct MutableArrayData<'a> {
arrays: Vec<&'a ArrayData>,
data: _MutableArrayData<'a>,
dictionary: Option<ArrayData>,
variadic_data_buffers: Vec<Buffer>,
extend_values: Vec<Box<dyn Fn(&mut _MutableArrayData<'_>, usize, usize, usize) + 'a>>,
extend_null_bits: Vec<Box<dyn Fn(&mut _MutableArrayData<'_>, usize, usize) + 'a>>,
extend_nulls: Box<dyn Fn(&mut _MutableArrayData<'_>, usize)>,
}
Efficiently create an ArrayData from one or more existing ArrayDatas by copying chunks.
The main use case of this struct is to perform unary operations on arrays of
arbitrary types, such as filter and take.
§Example
use arrow_buffer::Buffer;
use arrow_data::ArrayData;
use arrow_data::transform::MutableArrayData;
use arrow_schema::DataType;
fn i32_array(values: &[i32]) -> ArrayData {
    // The array length follows the slice, so the helper works for any number of values
    ArrayData::try_new(DataType::Int32, values.len(), None, 0, vec![Buffer::from_slice_ref(values)], vec![]).unwrap()
}
let arr1 = i32_array(&[1, 2, 3, 4, 5]);
let arr2 = i32_array(&[6, 7, 8, 9, 10]);
// Create a mutable array for copying values from arr1 and arr2, with a capacity for 6 elements
let capacity = 6;
let mut mutable = MutableArrayData::new(vec![&arr1, &arr2], false, capacity);
// Copy the first 3 elements from arr1
mutable.extend(0, 0, 3);
// Copy the last 3 elements from arr2 (the end index is exclusive)
mutable.extend(1, 2, 5);
// Complete the MutableArrayData into a new ArrayData
let frozen = mutable.freeze();
assert_eq!(frozen, i32_array(&[1, 2, 3, 8, 9, 10]));
Fields§
arrays: Vec<&'a ArrayData>
Input arrays: the data being read FROM.
Note this is “dead code” because all actual references to the arrays are stored in closures for extending values and nulls.
data: _MutableArrayData<'a>
In progress output array: the data being written TO.
Note these fields are in a separate struct, _MutableArrayData, as they cannot be in MutableArrayData itself due to mutability invariants (interior mutability): MutableArrayData contains functions that can only mutate _MutableArrayData, not MutableArrayData itself.
dictionary: Option<ArrayData>
The child data of the Array in Dictionary arrays.
This is not stored in _MutableArrayData because these values are
constant and only needed at the end, when freezing _MutableArrayData.
variadic_data_buffers: Vec<Buffer>
Variadic data buffers referenced by views.
Note this is not stored in _MutableArrayData because these values
are constant and only needed at the end, when freezing _MutableArrayData.
extend_values: Vec<Box<dyn Fn(&mut _MutableArrayData<'_>, usize, usize, usize) + 'a>>
Functions, one per input array, used to extend the output array with values from the input arrays.
Their lifetime is bound to the input arrays because they read values from them.
extend_null_bits: Vec<Box<dyn Fn(&mut _MutableArrayData<'_>, usize, usize) + 'a>>
Functions, one per input array, used to extend the output array with nulls from the input arrays.
Their lifetime is bound to the input arrays because they read nulls from them.
extend_nulls: Box<dyn Fn(&mut _MutableArrayData<'_>, usize)>
Function used to extend the output array with null elements.
This function is independent of the input arrays and therefore carries no lifetime bound.
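The split between the struct that owns the closures and the struct they mutate can be illustrated with a small, std-only model. This is a sketch of the pattern only, not the arrow implementation: Copier, Output, and ExtendFn are hypothetical names, and the model handles plain i32 slices with a Vec<bool> validity mask instead of Arrow buffers.

```rust
/// Output state that the closures are allowed to mutate
/// (plays the role of _MutableArrayData in this sketch).
struct Output {
    values: Vec<i32>,
    validity: Vec<bool>,
}

/// Hypothetical alias: a closure that appends `len` slots starting at `start`.
type ExtendFn<'a> = Box<dyn Fn(&mut Output, usize, usize) + 'a>;

/// Hypothetical, simplified stand-in for MutableArrayData.
struct Copier<'a> {
    /// The data being written TO, kept apart from the closures.
    out: Output,
    /// One closure per input; each captures its input slice, hence the 'a bound.
    extend_values: Vec<ExtendFn<'a>>,
    /// Independent of the inputs, so it carries no 'a bound.
    extend_nulls: Box<dyn Fn(&mut Output, usize)>,
}

impl<'a> Copier<'a> {
    fn new(inputs: Vec<&'a [i32]>) -> Self {
        let extend_values: Vec<ExtendFn<'a>> = inputs
            .into_iter()
            .map(|input| {
                Box::new(move |out: &mut Output, start: usize, len: usize| {
                    out.values.extend_from_slice(&input[start..start + len]);
                    out.validity.extend(std::iter::repeat(true).take(len));
                }) as ExtendFn<'a>
            })
            .collect();
        let extend_nulls: Box<dyn Fn(&mut Output, usize)> =
            Box::new(|out: &mut Output, len: usize| {
                // Null slots get a placeholder value and a cleared validity flag.
                out.values.extend(std::iter::repeat(0).take(len));
                out.validity.extend(std::iter::repeat(false).take(len));
            });
        Copier {
            out: Output { values: Vec::new(), validity: Vec::new() },
            extend_values,
            extend_nulls,
        }
    }

    /// Copies slots start..end (end exclusive) from input number `index`.
    fn extend(&mut self, index: usize, start: usize, end: usize) {
        // The closure field is borrowed immutably and `out` mutably; this
        // compiles only because they are separate fields.
        (self.extend_values[index])(&mut self.out, start, end - start);
    }

    fn extend_nulls(&mut self, len: usize) {
        (self.extend_nulls)(&mut self.out, len);
    }

    fn freeze(self) -> (Vec<i32>, Vec<bool>) {
        (self.out.values, self.out.validity)
    }
}
```

If the closures took the whole Copier mutably, calling one would require borrowing the closure and its owner at the same time, which the borrow checker rejects. Passing only the Output field avoids that.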
Implementations§
impl<'a> MutableArrayData<'a>
pub fn new(arrays: Vec<&'a ArrayData>, use_nulls: bool, capacity: usize) -> Self
Returns a new MutableArrayData preallocated for capacity slots and
specialized to create an ArrayData from multiple arrays.
§Arguments
- arrays - the source arrays to copy from
- use_nulls - a flag used to optimize insertions
  - false if the only source of nulls is the arrays themselves
  - true if the user plans to call MutableArrayData::extend_nulls
- capacity - the preallocated capacity of the output array, in slots (elements)
Thus, if use_nulls is false, MutableArrayData::extend_nulls should not be called.
pub fn with_capacities(arrays: Vec<&'a ArrayData>, use_nulls: bool, capacities: Capacities) -> Self
Similar to MutableArrayData::new, but lets users define the preallocated capacities of the array with more granularity.
See MutableArrayData::new for more information on the arguments.
§Panics
This function panics if the given capacities don’t match the data type
of arrays, or when a Capacities variant is not yet supported.
pub fn extend(&mut self, index: usize, start: usize, end: usize)
Extends the in progress array with a region of one of the input arrays.
§Arguments
- index - the index of the array that you want to copy values from
- start - the start index of the chunk (inclusive)
- end - the end index of the chunk (exclusive)
§Panics
This function panics if there is an invalid index,
i.e. if index >= the number of source arrays,
or if end > the length of the index-th array.
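Because extend takes a half-open range (start inclusive, end exclusive), a filter-style operation can be expressed as one extend call per run of consecutive selected slots. The helper below is a std-only sketch of that idea; selected_ranges is a hypothetical name, and arrow's own filter kernel may compute its ranges differently.

```rust
/// Hypothetical helper: turns a boolean selection mask into half-open
/// (start, end) ranges of consecutive selected slots.
fn selected_ranges(mask: &[bool]) -> Vec<(usize, usize)> {
    let mut ranges = Vec::new();
    let mut run_start: Option<usize> = None;
    for (i, &keep) in mask.iter().enumerate() {
        match (keep, run_start) {
            // A new run of selected slots begins here.
            (true, None) => run_start = Some(i),
            // The current run ends just before this slot.
            (false, Some(start)) => {
                ranges.push((start, i));
                run_start = None;
            }
            // Either inside a run or between runs: nothing to do.
            _ => {}
        }
    }
    // Close a run that reaches the end of the mask.
    if let Some(start) = run_start {
        ranges.push((start, mask.len()));
    }
    ranges
}
```

Each returned pair can be passed straight to extend as (start, end) for a single source array, so long runs of selected values are copied in one call instead of slot by slot.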
pub fn extend_nulls(&mut self, len: usize)
Extends the in progress array with null elements, ignoring the input arrays.
§Panics
Panics if the MutableArrayData was created with neither use_nulls set
nor nullable source arrays.
pub fn null_count(&self) -> usize
Returns the current null count.
pub fn freeze(self) -> ArrayData
Creates an ArrayData from the in progress array, consuming self.
pub fn into_builder(self) -> ArrayDataBuilder
Consumes self and returns the in progress array as an ArrayDataBuilder.
This is useful for extending the default behavior of MutableArrayData.