pub struct StructArray {
    len: usize,
    data_type: DataType,
    nulls: Option<NullBuffer>,
    fields: Vec<ArrayRef>,
}
An array of structs
Each child (called field) is represented by a separate array.
§Comparison with RecordBatch
Both RecordBatch and StructArray represent a collection of columns / arrays with the same length.
However, there are a couple of key differences:
StructArray can be nested within another Array, including itself
RecordBatch can contain top-level metadata on its associated Schema
StructArray can contain top-level nulls, i.e. null
RecordBatch can only represent nulls in its child columns, i.e. {"field": null}
StructArray is therefore a more general data container than RecordBatch, and as such code that needs to handle both will typically share an implementation in terms of StructArray and convert to/from RecordBatch as necessary.
From implementations are provided to facilitate this conversion; however, converting from a StructArray containing top-level nulls to a RecordBatch will panic, as there is no way to preserve them.
§Example: Create an array from a vector of fields
use std::sync::Arc;
use arrow_array::{Array, ArrayRef, BooleanArray, Int32Array, StructArray};
use arrow_schema::{DataType, Field};
let boolean = Arc::new(BooleanArray::from(vec![false, false, true, true]));
let int = Arc::new(Int32Array::from(vec![42, 28, 19, 31]));
let struct_array = StructArray::from(vec![
    (
        Arc::new(Field::new("b", DataType::Boolean, false)),
        boolean.clone() as ArrayRef,
    ),
    (
        Arc::new(Field::new("c", DataType::Int32, false)),
        int.clone() as ArrayRef,
    ),
]);
assert_eq!(struct_array.column(0).as_ref(), boolean.as_ref());
assert_eq!(struct_array.column(1).as_ref(), int.as_ref());
assert_eq!(4, struct_array.len());
assert_eq!(0, struct_array.null_count());
assert_eq!(0, struct_array.offset());
Fields§
§len: usize
§data_type: DataType
§nulls: Option<NullBuffer>
§fields: Vec<ArrayRef>
Implementations§
impl StructArray
pub fn new(
    fields: Fields,
    arrays: Vec<ArrayRef>,
    nulls: Option<NullBuffer>,
) -> Self
Create a new StructArray from the provided parts, panicking on failure
§Panics
Panics if Self::try_new returns an error
pub fn try_new(
    fields: Fields,
    arrays: Vec<ArrayRef>,
    nulls: Option<NullBuffer>,
) -> Result<Self, ArrowError>
Create a new StructArray from the provided parts, returning an error on failure
§Errors
Errors if
fields.len() != arrays.len()
fields[i].data_type() != arrays[i].data_type()
arrays[i].len() != arrays[j].len()
arrays[i].len() != nulls.len()
!fields[i].is_nullable() && !nulls.contains(arrays[i].nulls())
pub fn new_null(fields: Fields, len: usize) -> Self
Create a new StructArray of length len where all values are null
pub unsafe fn new_unchecked(
    fields: Fields,
    arrays: Vec<ArrayRef>,
    nulls: Option<NullBuffer>,
) -> Self
Create a new StructArray from the provided parts without validation
§Safety
Safe if Self::new would not panic with the given arguments
pub fn new_empty_fields(len: usize, nulls: Option<NullBuffer>) -> Self
Create a new StructArray containing no fields
pub fn into_parts(self) -> (Fields, Vec<ArrayRef>, Option<NullBuffer>)
Deconstruct this array into its constituent parts
pub fn num_columns(&self) -> usize
Return the number of fields in this struct array
pub fn column_names(&self) -> Vec<&str>
Return field names in this struct array
pub fn fields(&self) -> &Fields
Returns the Fields of this StructArray
pub fn column_by_name(&self, column_name: &str) -> Option<&ArrayRef>
Return the child array whose field name equals column_name
Note: A schema can currently have duplicate field names, in which case the first field will always be selected. This issue will be addressed in ARROW-11178
Trait Implementations§
impl Array for StructArray
fn data_type(&self) -> &DataType
Returns a reference to the DataType of this array.
fn slice(&self, offset: usize, length: usize) -> ArrayRef
fn shrink_to_fit(&mut self)
fn offset(&self) -> usize
Returns the offset into the underlying data used by this array(-slice). This defaults to 0.
fn nulls(&self) -> Option<&NullBuffer>
fn logical_null_count(&self) -> usize
fn get_buffer_memory_size(&self) -> usize
fn get_array_memory_size(&self) -> usize
Returns the total number of bytes of memory occupied physically by this array. This value will always be greater than returned by get_buffer_memory_size() and includes the overhead of the data structures that contain the pointers to the various buffers.
fn logical_nulls(&self) -> Option<NullBuffer>
Returns a potentially computed NullBuffer that represents the logical null values of this array, if any.
fn null_count(&self) -> usize
fn is_nullable(&self) -> bool
Returns false if the array is guaranteed to not contain any logical nulls
impl Clone for StructArray
fn clone(&self) -> StructArray
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for StructArray
impl From<&StructArray> for RecordBatch
fn from(struct_array: &StructArray) -> Self
impl From<ArrayData> for StructArray
impl From<RecordBatch> for StructArray
fn from(value: RecordBatch) -> Self
impl From<StructArray> for ArrayData
fn from(array: StructArray) -> Self
impl From<StructArray> for RecordBatch
fn from(value: StructArray) -> Self
impl Index<&str> for StructArray
fn index(&self, name: &str) -> &Self::Output
Get a reference to a column’s array by name.
Note: A schema can currently have duplicate field names, in which case the first field will always be selected. This issue will be addressed in ARROW-11178
§Panics
Panics if the name is not in the schema.