Arrays¶

class
Array
¶ Array base type.
Immutable data array with some logical type and some length.
Any memory is owned by the respective Buffer instance (or its parents).
The base class is only required to have a null bitmap buffer if the null count is greater than 0.
If known, the null count can be provided in the base Array constructor. If the null count is not known, pass -1 to indicate that the null count is to be computed on the first call to null_count().
Subclassed by arrow::DictionaryArray, arrow::ExtensionArray, arrow::FixedSizeListArray, arrow::FlatArray, arrow::ListArray, arrow::StructArray, arrow::UnionArray
Public Functions

bool
IsNull
(int64_t i) const¶ Return true if value at index is null. Does not bounds-check.

bool
IsValid
(int64_t i) const¶ Return true if value at index is valid (not null).
Does not bounds-check.

int64_t
length
() const¶ Size in the number of elements this array contains.

int64_t
offset
() const¶ A relative position into another array’s data, to enable zero-copy slicing.
This value defaults to zero.

int64_t
null_count
() const¶ The number of null entries in the array.
If the null count was not known at time of construction (and set to a negative value), then the null count will be computed and cached on the first invocation of this function.

std::shared_ptr<Buffer>
null_bitmap
() const¶ Buffer for the null bitmap.
Note that for null_count == 0, this can be null. This buffer does not account for any slice offset.

const uint8_t *
null_bitmap_data
() const¶ Raw pointer to the null bitmap.
Note that for null_count == 0, this can be null. This buffer does not account for any slice offset.
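The interplay between the null bitmap, the slice offset, and the lazily computed null count can be illustrated with a small sketch. It is plain C++ and does not use Arrow's classes: `SlicedValidity` and `BitIsSet` are hypothetical names, and the bit order (one validity bit per slot, least-significant bit first, set bit meaning "valid") is the layout assumed here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an Arrow function): test bit i of a bitmap that
// stores one validity bit per slot, least-significant bit first.
inline bool BitIsSet(const uint8_t* bitmap, int64_t i) {
  assert(i >= 0);
  return ((bitmap[i / 8] >> (i % 8)) & 1) != 0;
}

// Hypothetical stand-in for a sliced array. The bitmap pointer refers to the
// start of the parent buffer, so the slice offset is added on every lookup.
struct SlicedValidity {
  const uint8_t* bitmap;              // may be nullptr when there are no nulls
  int64_t offset;                     // position of this slice in the parent
  int64_t length;                     // number of slots in this slice
  mutable int64_t cached_null_count;  // negative means "not computed yet"

  bool IsValid(int64_t i) const {
    return bitmap == nullptr || BitIsSet(bitmap, offset + i);
  }

  bool IsNull(int64_t i) const { return !IsValid(i); }

  // Count nulls on the first call, then reuse the cached value.
  int64_t null_count() const {
    if (cached_null_count < 0) {
      int64_t nulls = 0;
      for (int64_t i = 0; i < length; ++i) {
        if (IsNull(i)) ++nulls;
      }
      cached_null_count = nulls;
    }
    return cached_null_count;
  }
};
```

A caller that reads the raw bitmap directly has to add the offset itself, exactly as `IsValid` does in this sketch.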

bool
Equals
(const Array &arr, const EqualOptions& = EqualOptions::Defaults()) const¶ Equality comparison with another array.

Approximate equality comparison with another array.
epsilon is only used if this is FloatArray or DoubleArray.

bool
RangeEquals
(int64_t start_idx, int64_t end_idx, int64_t other_start_idx, const Array &other) const¶ Compare if the range of slots specified are equal for the given array and this array.
end_idx is exclusive. This method does not bounds-check.

Construct a zero-copy view of this array with the given type.
This method checks if the types are layout-compatible. Nested types are traversed in depth-first order. Data buffers must have the same item sizes, even though the logical types may be different. An error is returned if the types are not layout-compatible.

std::shared_ptr<Array>
Slice
(int64_t offset, int64_t length) const¶ Construct a zero-copy slice of the array with the indicated offset and length.
 Return
a new object wrapped in std::shared_ptr<Array>
 Parameters
[in] offset: the position of the first element in the constructed slice
[in] length: the length of the slice. If there are not enough elements in the array, the length will be adjusted accordingly
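The length adjustment described for Slice can be sketched as simple arithmetic on offsets and lengths. This is an illustration only: `SliceRange` and `ComputeSlice` are hypothetical names, not part of Arrow, and the sketch assumes the requested offset lies within the parent array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type: where the slice starts in the underlying data
// and how many elements it covers.
struct SliceRange {
  int64_t offset;
  int64_t length;
};

// Hypothetical helper: compute the range of a slice taken from an array that
// itself starts at parent_offset and holds parent_length elements. No data is
// copied; only the offset and length change.
inline SliceRange ComputeSlice(int64_t parent_offset, int64_t parent_length,
                               int64_t offset, int64_t length) {
  assert(offset >= 0 && offset <= parent_length);
  // Clamp the requested length to the elements that actually remain.
  const int64_t clamped = std::min(parent_length - offset, length);
  return SliceRange{parent_offset + offset, clamped};
}
```

For a 10-element array, asking for 5 elements starting at position 7 yields a slice of length 3 under these assumptions.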

std::string
ToString
() const¶  Return
PrettyPrint representation of array suitable for debugging

Concrete array subclasses¶

class
DictionaryArray
: public arrow::Array¶ Array type for dictionary-encoded data with a data-dependent dictionary.
A dictionary array contains an array of non-negative integers (the “dictionary indices”) along with a data type containing a “dictionary” corresponding to the distinct values represented in the data.
For example, the array
[“foo”, “bar”, “foo”, “bar”, “foo”, “bar”]
with dictionary [“bar”, “foo”], would have dictionary array representation
indices: [1, 0, 1, 0, 1, 0] dictionary: [“bar”, “foo”]
The indices in principle may have any integer type (signed or unsigned), though presently data in IPC exchanges must be signed int32.
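The example above can be reproduced with a short decoding sketch. It uses plain standard-library containers rather than Arrow arrays, and `DecodeDictionary` is a hypothetical helper introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: expand dictionary indices back into the values they
// stand for. Each index selects one entry of the dictionary.
inline std::vector<std::string> DecodeDictionary(
    const std::vector<int32_t>& indices,
    const std::vector<std::string>& dictionary) {
  std::vector<std::string> out;
  out.reserve(indices.size());
  for (int32_t idx : indices) {
    assert(idx >= 0);
    // at() throws std::out_of_range if the index exceeds the dictionary.
    out.push_back(dictionary.at(static_cast<size_t>(idx)));
  }
  return out;
}
```

Decoding indices [1, 0, 1, 0, 1, 0] against the dictionary ["bar", "foo"] gives back the original six strings.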
Public Functions
Transpose this DictionaryArray.
This method constructs a new dictionary array with the given dictionary type, transposing indices using the transpose map. The type and the transpose map are typically computed using DictionaryType::Unify.
 Parameters
[in] pool: a pool to allocate the array data from
[in] type: the new type object
[in] dictionary: the new dictionary
[in] transpose_map: a vector transposing this array’s indices into the target array’s indices
[out] out: the resulting DictionaryArray instance
Public Static Functions
Construct DictionaryArray from dictionary and indices array and validate.
This function does the validation of the indices and input type. It checks if all indices are non-negative and smaller than the size of the dictionary.
 Parameters
[in] type: a dictionary type
[in] dictionary: the dictionary with same value type as the type object
[in] indices: an array of non-negative signed integers smaller than the size of the dictionary
[out] out: the resulting DictionaryArray instance
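The index check described above amounts to a range test on every index. The following sketch shows that test in isolation; `IndicesAreValid` is a hypothetical name and the function is not part of Arrow.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if every index is non-negative and smaller than
// the number of dictionary entries.
inline bool IndicesAreValid(const std::vector<int32_t>& indices,
                            int64_t dictionary_length) {
  return std::all_of(indices.begin(), indices.end(),
                     [dictionary_length](int32_t i) {
                       return i >= 0 && i < dictionary_length;
                     });
}
```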
Non-nested¶

class
FlatArray
: public arrow::Array¶ Base class for non-nested arrays.
Subclassed by arrow::BinaryArray, arrow::NullArray, arrow::PrimitiveArray

class
BinaryArray
: public arrow::FlatArray¶ Concrete Array class for variable-size binary data.
Subclassed by arrow::StringArray
Public Functions

const uint8_t *
GetValue
(int64_t i, int32_t *out_length) const¶ Return the pointer to the given element's bytes.

util::string_view
GetView
(int64_t i) const¶ Get binary value as a string_view.
 Return
the view over the selected value
 Parameters
i: the value index

std::string
GetString
(int64_t i) const¶ Get binary value as a std::string.
 Return
the value copied into a std::string
 Parameters
i: the value index
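The relationship between GetValue and GetString can be sketched with a toy variable-size binary column. This is not Arrow code: `BinaryColumn` is a hypothetical struct, and it assumes a layout of n + 1 int32 offsets into one contiguous byte buffer, so that the length of element i is the difference between two consecutive offsets.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a variable-size binary array.
struct BinaryColumn {
  std::vector<int32_t> offsets;  // n + 1 entries for n values
  std::string data;              // all values concatenated

  // Return a pointer to the bytes of element i and report their length.
  const uint8_t* GetValue(int64_t i, int32_t* out_length) const {
    assert(i >= 0 && static_cast<size_t>(i) + 1 < offsets.size());
    const int32_t begin = offsets[static_cast<size_t>(i)];
    *out_length = offsets[static_cast<size_t>(i) + 1] - begin;
    return reinterpret_cast<const uint8_t*>(data.data()) + begin;
  }

  // Copy the bytes of element i into a std::string.
  std::string GetString(int64_t i) const {
    int32_t length = 0;
    const uint8_t* bytes = GetValue(i, &length);
    return std::string(reinterpret_cast<const char*>(bytes),
                       static_cast<size_t>(length));
  }
};
```

The pointer form avoids a copy, while the std::string form owns its bytes and stays valid after the column is gone.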

class
StringArray
: public arrow::BinaryArray¶ Concrete Array class for variable-size string (utf-8) data.

class
PrimitiveArray
: public arrow::FlatArray¶ Base class for arrays of fixed-size logical types.
Subclassed by arrow::BooleanArray, arrow::DayTimeIntervalArray, arrow::FixedSizeBinaryArray, arrow::NumericArray< TYPE >

class
BooleanArray
: public arrow::PrimitiveArray¶ Concrete Array class for boolean data.

class
FixedSizeBinaryArray
: public arrow::PrimitiveArray¶ Concrete Array class for fixed-size binary data.
Subclassed by arrow::Decimal128Array

class
Decimal128Array
: public arrow::FixedSizeBinaryArray¶ Concrete Array class for 128-bit decimal data.
Public Functions
Construct Decimal128Array from ArrayData instance.

template<typename TYPE>
class
NumericArray
: public arrow::PrimitiveArray¶ Concrete Array class for numeric data.
Nested¶

class
UnionArray
: public arrow::Array¶ Concrete Array class for union data.
Public Static Functions
Construct Dense UnionArray from type_ids, value_offsets and children.
This function does the bare minimum of validation of the offsets and input types. The value_offsets are assumed to be well-formed.
 Parameters
[in] type_ids: An array of 8-bit signed integers, enumerated from 0 corresponding to each type.
[in] value_offsets: An array of signed int32 values indicating the relative offset into the respective child array for the type in a given slot. The respective offsets for each child value array must be in order / increasing.
[in] children: Vector of children Arrays containing the data for each type.
[in] field_names: Vector of strings containing the name of each field.
[in] type_codes: Vector of type codes.
[out] out: Will have length equal to value_offsets.length()

Construct Dense UnionArray from type_ids, value_offsets and children.
This function does the bare minimum of validation of the offsets and input types. The value_offsets are assumed to be well-formed.
 Parameters
[in] type_ids: An array of 8-bit signed integers, enumerated from 0 corresponding to each type.
[in] value_offsets: An array of signed int32 values indicating the relative offset into the respective child array for the type in a given slot. The respective offsets for each child value array must be in order / increasing.
[in] children: Vector of children Arrays containing the data for each type.
[in] field_names: Vector of strings containing the name of each field.
[out] out: Will have length equal to value_offsets.length()

Construct Dense UnionArray from type_ids, value_offsets and children.
This function does the bare minimum of validation of the offsets and input types. The value_offsets are assumed to be well-formed.
 Parameters
[in] type_ids: An array of 8-bit signed integers, enumerated from 0 corresponding to each type.
[in] value_offsets: An array of signed int32 values indicating the relative offset into the respective child array for the type in a given slot. The respective offsets for each child value array must be in order / increasing.
[in] children: Vector of children Arrays containing the data for each type.
[in] type_codes: Vector of type codes.
[out] out: Will have length equal to value_offsets.length()

Construct Dense UnionArray from type_ids, value_offsets and children.
This function does the bare minimum of validation of the offsets and input types. The value_offsets are assumed to be well-formed.
The name of each field is filled by the index of the field.
 Parameters
[in] type_ids: An array of 8-bit signed integers, enumerated from 0 corresponding to each type.
[in] value_offsets: An array of signed int32 values indicating the relative offset into the respective child array for the type in a given slot. The respective offsets for each child value array must be in order / increasing.
[in] children: Vector of children Arrays containing the data for each type.
[out] out: Will have length equal to value_offsets.length()
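The roles of type_ids and value_offsets in a dense union can be illustrated with a toy two-child union. The sketch is independent of Arrow: `DenseUnion` is a hypothetical struct, the children are fixed to one integer vector and one string vector, and type id 0 is assumed to select the integer child.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical dense union over two children. Each child stores only the
// values of its own type, so value_offsets says where a slot's value sits
// inside the selected child.
struct DenseUnion {
  std::vector<int8_t> type_ids;        // one entry per slot: 0 or 1
  std::vector<int32_t> value_offsets;  // one entry per slot
  std::vector<int32_t> ints;           // child 0
  std::vector<std::string> strings;    // child 1

  std::string SlotToString(size_t i) const {
    const int32_t pos = value_offsets.at(i);
    assert(pos >= 0);
    return type_ids.at(i) == 0
               ? std::to_string(ints.at(static_cast<size_t>(pos)))
               : strings.at(static_cast<size_t>(pos));
  }
};
```

The logical sequence [1, "a", 2, "b"] would be stored as type_ids [0, 1, 0, 1], value_offsets [0, 0, 1, 1], ints [1, 2] and strings ["a", "b"]; within each child the offsets are increasing, as the parameter description requires.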
Construct Sparse UnionArray from type_ids and children.
This function does the bare minimum of validation of the offsets and input types.
 Parameters
[in] type_ids: An array of 8-bit signed integers, enumerated from 0 corresponding to each type.
[in] children: Vector of children Arrays containing the data for each type.
[in] field_names: Vector of strings containing the name of each field.
[in] type_codes: Vector of type codes.
[out] out: Will have length equal to type_ids.length()

Construct Sparse UnionArray from type_ids and children.
This function does the bare minimum of validation of the offsets and input types.
 Parameters
[in] type_ids: An array of 8-bit signed integers, enumerated from 0 corresponding to each type.
[in] children: Vector of children Arrays containing the data for each type.
[in] field_names: Vector of strings containing the name of each field.
[out] out: Will have length equal to type_ids.length()

Construct Sparse UnionArray from type_ids and children.
This function does the bare minimum of validation of the offsets and input types.
 Parameters
[in] type_ids: An array of 8-bit signed integers, enumerated from 0 corresponding to each type.
[in] children: Vector of children Arrays containing the data for each type.
[in] type_codes: Vector of type codes.
[out] out: Will have length equal to type_ids.length()

Construct Sparse UnionArray from type_ids and children.
This function does the bare minimum of validation of the offsets and input types.
The name of each field is filled by the index of the field.
 Parameters
[in] type_ids: An array of 8-bit signed integers, enumerated from 0 corresponding to each type.
[in] children: Vector of children Arrays containing the data for each type.
[out] out: Will have length equal to type_ids.length()
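A sparse union needs no offsets because, in the layout assumed for this sketch, every child has the same length as the union and slot i is read from position i of the selected child. `SparseUnion` is a hypothetical struct written with standard containers only, with type id 0 assumed to select the integer child.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sparse union over two children. Both children have one entry
// per slot; entries belonging to the "other" type are unused placeholders.
struct SparseUnion {
  std::vector<int8_t> type_ids;      // one entry per slot: 0 or 1
  std::vector<int32_t> ints;         // child 0, same length as type_ids
  std::vector<std::string> strings;  // child 1, same length as type_ids

  std::string SlotToString(size_t i) const {
    return type_ids.at(i) == 0 ? std::to_string(ints.at(i)) : strings.at(i);
  }
};
```

The trade-off against the dense form is space: placeholders are stored for every slot of every child, in exchange for not keeping an offsets array.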

class
ListArray
: public arrow::Array¶ Concrete Array class for list data.
Subclassed by arrow::MapArray
Public Functions

std::shared_ptr<Buffer>
value_offsets
() const¶ Note that this buffer does not account for any slice offset.

const int32_t *
raw_value_offsets
() const¶ Return pointer to raw value offsets accounting for any slice offset.
Public Static Functions
Construct ListArray from array of offsets and child value array.
This function does the bare minimum of validation of the offsets and input types, and will allocate a new offsets array if necessary (i.e. if the offsets contain any nulls). If the offsets do not have nulls, they are assumed to be well-formed.
 Parameters
[in] offsets: Array containing n + 1 offsets encoding length and size. Must be of int32 type
[in] values: Array containing
[in] pool: MemoryPool in case new offsets array needs to be allocated because of null values
[out] out: Will have length equal to offsets.length() - 1

class
StructArray
: public arrow::Array¶ Concrete Array class for struct data.
Public Functions

std::shared_ptr<Array>
GetFieldByName
(const std::string &name) const¶ Returns null if name not found.

Status
Flatten
(MemoryPool *pool, ArrayVector *out) const¶ Flatten this array as a vector of arrays, one for each field.
 Parameters
[in] pool: The pool to allocate null bitmaps from, if necessary
[out] out: The resulting vector of arrays
Public Static Functions
Return a StructArray from child arrays and field names.
The length and data type are automatically inferred from the arguments. There should be at least one child array.

Chunked Arrays¶

class
ChunkedArray
¶ A data structure managing a list of primitive Arrow arrays logically as one large array.
Public Functions

ChunkedArray
(const ArrayVector &chunks)¶ Construct a chunked array from a vector of arrays.
The vector should be non-empty and all its elements should have the same data type.

Construct a chunked array from a single Array.

Construct a chunked array from a vector of arrays and a data type.
As the data type is passed explicitly, the vector may be empty.

int64_t
length
() const¶  Return
the total length of the chunked array; computed on construction

int64_t
null_count
() const¶  Return
the total number of nulls among all chunks

std::shared_ptr<ChunkedArray>
Slice
(int64_t offset, int64_t length) const¶ Construct a zero-copy slice of the chunked array with the indicated offset and length.
 Return
a new object wrapped in std::shared_ptr<ChunkedArray>
 Parameters
[in] offset: the position of the first element in the constructed slice
[in] length: the length of the slice. If there are not enough elements in the chunked array, the length will be adjusted accordingly
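Slicing across chunk boundaries can be sketched as a walk over the chunks that skips whole chunks before the offset and trims the first and last chunk that overlap the requested range. This is an illustration with hypothetical names (`Chunk`, `SliceChunks`); unlike the zero-copy operation documented here, the sketch copies the selected values to stay short.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Chunk = std::vector<int32_t>;  // hypothetical stand-in for one array

// Hypothetical helper: select `length` elements starting at logical position
// `offset`, keeping the chunk structure of the input.
inline std::vector<Chunk> SliceChunks(const std::vector<Chunk>& chunks,
                                      int64_t offset, int64_t length) {
  std::vector<Chunk> out;
  for (const Chunk& chunk : chunks) {
    if (length <= 0) break;
    const int64_t n = static_cast<int64_t>(chunk.size());
    if (offset >= n) {
      offset -= n;  // the slice starts after this chunk
      continue;
    }
    // Take what this chunk can contribute, clamped to what is still needed.
    const int64_t take = std::min(n - offset, length);
    out.emplace_back(chunk.begin() + offset, chunk.begin() + offset + take);
    length -= take;
    offset = 0;  // later chunks are read from their beginning
  }
  return out;
}
```

A request that runs past the end simply stops at the last chunk, which mirrors the length adjustment described in the parameter list.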

std::shared_ptr<ChunkedArray>
Slice
(int64_t offset) const¶ Slice from offset until end of the chunked array.

Flatten this chunked array as a vector of chunked arrays, one for each struct field.
 Parameters
[in] pool: The pool for buffer allocations, if any
[out] out: The resulting vector of arrays

bool
Equals
(const ChunkedArray &other) const¶ Determine if two chunked arrays are equal.
Two chunked arrays can be equal only if they have equal datatypes. However, they may be equal even if they have different chunkings.
Determine if two chunked arrays are equal.
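Equality that ignores chunk boundaries can be sketched by walking both chunk lists with independent cursors and comparing element by element. The names `Chunk` and `ChunkedEquals` are hypothetical, the element type is fixed to int32_t, and the data type check mentioned above is assumed to have passed already.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Chunk = std::vector<int32_t>;  // hypothetical stand-in for one array

// Hypothetical helper: true if both chunk lists hold the same logical
// sequence of values, regardless of where the chunk boundaries fall.
inline bool ChunkedEquals(const std::vector<Chunk>& a,
                          const std::vector<Chunk>& b) {
  size_t ai = 0, aj = 0;  // chunk index and position within chunk for a
  size_t bi = 0, bj = 0;  // the same for b
  while (true) {
    // Skip exhausted (or empty) chunks on both sides.
    while (ai < a.size() && aj == a[ai].size()) { ++ai; aj = 0; }
    while (bi < b.size() && bj == b[bi].size()) { ++bi; bj = 0; }
    const bool a_done = (ai == a.size());
    const bool b_done = (bi == b.size());
    if (a_done || b_done) return a_done && b_done;
    if (a[ai][aj] != b[bi][bj]) return false;
    ++aj;
    ++bj;
  }
}
```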
