Array Builders#

class ArrayBuilder#

Base class for all data array builders.

This class provides facilities for incrementally building the null bitmap (see the Append methods) and, as a side effect, for tracking the current number of slots and the null count.
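As a rough illustration of the bookkeeping described above, the following standalone sketch models a builder that packs one validity bit per slot and updates the slot count and null count as it goes. It uses only the standard library and is not Arrow's implementation; `ValidityTracker` and its members are hypothetical names chosen for this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of null-bitmap bookkeeping; not Arrow's actual code.
struct ValidityTracker {
  std::vector<uint8_t> bitmap;  // one bit per slot, least-significant bit first
  int64_t length = 0;           // number of slots appended so far
  int64_t null_count = 0;       // number of slots whose bit is cleared

  // Record one slot; a cleared bit marks the slot as null.
  void AppendToBitmap(bool is_valid) {
    if (length % 8 == 0) bitmap.push_back(0);  // start a new byte
    if (is_valid) {
      bitmap[length / 8] |= static_cast<uint8_t>(1u << (length % 8));
    } else {
      ++null_count;
    }
    ++length;
  }

  // Report whether slot i was appended as valid.
  bool IsValid(int64_t i) const { return (bitmap[i / 8] >> (i % 8)) & 1; }
};
```

Appending valid, null, valid to a fresh `ValidityTracker` leaves `length == 3`, `null_count == 1`, and a single bitmap byte equal to `0x05`.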

Note

Users are expected to use builders as one of the concrete types below. For example, ArrayBuilder* pointing to BinaryBuilder should be downcast before use.

Subclassed by arrow::BaseBinaryBuilder< LargeBinaryType >, arrow::BaseBinaryBuilder< BinaryType >, arrow::BaseListBuilder< LargeListType >, arrow::BaseListBuilder< ListType >, arrow::NumericBuilder< DayTimeIntervalType >, arrow::NumericBuilder< MonthDayNanoIntervalType >, arrow::internal::DictionaryBuilderBase< Int32Builder, T >, arrow::internal::DictionaryBuilderBase< AdaptiveIntBuilder, T >, arrow::BaseBinaryBuilder< TYPE >, arrow::BaseListBuilder< TYPE >, arrow::BasicUnionBuilder, arrow::BooleanBuilder, arrow::FixedSizeBinaryBuilder, arrow::FixedSizeListBuilder, arrow::MapBuilder, arrow::NullBuilder, arrow::NumericBuilder< T >, arrow::RunEndEncodedBuilder, arrow::StructBuilder, arrow::internal::AdaptiveIntBuilderBase, arrow::internal::DictionaryBuilderBase< BuilderType, T >, arrow::internal::DictionaryBuilderBase< BuilderType, NullType >, arrow::internal::RunCompressorBuilder

Public Functions

inline ArrayBuilder *child(int i)#

For nested types.

Because the child builders are owned by this class instance, a raw pointer is returned instead of a shared pointer.

virtual Status Resize(int64_t capacity)#

Ensure that enough memory has been allocated to fit the indicated number of total elements in the builder, including any that have already been appended.

Does not account for reallocations that may be due to variable size data, like binary values. To make space for incremental appends, use Reserve instead.

Parameters:

capacity[in] the minimum number of total array values to accommodate. Must be greater than the current capacity.

Returns:

Status

inline Status Reserve(int64_t additional_capacity)#

Ensure that there is enough space allocated to append the indicated number of elements without any further reallocation.

Overallocation is used in order to minimize the impact of incremental Reserve() calls. Note that additional_capacity is relative to the current number of elements rather than to the current capacity, so calls to Reserve() which are not interspersed with addition of new elements may not increase the capacity.

Parameters:

additional_capacity[in] the number of additional array values

Returns:

Status
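The point that additional_capacity is measured from the current length, not from the current capacity, can be seen in a small standalone model. This is only a sketch: `CapacityModel` is a hypothetical type, and the doubling growth policy is an assumption made for illustration rather than a statement about Arrow's allocator.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical capacity model; the doubling policy is an assumption.
struct CapacityModel {
  int64_t length = 0;    // elements appended so far
  int64_t capacity = 0;  // elements that fit without reallocation

  // Make room for `additional` more elements beyond the current length.
  void Reserve(int64_t additional) {
    const int64_t min_capacity = length + additional;
    if (min_capacity <= capacity) return;             // already enough room
    capacity = std::max(capacity * 2, min_capacity);  // overallocate
  }

  // Append one element, assuming space was reserved beforehand.
  void UnsafeAppend() { ++length; }
};
```

In this model, calling `Reserve(10)` twice in a row on an empty builder leaves the capacity at 10, because no elements were added in between. Only after ten appends does `Reserve(1)` grow the capacity (to 20 under the assumed doubling policy).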

virtual void Reset()#

Reset the builder.

virtual Status AppendNull() = 0#

Append a null value to builder.

virtual Status AppendNulls(int64_t length) = 0#

Append a number of null values to builder.

virtual Status AppendEmptyValue() = 0#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.
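To see why an empty value is useful for nested types, consider a parent whose children must always have the same number of slots as the parent itself. The standalone sketch below models that idea with hypothetical `ChildModel` and `StructModel` types (standard library only, not Arrow's classes): appending a null to the parent still appends an initialized placeholder to the child.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical child builder holding 32-bit integers.
struct ChildModel {
  std::vector<int32_t> values;
  std::vector<bool> validity;

  void Append(int32_t v) {
    values.push_back(v);
    validity.push_back(true);
  }

  // Non-null placeholder; the slot is initialized (here to 0).
  void AppendEmptyValue() {
    values.push_back(0);
    validity.push_back(true);
  }
};

// Hypothetical parent with a single child; lengths must stay in step.
struct StructModel {
  ChildModel child;
  std::vector<bool> validity;

  void Append(int32_t v) {
    child.Append(v);
    validity.push_back(true);
  }

  // A null parent slot still needs a child slot, so fill it with a placeholder.
  void AppendNull() {
    child.AppendEmptyValue();
    validity.push_back(false);
  }
};
```

After `Append(7)` followed by `AppendNull()`, both the parent and the child hold two slots. The parent's second slot is null, while the child's second slot is a valid, zero-initialized placeholder.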

virtual Status AppendEmptyValues(int64_t length) = 0#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline Status AppendScalar(const Scalar &scalar)#

Append a value from a scalar.

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length)#

Append a range of values from an array.

The given array must be the same type as the builder.

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) = 0#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

Status Finish(std::shared_ptr<Array> *out)#

Return result of builder as an Array object.

The builder is reset except for DictionaryBuilder.

Parameters:

out[out] the finalized Array object

Returns:

Status

Result<std::shared_ptr<Array>> Finish()#

Return result of builder as an Array object.

The builder is reset except for DictionaryBuilder.

Returns:

The finalized Array object

virtual std::shared_ptr<DataType> type() const = 0#

Return the type of the built Array.

Concrete builder subclasses#

Primitive#

class NullBuilder : public arrow::ArrayBuilder#

Public Functions

inline virtual Status AppendNulls(int64_t length) final#

Append the specified number of null elements.

inline virtual Status AppendNull() final#

Append a single null element.

inline virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

inline virtual Status AppendArraySlice(const ArraySpan&, int64_t, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class BooleanBuilder : public arrow::ArrayBuilder, public arrow::internal::ArrayBuilderExtraOps<BooleanBuilder, bool>#

Public Functions

inline virtual Status AppendNulls(int64_t length) final#

Write nulls as uint8_t* (0 value indicates null) into pre-allocated memory.

inline virtual Status AppendNull() final#

Append a null value to builder.

inline virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

inline virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline Status Append(const bool val)#

Scalar append.

inline void UnsafeAppend(const bool val)#

Scalar append, without checking for capacity.

Status AppendValues(const uint8_t *values, int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a contiguous array of bytes, one per element (any non-zero byte is treated as true)

  • length[in] the number of values to append

  • valid_bytes[in] an optional sequence of bytes where non-zero indicates a valid (non-null) value

Returns:

Status
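The byte-per-element inputs of this overload end up as bit-per-element data. The sketch below shows one way such packing can be done, using only the standard library. `PackBytesToBitmap` is a hypothetical helper written for this illustration and is not part of Arrow.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack a byte-per-element sequence into a bit-per-element bitmap
// (least-significant bit first). Any non-zero byte sets the bit.
// Hypothetical helper, not part of Arrow.
std::vector<uint8_t> PackBytesToBitmap(const uint8_t* bytes, int64_t length) {
  std::vector<uint8_t> bitmap(static_cast<std::size_t>((length + 7) / 8), 0);
  for (int64_t i = 0; i < length; ++i) {
    if (bytes[i] != 0) {
      bitmap[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return bitmap;
}
```

The same helper applies to both arrays: packing `values` gives the boolean data bits, and packing `valid_bytes` gives the validity bits. For example, the nine value bytes `{1, 0, 7, 1, 0, 0, 0, 0, 255}` pack into the two bytes `0x0D, 0x01`.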

Status AppendValues(const uint8_t *values, int64_t length, const uint8_t *validity, int64_t offset)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a bitmap of values

  • length[in] the number of values to append

  • validity[in] a validity bitmap to copy (may be null)

  • offset[in] an offset into the values and validity bitmaps

Returns:

Status
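When the inputs are already bitmaps, the offset parameter selects the first bit to read. The following standalone sketch copies a bit range so that it starts at bit 0 of a new bitmap. `GetBit` and `CopyBitmapSlice` are hypothetical helpers defined here for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read bit i of a least-significant-bit-first bitmap.
inline bool GetBit(const uint8_t* bitmap, int64_t i) {
  return (bitmap[i / 8] >> (i % 8)) & 1;
}

// Copy `length` bits starting at bit `offset` into a new bitmap that
// starts at bit 0. Hypothetical helper, not part of Arrow.
std::vector<uint8_t> CopyBitmapSlice(const uint8_t* bitmap, int64_t offset,
                                     int64_t length) {
  std::vector<uint8_t> out(static_cast<std::size_t>((length + 7) / 8), 0);
  for (int64_t i = 0; i < length; ++i) {
    if (GetBit(bitmap, offset + i)) {
      out[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return out;
}
```

With the source byte `0xB4` (bits 2, 4, 5 and 7 set), an offset of 2 and a length of 4, the result is the single byte `0x0D`.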

Status AppendValues(const uint8_t *values, int64_t length, const std::vector<bool> &is_valid)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a contiguous C array of values

  • length[in] the number of values to append

  • is_valid[in] an std::vector<bool> indicating valid (1) or null (0). Equal in length to values

Returns:

Status

Status AppendValues(const std::vector<uint8_t> &values, const std::vector<bool> &is_valid)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a std::vector of bytes

  • is_valid[in] an std::vector<bool> indicating valid (1) or null (0). Equal in length to values

Returns:

Status

Status AppendValues(const std::vector<uint8_t> &values)#

Append a sequence of elements in one shot.

Parameters:

values[in] a std::vector of bytes

Returns:

Status

Status AppendValues(const std::vector<bool> &values, const std::vector<bool> &is_valid)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] an std::vector<bool> indicating true (1) or false (0)

  • is_valid[in] an std::vector<bool> indicating valid (1) or null (0). Equal in length to values

Returns:

Status

Status AppendValues(const std::vector<bool> &values)#

Append a sequence of elements in one shot.

Parameters:

values[in] an std::vector<bool> indicating true (1) or false (0)

Returns:

Status

template<typename ValuesIter>
inline Status AppendValues(ValuesIter values_begin, ValuesIter values_end)#

Append a sequence of elements in one shot.

Parameters:
  • values_begin[in] InputIterator to the beginning of the values

  • values_end[in] InputIterator pointing to the end of the values

Returns:

Status

template<typename ValuesIter, typename ValidIter>
inline enable_if_t<!std::is_pointer<ValidIter>::value, Status> AppendValues(ValuesIter values_begin, ValuesIter values_end, ValidIter valid_begin)#

Append a sequence of elements in one shot, with a specified nullmap.

Parameters:
  • values_begin[in] InputIterator to the beginning of the values

  • values_end[in] InputIterator pointing to the end of the values

  • valid_begin[in] InputIterator with elements indicating valid (1) or null (0) values

Returns:

Status

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

virtual void Reset() override#

Reset the builder.

virtual Status Resize(int64_t capacity) override#

Ensure that enough memory has been allocated to fit the indicated number of total elements in the builder, including any that have already been appended.

Does not account for reallocations that may be due to variable size data, like binary values. To make space for incremental appends, use Reserve instead.

Parameters:

capacity[in] the minimum number of total array values to accommodate. Must be greater than the current capacity.

Returns:

Status

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

using DecimalBuilder = Decimal128Builder#
using UInt8Builder = NumericBuilder<UInt8Type>#
using UInt16Builder = NumericBuilder<UInt16Type>#
using UInt32Builder = NumericBuilder<UInt32Type>#
using UInt64Builder = NumericBuilder<UInt64Type>#
using Int8Builder = NumericBuilder<Int8Type>#
using Int16Builder = NumericBuilder<Int16Type>#
using Int32Builder = NumericBuilder<Int32Type>#
using Int64Builder = NumericBuilder<Int64Type>#
using HalfFloatBuilder = NumericBuilder<HalfFloatType>#
using FloatBuilder = NumericBuilder<FloatType>#
using DoubleBuilder = NumericBuilder<DoubleType>#

class AdaptiveUIntBuilder : public arrow::internal::AdaptiveIntBuilderBase#
#include <arrow/array/builder_adaptive.h>

Public Functions

inline Status Append(const uint64_t val)#

Scalar append.

Status AppendValues(const uint64_t *values, int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a contiguous C array of values

  • length[in] the number of values to append

  • valid_bytes[in] an optional sequence of bytes where non-zero indicates a valid (non-null) value

Returns:

Status

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class AdaptiveIntBuilder : public arrow::internal::AdaptiveIntBuilderBase#
#include <arrow/array/builder_adaptive.h>

Public Functions

inline Status Append(const int64_t val)#

Scalar append.

Status AppendValues(const int64_t *values, int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a contiguous C array of values

  • length[in] the number of values to append

  • valid_bytes[in] an optional sequence of bytes where non-zero indicates a valid (non-null) value

Returns:

Status

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class Decimal128Builder : public arrow::FixedSizeBinaryBuilder#
#include <arrow/array/builder_decimal.h>

Public Functions

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

virtual void Reset() override#

Reset the builder.

class Decimal256Builder : public arrow::FixedSizeBinaryBuilder#
#include <arrow/array/builder_decimal.h>

Public Functions

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

virtual void Reset() override#

Reset the builder.

template<typename T>
class NumericBuilder : public arrow::ArrayBuilder, public arrow::internal::ArrayBuilderExtraOps<NumericBuilder<T>, T::c_type>#
#include <arrow/array/builder_primitive.h>

Base class for all Builders that emit an Array of a scalar numerical type.

Public Functions

inline Status Append(const value_type val)#

Append a single scalar and increase the size if necessary.

inline virtual Status AppendNulls(int64_t length) final#

Write nulls as uint8_t* (0 value indicates null) into pre-allocated memory. The memory at the corresponding data slot is set to 0 to prevent uninitialized memory access.

inline virtual Status AppendNull() final#

Append a single null element.

inline virtual Status AppendEmptyValue() final#

Append an empty element.

inline virtual Status AppendEmptyValues(int64_t length) final#

Append several empty elements.

inline virtual void Reset() override#

Reset the builder.

inline virtual Status Resize(int64_t capacity) override#

Ensure that enough memory has been allocated to fit the indicated number of total elements in the builder, including any that have already been appended.

Does not account for reallocations that may be due to variable size data, like binary values. To make space for incremental appends, use Reserve instead.

Parameters:

capacity[in] the minimum number of total array values to accommodate. Must be greater than the current capacity.

Returns:

Status

inline Status AppendValues(const value_type *values, int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a contiguous C array of values

  • length[in] the number of values to append

  • valid_bytes[in] an optional sequence of bytes where non-zero indicates a valid (non-null) value

Returns:

Status

inline Status AppendValues(const value_type *values, int64_t length, const uint8_t *bitmap, int64_t bitmap_offset)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a contiguous C array of values

  • length[in] the number of values to append

  • bitmap[in] a validity bitmap to copy (may be null)

  • bitmap_offset[in] an offset into the validity bitmap

Returns:

Status

inline Status AppendValues(const value_type *values, int64_t length, const std::vector<bool> &is_valid)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a contiguous C array of values

  • length[in] the number of values to append

  • is_valid[in] an std::vector<bool> indicating valid (1) or null (0). Equal in length to values

Returns:

Status

inline Status AppendValues(const std::vector<value_type> &values, const std::vector<bool> &is_valid)#

Append a sequence of elements in one shot.

Parameters:
  • values[in] a std::vector of values

  • is_valid[in] an std::vector<bool> indicating valid (1) or null (0). Equal in length to values

Returns:

Status

inline Status AppendValues(const std::vector<value_type> &values)#

Append a sequence of elements in one shot.

Parameters:

values[in] a std::vector of values

Returns:

Status

inline virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

template<typename ValuesIter>
inline Status AppendValues(ValuesIter values_begin, ValuesIter values_end)#

Append a sequence of elements in one shot.

Parameters:
  • values_begin[in] InputIterator to the beginning of the values

  • values_end[in] InputIterator pointing to the end of the values

Returns:

Status

template<typename ValuesIter, typename ValidIter>
inline enable_if_t<!std::is_pointer<ValidIter>::value, Status> AppendValues(ValuesIter values_begin, ValuesIter values_end, ValidIter valid_begin)#

Append a sequence of elements in one shot, with a specified nullmap.

Parameters:
  • values_begin[in] InputIterator to the beginning of the values

  • values_end[in] InputIterator pointing to the end of the values

  • valid_begin[in] InputIterator with elements indicating valid (1) or null (0) values

Returns:

Status

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

inline void UnsafeAppend(const value_type val)#

Append a single scalar under the assumption that the underlying Buffer is large enough.

This method does not capacity-check; make sure to call Reserve beforehand.

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

Temporal#

typedef NumericBuilder<Date32Type> Date32Builder#
typedef NumericBuilder<Date64Type> Date64Builder#
typedef NumericBuilder<Time32Type> Time32Builder#
typedef NumericBuilder<Time64Type> Time64Builder#
typedef NumericBuilder<TimestampType> TimestampBuilder#
typedef NumericBuilder<MonthIntervalType> MonthIntervalBuilder#
typedef NumericBuilder<DurationType> DurationBuilder#

class DayTimeIntervalBuilder : public arrow::NumericBuilder<DayTimeIntervalType>#
#include <arrow/array/builder_time.h>

class MonthDayNanoIntervalBuilder : public arrow::NumericBuilder<MonthDayNanoIntervalType>#
#include <arrow/array/builder_time.h>

Binary-like#

template<typename TYPE>
class BaseBinaryBuilder : public arrow::ArrayBuilder, public arrow::internal::ArrayBuilderExtraOps<BaseBinaryBuilder<TYPE>, std::string_view>#
#include <arrow/array/builder_binary.h>

Public Functions

inline Status ExtendCurrent(const uint8_t *value, offset_type length)#

Extend the last appended value by appending more data at the end.

Unlike Append, this does not create a new offset.
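The difference between appending a value and extending the current one can be modelled with an offsets vector plus a concatenated data buffer. This is a standalone sketch using only the standard library. `BinaryModel` and its `Get` accessor are hypothetical and do not reflect Arrow's internal buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a variable-length binary layout: value i occupies
// data[offsets[i], end), where end is offsets[i + 1] or the end of data.
struct BinaryModel {
  std::vector<int32_t> offsets;  // start offset of each appended value
  std::string data;              // concatenated bytes of all values

  // Start a new value: record its offset, then copy its bytes.
  void Append(const std::string& value) {
    offsets.push_back(static_cast<int32_t>(data.size()));
    data.append(value);
  }

  // Grow the most recently appended value; no new offset is recorded.
  void ExtendCurrent(const std::string& more) {
    assert(!offsets.empty());  // there must be a value to extend
    data.append(more);
  }

  // Return a copy of value i.
  std::string Get(std::size_t i) const {
    const std::size_t begin = static_cast<std::size_t>(offsets[i]);
    const std::size_t end = (i + 1 < offsets.size())
                                ? static_cast<std::size_t>(offsets[i + 1])
                                : data.size();
    return data.substr(begin, end - begin);
  }
};
```

Appending "ab" and "cd" and then extending with "ef" leaves two values in the model, "ab" and "cdef", with only two recorded offsets.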

inline virtual Status AppendNulls(int64_t length) final#

Append a number of null values to builder.

inline virtual Status AppendNull() final#

Append a null value to builder.

inline virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

inline virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline void UnsafeAppend(const uint8_t *value, offset_type length)#

Append without checking capacity.

Offsets and data should have been presized using Reserve() and ReserveData(), respectively.

inline void UnsafeExtendCurrent(const uint8_t *value, offset_type length)#

Like ExtendCurrent, but do not check capacity.

inline Status AppendValues(const std::vector<std::string> &values, const uint8_t *valid_bytes = NULLPTR)#

Append a sequence of strings in one shot.

Parameters:
  • values[in] a vector of strings

  • valid_bytes[in] an optional sequence of bytes where non-zero indicates a valid (non-null) value

Returns:

Status

inline Status AppendValues(const char **values, int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Append a sequence of nul-terminated strings in one shot.

If one of the values is NULL, it is processed as a null value even if the corresponding valid_bytes entry is 1.

Parameters:
  • values[in] a contiguous C array of nul-terminated char *

  • length[in] the number of values to append

  • valid_bytes[in] an optional sequence of bytes where non-zero indicates a valid (non-null) value

Returns:

Status
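The rule that a NULL pointer wins over its valid_bytes entry can be written down as a small predicate. The sketch below is a standalone illustration of that rule only. `ComputeValidity` is a hypothetical helper, not an Arrow function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Arrow: derive per-slot validity for a
// C array of nul-terminated strings and an optional valid_bytes array.
std::vector<bool> ComputeValidity(const char* const* values, int64_t length,
                                  const uint8_t* valid_bytes = nullptr) {
  std::vector<bool> valid(static_cast<std::size_t>(length));
  for (int64_t i = 0; i < length; ++i) {
    const bool flagged_valid = (valid_bytes == nullptr) || (valid_bytes[i] != 0);
    // A NULL pointer always yields a null slot, whatever valid_bytes says.
    valid[static_cast<std::size_t>(i)] = (values[i] != nullptr) && flagged_valid;
  }
  return valid;
}
```

For the inputs `{"a", NULL, "c"}` with valid_bytes `{1, 1, 0}`, only the first slot is valid: the second is null because of the NULL pointer, and the third because of its zero byte.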

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

inline virtual void Reset() override#

Reset the builder.

inline virtual Status Resize(int64_t capacity) override#

Ensure that enough memory has been allocated to fit the indicated number of total elements in the builder, including any that have already been appended.

Does not account for reallocations that may be due to variable size data, like binary values. To make space for incremental appends, use Reserve instead.

Parameters:

capacity[in] the minimum number of total array values to accommodate. Must be greater than the current capacity.

Returns:

Status

inline Status ReserveData(int64_t elements)#

Ensures there is enough allocated capacity to append the indicated number of bytes to the value data buffer without additional allocations.

inline virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

inline const uint8_t *value_data() const#

Returns:

data pointer of the value data builder

inline int64_t value_data_length() const#

Returns:

size of values buffer so far

inline int64_t value_data_capacity() const#

Returns:

capacity of values buffer

inline const offset_type *offsets_data() const#

Returns:

data pointer of the offsets builder

inline const uint8_t *GetValue(int64_t i, offset_type *out_length) const#

Temporary access to a value.

This pointer becomes invalid on the next modifying operation.

inline std::string_view GetView(int64_t i) const#

Temporary access to a value.

This view becomes invalid on the next modifying operation.

class BinaryBuilder : public arrow::BaseBinaryBuilder<BinaryType>#
#include <arrow/array/builder_binary.h>

Builder class for variable-length binary data.

Subclassed by arrow::StringBuilder

Public Functions

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class StringBuilder : public arrow::BinaryBuilder#
#include <arrow/array/builder_binary.h>

Builder class for UTF8 strings.

Public Functions

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class LargeBinaryBuilder : public arrow::BaseBinaryBuilder<LargeBinaryType>#
#include <arrow/array/builder_binary.h>

Builder class for large variable-length binary data.

Subclassed by arrow::LargeStringBuilder

Public Functions

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class LargeStringBuilder : public arrow::LargeBinaryBuilder#
#include <arrow/array/builder_binary.h>

Builder class for large UTF8 strings.

Public Functions

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class FixedSizeBinaryBuilder : public arrow::ArrayBuilder#
#include <arrow/array/builder_binary.h>

Subclassed by arrow::Decimal128Builder, arrow::Decimal256Builder

Public Functions

virtual Status AppendNull() final#

Append a null value to builder.

virtual Status AppendNulls(int64_t length) final#

Append a number of null values to builder.

virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

inline Status ReserveData(int64_t elements)#

Ensures there is enough allocated capacity to append the indicated number of bytes to the value data buffer without additional allocations.

virtual void Reset() override#

Reset the builder.

virtual Status Resize(int64_t capacity) override#

Ensure that enough memory has been allocated to fit the indicated number of total elements in the builder, including any that have already been appended.

Does not account for reallocations that may be due to variable size data, like binary values. To make space for incremental appends, use Reserve instead.

Parameters:

capacity[in] the minimum number of total array values to accommodate. Must be greater than the current capacity.

Returns:

Status

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

inline int64_t value_data_length() const#

Returns:

size of values buffer so far

const uint8_t *GetValue(int64_t i) const#

Temporary access to a value.

This pointer becomes invalid on the next modifying operation.

std::string_view GetView(int64_t i) const#

Temporary access to a value.

This view becomes invalid on the next modifying operation.

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

Nested#

template<typename TYPE>
class BaseListBuilder : public arrow::ArrayBuilder#
#include <arrow/array/builder_nested.h>

Public Functions

inline BaseListBuilder(MemoryPool *pool, std::shared_ptr<ArrayBuilder> const &value_builder, const std::shared_ptr<DataType> &type, int64_t alignment = kDefaultBufferAlignment)#

Use this constructor to incrementally build the value array along with offsets and null bitmap.

inline virtual Status Resize(int64_t capacity) override#

Ensure that enough memory has been allocated to fit the indicated number of total elements in the builder, including any that have already been appended.

Does not account for reallocations that may be due to variable size data, like binary values. To make space for incremental appends, use Reserve instead.

Parameters:

capacity[in] the minimum number of total array values to accommodate. Must be greater than the current capacity.

Returns:

Status

inline virtual void Reset() override#

Reset the builder.

inline Status AppendValues(const offset_type *offsets, int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Vector append.

If passed, valid_bytes must have the same length as offsets, and any zero byte marks the corresponding slot as null.

inline Status Append(bool is_valid = true)#

Start a new variable-length list slot.

This function should be called before beginning to append elements to the value builder.

inline virtual Status AppendNull() final#

Append a null value to builder.

inline virtual Status AppendNulls(int64_t length) final#

Append a number of null values to builder.

inline virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

inline virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

inline virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets the builder, except for dictionary builders.

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class ListBuilder : public arrow::BaseListBuilder<ListType>#
#include <arrow/array/builder_nested.h>

Builder class for variable-length list array value types.

To use this class, you must append values to the child array builder and use the Append function to delimit each distinct list value (once the values have been appended to the child array) or use the bulk API to append a sequence of offsets and null values.

A note on types: per arrow/type.h, all types in the C++ implementation are logical, so even though this class always builds a list array, it can represent multiple different logical types. If no logical type is provided at construction time, the class defaults to List<T>, where T is taken from the value_builder/values that the object is constructed with.
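The way list slots are delimited against the child values can be sketched with a standalone model. `ListModel` and `FinishOffsets` below are hypothetical names, and a plain vector stands in for the child builder. The model opens each slot with `Append` before its elements are pushed, and it is not Arrow's implementation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of list building: Append() opens a new list slot by
// recording where its elements start in the child values.
struct ListModel {
  std::vector<int32_t> offsets;  // start of each list within `values`
  std::vector<bool> validity;    // one flag per list slot
  std::vector<int64_t> values;   // stands in for the child builder

  // Open a new list slot; elements pushed to `values` afterwards belong to it.
  void Append(bool is_valid = true) {
    offsets.push_back(static_cast<int32_t>(values.size()));
    validity.push_back(is_valid);
  }

  // Close the last slot by adding the final offset, so that a list array
  // of n slots ends up with n + 1 offsets.
  std::vector<int32_t> FinishOffsets() const {
    std::vector<int32_t> out = offsets;
    out.push_back(static_cast<int32_t>(values.size()));
    return out;
  }
};
```

Building `[[1, 2], [], null, [3]]` with this model yields the offsets `{0, 2, 2, 2, 3}`: the empty list and the null list both span zero child elements and differ only in their validity flag.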

Public Functions

inline BaseListBuilder(MemoryPool *pool, std::shared_ptr<ArrayBuilder> const &value_builder, const std::shared_ptr<DataType> &type, int64_t alignment = kDefaultBufferAlignment)#

Use this constructor to incrementally build the value array along with offsets and null bitmap.

class LargeListBuilder : public arrow::BaseListBuilder<LargeListType>#
#include <arrow/array/builder_nested.h>

Builder class for large variable-length list array value types.

Like ListBuilder, but to create large list arrays (with 64-bit offsets).

Public Functions

inline BaseListBuilder(MemoryPool *pool, std::shared_ptr<ArrayBuilder> const &value_builder, const std::shared_ptr<DataType> &type, int64_t alignment = kDefaultBufferAlignment)#

Use this constructor to incrementally build the value array along with offsets and null bitmap.

class MapBuilder : public arrow::ArrayBuilder#
#include <arrow/array/builder_nested.h>

Builder class for arrays of variable-size maps.

To use this class, you must append values to the key and item array builders and use the Append function to delimit each distinct map (once the keys and items have been appended) or use the bulk API to append a sequence of offsets and null maps.

Key uniqueness and ordering are not validated.

Public Functions

MapBuilder(MemoryPool *pool, const std::shared_ptr<ArrayBuilder> &key_builder, const std::shared_ptr<ArrayBuilder> &item_builder, const std::shared_ptr<DataType> &type)#

Use this constructor to define the built array’s type explicitly.

If key_builder or item_builder has indeterminate type, this builder will also.

MapBuilder(MemoryPool *pool, const std::shared_ptr<ArrayBuilder> &key_builder, const std::shared_ptr<ArrayBuilder> &item_builder, bool keys_sorted = false)#

Use this constructor to infer the built array’s type.

If key_builder or item_builder has indeterminate type, this builder will also.

virtual Status Resize(int64_t capacity) override#

Ensure that enough memory has been allocated to fit the indicated number of total elements in the builder, including any that have already been appended.

Does not account for reallocations that may be due to variable size data, like binary values. To make space for incremental appends, use Reserve instead.

Parameters:

capacity[in] the minimum number of total array values to accommodate. Must be greater than the current capacity.

Returns:

Status

virtual void Reset() override#

Reset the builder.

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets builder except for dictionary builder

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

Status AppendValues(const int32_t *offsets, int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Vector append.

If passed, valid_bytes must have the same length as offsets (length entries), and any zero byte marks the corresponding slot as null.

Status Append()#

Start a new variable-length map slot.

This function should be called before beginning to append elements to the key and item builders

virtual Status AppendNull() final#

Append a null value to builder.

virtual Status AppendNulls(int64_t length) final#

Append a number of null values to builder.

virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

inline ArrayBuilder *key_builder() const#

Get builder to append keys.

Appending a key with this builder should be followed by appending an item or null value with item_builder().

inline ArrayBuilder *item_builder() const#

Get builder to append items.

Appending an item with this builder should have been preceded by appending a key with key_builder().

inline ArrayBuilder *value_builder() const#

Get builder to add Map entries as struct values.

This is used instead of key_builder()/item_builder() and allows the Map to be built as a list of struct values.

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class FixedSizeListBuilder : public arrow::ArrayBuilder#
#include <arrow/array/builder_nested.h>

Builder class for fixed-length list array value types.

Public Functions

FixedSizeListBuilder(MemoryPool *pool, std::shared_ptr<ArrayBuilder> const &value_builder, int32_t list_size)#

Use this constructor to infer the built array’s type from value_builder and list_size.

If value_builder has indeterminate type, this builder will also.

FixedSizeListBuilder(MemoryPool *pool, std::shared_ptr<ArrayBuilder> const &value_builder, const std::shared_ptr<DataType> &type)#

Use this constructor to define the built array’s type explicitly.

If value_builder has indeterminate type, this builder will also.

virtual Status Resize(int64_t capacity) override#

Ensure that enough memory has been allocated to fit the indicated number of total elements in the builder, including any that have already been appended.

Does not account for reallocations that may be due to variable size data, like binary values. To make space for incremental appends, use Reserve instead.

Parameters:

capacity[in] the minimum number of total array values to accommodate. Must be greater than the current capacity.

Returns:

Status

virtual void Reset() override#

Reset the builder.

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets builder except for dictionary builder

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

Status Append()#

Append a valid fixed length list.

This function affects only the validity bitmap; the child values must be appended using the child array builder.

Status AppendValues(int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Vector append.

If passed, valid_bytes will be read and any zero byte will cause the corresponding slot to be null.

This function affects only the validity bitmap; the child values must be appended using the child array builder. This includes appending nulls for null lists. XXX this restriction is confusing, should this method be omitted?

virtual Status AppendNull() final#

Append a null fixed length list.

The child array builder will have the appropriate number of nulls appended automatically.

virtual Status AppendNulls(int64_t length) final#

Append length null fixed length lists.

The child array builder will have the appropriate number of nulls appended automatically.

virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) final#

Append a range of values from an array.

The given array must be the same type as the builder.

inline virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class StructBuilder : public arrow::ArrayBuilder#
#include <arrow/array/builder_nested.h>

The Append, Resize and Reserve methods act on the StructBuilder itself.

Make sure the corresponding methods of all child builders are called consistently to keep the data structure consistent.

Public Functions

StructBuilder(const std::shared_ptr<DataType> &type, MemoryPool *pool, std::vector<std::shared_ptr<ArrayBuilder>> field_builders)#

If any of field_builders has indeterminate type, this builder will also.

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets builder except for dictionary builder

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

inline Status AppendValues(int64_t length, const uint8_t *valid_bytes)#

The null bitmap is of equal length to every child field, and any zero byte will be considered as a null for that field, but users must use the append methods or advance methods of the child builders independently to insert data.

inline Status Append(bool is_valid = true)#

Append an element to the Struct.

The Append method of every child builder must be called independently to maintain data-structure consistency.

inline virtual Status AppendNull() final#

Append a null value.

Automatically appends an empty value to each child builder.

inline virtual Status AppendNulls(int64_t length) final#

Append multiple null values.

Automatically appends empty values to each child builder.

inline virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

inline virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

virtual void Reset() override#

Reset the builder.

virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class BasicUnionBuilder : public arrow::ArrayBuilder#
#include <arrow/array/builder_union.h>

Base class for union array builders.

Note that while we subclass ArrayBuilder, as union types do not have a validity bitmap, the bitmap builder member of ArrayBuilder is not used.

Subclassed by arrow::DenseUnionBuilder, arrow::SparseUnionBuilder

Public Functions

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets builder except for dictionary builder

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

int8_t AppendChild(const std::shared_ptr<ArrayBuilder> &new_child, const std::string &field_name = "")#

Make a new child builder available to the UnionArray.

Parameters:
  • new_child[in] the child builder

  • field_name[in] the name of the field in the union array type if type inference is used

Returns:

child index, which is the “type” argument that needs to be passed to the “Append” method to add a new element to the union array.

virtual std::shared_ptr<DataType> type() const override#

Return the type of the built Array.

class DenseUnionBuilder : public arrow::BasicUnionBuilder#
#include <arrow/array/builder_union.h>

This API is EXPERIMENTAL.

Public Functions

inline explicit DenseUnionBuilder(MemoryPool *pool, int64_t alignment = kDefaultBufferAlignment)#

Use this constructor to initialize the UnionBuilder with no child builders, allowing type to be inferred.

You will need to call AppendChild for each of the children builders you want to use.

inline DenseUnionBuilder(MemoryPool *pool, const std::vector<std::shared_ptr<ArrayBuilder>> &children, const std::shared_ptr<DataType> &type, int64_t alignment = kDefaultBufferAlignment)#

Use this constructor to specify the type explicitly.

You can still add child builders to the union after using this constructor

inline virtual Status AppendNull() final#

Append a null value to builder.

inline virtual Status AppendNulls(int64_t length) final#

Append a number of null values to builder.

inline virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

inline virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline Status Append(int8_t next_type)#

Append an element to the UnionArray.

The corresponding child builder must be appended to independently after this method is called.

Parameters:

next_type[in] type_id of the child to which the next value will be appended.

virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

virtual Status FinishInternal(std::shared_ptr<ArrayData> *out) override#

Return result of builder as an internal generic ArrayData object.

Resets builder except for dictionary builder

Parameters:

out[out] the finalized ArrayData object

Returns:

Status

class SparseUnionBuilder : public arrow::BasicUnionBuilder#
#include <arrow/array/builder_union.h>

This API is EXPERIMENTAL.

Public Functions

inline explicit SparseUnionBuilder(MemoryPool *pool, int64_t alignment = kDefaultBufferAlignment)#

Use this constructor to initialize the UnionBuilder with no child builders, allowing type to be inferred.

You will need to call AppendChild for each of the children builders you want to use.

inline SparseUnionBuilder(MemoryPool *pool, const std::vector<std::shared_ptr<ArrayBuilder>> &children, const std::shared_ptr<DataType> &type, int64_t alignment = kDefaultBufferAlignment)#

Use this constructor to specify the type explicitly.

You can still add child builders to the union after using this constructor

inline virtual Status AppendNull() final#

Append a null value.

A null is appended to the first child, empty values to the other children.

inline virtual Status AppendNulls(int64_t length) final#

Append multiple null values.

Nulls are appended to the first child, empty values to the other children.

inline virtual Status AppendEmptyValue() final#

Append a non-null value to builder.

The appended value is an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending a null value to a parent nested type.

inline virtual Status AppendEmptyValues(int64_t length) final#

Append a number of non-null values to builder.

The appended values are an implementation detail, but the corresponding memory slot is guaranteed to be initialized. This method is useful when appending null values to a parent nested type.

inline Status Append(int8_t next_type)#

Append an element to the UnionArray.

This must be followed by an append to the appropriate child builder.

The corresponding child builder must be appended to independently after this method is called, and all other child builders must have null or empty value appended.

Parameters:

next_type[in] type_id of the child to which the next value will be appended.

virtual Status AppendArraySlice(const ArraySpan &array, int64_t offset, int64_t length) override#

Append a range of values from an array.

The given array must be the same type as the builder.

Dictionary-encoded#

template<typename T>
class DictionaryBuilder : public arrow::internal::DictionaryBuilderBase<AdaptiveIntBuilder, T>#

A DictionaryArray builder that uses AdaptiveIntBuilder to return the smallest index size that can accommodate the dictionary indices.

Public Functions

inline Status AppendIndices(const int64_t *values, int64_t length, const uint8_t *valid_bytes = NULLPTR)#

Append dictionary indices directly without modifying memo.

NOTE: Experimental API

Record Batch Builder#

class RecordBatchBuilder#

Helper class for creating record batches iteratively given a known schema.

Public Functions

inline ArrayBuilder *GetField(int i)#

Get base pointer to field builder.

Parameters:

i – the field index

Returns:

pointer to ArrayBuilder

template<typename T>
inline T *GetFieldAs(int i)#

Return the field builder cast to the indicated specific builder type.

Parameters:

i – the field index

Returns:

pointer to template type

Result<std::shared_ptr<RecordBatch>> Flush(bool reset_builders)#

Finish current batch and optionally reset.

Parameters:

reset_builders[in] whether to reset the field builders after the batch is finished

Returns:

the resulting RecordBatch

Result<std::shared_ptr<RecordBatch>> Flush()#

Finish current batch and reset.

Returns:

the resulting RecordBatch

void SetInitialCapacity(int64_t capacity)#

Set the initial capacity for new builders.

inline int64_t initial_capacity() const#

The initial capacity for builders.

inline int num_fields() const#

The number of fields in the schema.

inline std::shared_ptr<Schema> schema() const#

The schema of the record batches being built.

Public Static Functions

static Result<std::unique_ptr<RecordBatchBuilder>> Make(const std::shared_ptr<Schema> &schema, MemoryPool *pool)#

Create and initialize a RecordBatchBuilder.

Parameters:
  • schema[in] The schema for the record batch

  • pool[in] A MemoryPool to use for allocations

Returns:

the created builder instance

static Result<std::unique_ptr<RecordBatchBuilder>> Make(const std::shared_ptr<Schema> &schema, MemoryPool *pool, int64_t initial_capacity)#

Create and initialize a RecordBatchBuilder.

Parameters:
  • schema[in] The schema for the record batch

  • pool[in] A MemoryPool to use for allocations

  • initial_capacity[in] The initial capacity for the builders

Returns:

the created builder instance