Apache Arrow (C++)
A columnar in-memory analytics layer designed to accelerate big data.
arrow::RecordBatch Class Reference

Collection of equal-length arrays matching a particular Schema.

#include <arrow/table.h>

Public Member Functions

 RecordBatch (const std::shared_ptr< Schema > &schema, int64_t num_rows, const std::vector< std::shared_ptr< Array >> &columns)

 RecordBatch (const std::shared_ptr< Schema > &schema, int64_t num_rows, std::vector< std::shared_ptr< Array >> &&columns)
 Move-based constructor for a vector of Array instances.

 RecordBatch (const std::shared_ptr< Schema > &schema, int64_t num_rows, std::vector< std::shared_ptr< ArrayData >> &&columns)
 Construct record batch from vector of internal data structures.

 RecordBatch (const std::shared_ptr< Schema > &schema, int64_t num_rows, const std::vector< std::shared_ptr< ArrayData >> &columns)
 Construct record batch by copying vector of array data.

bool Equals (const RecordBatch &other) const
 Determine if two record batches are exactly equal.

bool ApproxEquals (const RecordBatch &other) const
 Determine if two record batches are approximately equal.

std::shared_ptr< Schema > schema () const

std::shared_ptr< Array > column (int i) const
 Retrieve an array from the record batch.

std::shared_ptr< ArrayData > column_data (int i) const

const std::string & column_name (int i) const
 Name of the i-th column.

int num_columns () const

int64_t num_rows () const

std::shared_ptr< RecordBatch > ReplaceSchemaMetadata (const std::shared_ptr< const KeyValueMetadata > &metadata) const
 Replace schema key-value metadata with new metadata (EXPERIMENTAL)

std::shared_ptr< RecordBatch > Slice (int64_t offset) const
 Slice each of the arrays in the record batch.

std::shared_ptr< RecordBatch > Slice (int64_t offset, int64_t length) const
 Slice each of the arrays in the record batch.

Status Validate () const
 Check for schema or length inconsistencies.
 

Detailed Description

Collection of equal-length arrays matching a particular Schema.

A record batch is a table-like data structure consisting of an internal sequence of fields, each a contiguous Arrow array.

Constructor & Destructor Documentation

◆ RecordBatch() [1/4]

arrow::RecordBatch::RecordBatch ( const std::shared_ptr< Schema > &  schema,
int64_t  num_rows,
const std::vector< std::shared_ptr< Array >> &  columns 
)
Parameters
[in] schema  The record batch schema
[in] num_rows  length of fields in the record batch. Each array should have the same length as num_rows
[in] columns  the record batch fields as vector of arrays

◆ RecordBatch() [2/4]

arrow::RecordBatch::RecordBatch ( const std::shared_ptr< Schema > &  schema,
int64_t  num_rows,
std::vector< std::shared_ptr< Array >> &&  columns 
)

Move-based constructor for a vector of Array instances.

◆ RecordBatch() [3/4]

arrow::RecordBatch::RecordBatch ( const std::shared_ptr< Schema > &  schema,
int64_t  num_rows,
std::vector< std::shared_ptr< ArrayData >> &&  columns 
)

Construct record batch from vector of internal data structures.

Since
0.5.0

This constructor takes the input data only by rvalue reference and is intended for internal use or for advanced users.

Parameters
schema  the record batch schema
num_rows  the number of semantic rows in the record batch. This should be equal to the length of each field
columns  the data for the batch's columns

◆ RecordBatch() [4/4]

arrow::RecordBatch::RecordBatch ( const std::shared_ptr< Schema > &  schema,
int64_t  num_rows,
const std::vector< std::shared_ptr< ArrayData >> &  columns 
)

Construct record batch by copying vector of array data.

Since
0.5.0

Member Function Documentation

◆ ApproxEquals()

bool arrow::RecordBatch::ApproxEquals ( const RecordBatch &  other) const

Determine if two record batches are approximately equal.

◆ column()

std::shared_ptr<Array> arrow::RecordBatch::column ( int  i) const

Retrieve an array from the record batch.

Parameters
[in] i  field index; no bounds checking is performed
Returns
an Array object
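
Because column() does not check its index, callers may want a guard of their own. The following is a minimal sketch of one, not part of Arrow: CheckedColumn is a hypothetical helper that relies only on the num_columns() and column(int) members documented on this page, and FakeBatch is a hypothetical stand-in type used so that the sketch compiles without the library.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical helper (not an Arrow function): returns batch.column(i) when
// i is a valid column index, or a null pointer otherwise.
// Only num_columns() and column(int) are required of Batch.
template <typename Batch>
auto CheckedColumn(const Batch& batch, int i) -> decltype(batch.column(i)) {
  if (i < 0 || i >= batch.num_columns()) {
    return nullptr;  // the unchecked accessor is never reached
  }
  return batch.column(i);
}

// Hypothetical stand-in, NOT an Arrow type: it mimics the two members used
// above, with plain ints in place of arrays.
struct FakeBatch {
  std::vector<std::shared_ptr<int>> columns;
  int num_columns() const { return static_cast<int>(columns.size()); }
  std::shared_ptr<int> column(int i) const { return columns[i]; }
};

// Usage example on the stand-in type.
inline void CheckedColumnExample() {
  FakeBatch batch;
  batch.columns.push_back(std::make_shared<int>(10));
  batch.columns.push_back(std::make_shared<int>(20));
  assert(*CheckedColumn(batch, 1) == 20);       // valid index
  assert(CheckedColumn(batch, 2) == nullptr);   // one past the end
  assert(CheckedColumn(batch, -1) == nullptr);  // negative index
}
```

Returning a null pointer is only one possible policy; a caller could equally report the error through a Status.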

◆ column_data()

std::shared_ptr<ArrayData> arrow::RecordBatch::column_data ( int  i) const
inline

◆ column_name()

const std::string& arrow::RecordBatch::column_name ( int  i) const

Name of the i-th column.

◆ Equals()

bool arrow::RecordBatch::Equals ( const RecordBatch &  other) const

Determine if two record batches are exactly equal.

Returns
true if batches are equal

◆ num_columns()

int arrow::RecordBatch::num_columns ( ) const
inline
Returns
the number of columns in the record batch

◆ num_rows()

int64_t arrow::RecordBatch::num_rows ( ) const
inline
Returns
the number of rows (the corresponding length of each column)

◆ ReplaceSchemaMetadata()

std::shared_ptr<RecordBatch> arrow::RecordBatch::ReplaceSchemaMetadata ( const std::shared_ptr< const KeyValueMetadata > &  metadata) const

Replace schema key-value metadata with new metadata (EXPERIMENTAL)

Since
0.5.0
Parameters
[in] metadata  new KeyValueMetadata
Returns
new RecordBatch

◆ schema()

std::shared_ptr<Schema> arrow::RecordBatch::schema ( ) const
inline
Returns
the schema of the record batch

◆ Slice() [1/2]

std::shared_ptr<RecordBatch> arrow::RecordBatch::Slice ( int64_t  offset) const

Slice each of the arrays in the record batch.

Parameters
[in] offset  the starting offset to slice, through end of batch
Returns
new record batch

◆ Slice() [2/2]

std::shared_ptr<RecordBatch> arrow::RecordBatch::Slice ( int64_t  offset,
int64_t  length 
) const

Slice each of the arrays in the record batch.

Parameters
[in] offset  the starting offset to slice
[in] length  the number of elements to slice from offset
Returns
new record batch
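
A common use of the two-argument Slice is to walk a batch in fixed-size pieces. The sketch below illustrates that pattern under stated assumptions: SplitIntoChunks is a hypothetical helper that uses only the num_rows() and Slice(offset, length) members documented here, and FakeBatch is a hypothetical stand-in (a single column of ints whose Slice copies its values for simplicity) so that the sketch compiles without Arrow. This page does not say how Slice treats a length that runs past the end of the batch, so the helper always passes an in-range length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical helper (not an Arrow function): splits a batch into
// consecutive slices of at most chunk_rows rows each.
// Only num_rows() and Slice(offset, length) are required of Batch.
template <typename Batch>
std::vector<std::shared_ptr<Batch>> SplitIntoChunks(const Batch& batch,
                                                    int64_t chunk_rows) {
  assert(chunk_rows > 0);
  std::vector<std::shared_ptr<Batch>> chunks;
  for (int64_t offset = 0; offset < batch.num_rows(); offset += chunk_rows) {
    // Pass an explicit in-range length rather than relying on how Slice
    // handles a length that extends past the end of the batch.
    const int64_t length = std::min(chunk_rows, batch.num_rows() - offset);
    chunks.push_back(batch.Slice(offset, length));
  }
  return chunks;
}

// Hypothetical stand-in, NOT an Arrow type: one column of ints with the two
// members used above. Its Slice copies the selected values.
struct FakeBatch {
  std::vector<int> values;
  int64_t num_rows() const { return static_cast<int64_t>(values.size()); }
  std::shared_ptr<FakeBatch> Slice(int64_t offset, int64_t length) const {
    auto out = std::make_shared<FakeBatch>();
    out->values.assign(values.begin() + offset,
                       values.begin() + offset + length);
    return out;
  }
};
```

With a five-row stand-in and chunk_rows of 2, the helper yields slices of 2, 2 and 1 rows.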

◆ Validate()

Status arrow::RecordBatch::Validate ( ) const

Check for schema or length inconsistencies.

Returns
Status
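
To make "length inconsistencies" concrete, the sketch below checks the invariant stated for the constructors: every column has the same length as num_rows. It is an illustration only and not Arrow's implementation of Validate, which also checks the schema. ColumnLengthsMatch is a hypothetical helper using num_columns(), num_rows() and column(int) from this page plus a length() member on the column type (assumed here); FakeArray and FakeBatch are hypothetical stand-ins so that the sketch compiles without the library.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical helper (not an Arrow function): returns true when every
// column's length() equals batch.num_rows().
template <typename Batch>
bool ColumnLengthsMatch(const Batch& batch) {
  for (int i = 0; i < batch.num_columns(); ++i) {
    if (batch.column(i)->length() != batch.num_rows()) {
      return false;
    }
  }
  return true;
}

// Hypothetical stand-ins, NOT Arrow types.
struct FakeArray {
  int64_t len;
  int64_t length() const { return len; }
};

struct FakeBatch {
  int64_t rows;
  std::vector<std::shared_ptr<FakeArray>> columns;
  int num_columns() const { return static_cast<int>(columns.size()); }
  int64_t num_rows() const { return rows; }
  std::shared_ptr<FakeArray> column(int i) const { return columns[i]; }
};

// Builds a stand-in batch that claims `rows` rows and holds one column per
// entry in `lengths`.
inline FakeBatch MakeFakeBatch(int64_t rows, std::vector<int64_t> lengths) {
  FakeBatch batch;
  batch.rows = rows;
  for (int64_t len : lengths) {
    auto column = std::make_shared<FakeArray>();
    column->len = len;
    batch.columns.push_back(column);
  }
  return batch;
}

// Usage example on the stand-in types.
inline void ColumnLengthsMatchExample() {
  assert(ColumnLengthsMatch(MakeFakeBatch(3, {3, 3})));   // consistent
  assert(!ColumnLengthsMatch(MakeFakeBatch(3, {3, 2})));  // short column
}
```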

The documentation for this class was generated from the following file: