Testing API Reference

group nanoarrow_testing

Utilities for testing nanoarrow structures and functions.

Integration testing JSON

group nanoarrow_testing-json

See the testing format documentation for details of the JSON representation. This representation is not canonical, but it can be used to implement integration tests against other implementations.

class TestingJSONWriter
#include <nanoarrow_testing.hpp>

Writer for the Arrow integration testing JSON format.

Public Functions

inline void set_float_precision(int value)

Set the floating point precision of the writer.

The default floating point precision is -1, which leaves the encoding of each value to the JSON serializer. When writing files specifically for integration tests, round floating point values to 3 decimal places to avoid serialization issues.
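To illustrate why a fixed precision helps, the following minimal sketch formats a value at a chosen number of decimal places. FormatFloat is a hypothetical helper written with the standard library only; it is not part of nanoarrow and does not claim to reproduce the writer's implementation.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of nanoarrow): format a double with a fixed
// number of decimal places.
std::string FormatFloat(double value, int precision) {
  assert(precision >= 0);
  std::ostringstream out;
  out << std::fixed << std::setprecision(precision) << value;
  return out.str();
}
```

With this helper, FormatFloat(0.1 + 0.2, 3) and FormatFloat(0.3, 3) produce the same text, while at 17 decimal places the two results differ because the underlying doubles are not identical.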

inline void set_include_metadata(bool value)

Set whether metadata should be included in the output of a schema or field.

Use false to skip writing schema/field metadata in the output.

inline ArrowErrorCode WriteDataFile(std::ostream &out, ArrowArrayStream *stream)

Write an ArrowArrayStream as a data file JSON object to out.

Creates output like {"schema": {...}, "batches": [...], ...}.

inline ArrowErrorCode WriteSchema(std::ostream &out, const ArrowSchema *schema)

Write a schema to out.

Creates output like {"fields": [...], "metadata": [...]}.

inline ArrowErrorCode WriteField(std::ostream &out, const ArrowSchema *field)

Write a field to out.

Creates output like {"name": "col", "type": {...}, ...}.

inline ArrowErrorCode WriteType(std::ostream &out, const ArrowSchema *field)

Write the type portion of a field.

Creates output like {"name": "int", ...}.

inline ArrowErrorCode WriteMetadata(std::ostream &out, const char *metadata)

Write the metadata portion of a field.

Creates output like [{"key": "...", "value": "..."}, ...].
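As an illustration of the output shape, the sketch below decodes a metadata buffer and renders it as a list of key/value objects. It assumes the buffer uses the Arrow C data interface layout (a native-endian int32 pair count, then a length-prefixed key and value for each pair). All function names are hypothetical, string escaping is omitted, and the exact whitespace of the real writer's output may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

using Metadata = std::vector<std::pair<std::string, std::string>>;

// Append one native-endian int32 to a byte buffer.
void AppendInt32(std::string* buffer, int32_t value) {
  char bytes[sizeof(value)];
  std::memcpy(bytes, &value, sizeof(value));
  buffer->append(bytes, sizeof(value));
}

// Read one native-endian int32 and advance the cursor past it.
int32_t ReadInt32(const char** cursor) {
  int32_t value;
  std::memcpy(&value, *cursor, sizeof(value));
  *cursor += sizeof(value);
  return value;
}

// Encode pairs as a pair count followed by, for each pair, a length-prefixed
// key and a length-prefixed value. Used here only to build sample input.
std::string EncodeMetadata(const Metadata& items) {
  std::string buffer;
  AppendInt32(&buffer, static_cast<int32_t>(items.size()));
  for (const auto& item : items) {
    AppendInt32(&buffer, static_cast<int32_t>(item.first.size()));
    buffer += item.first;
    AppendInt32(&buffer, static_cast<int32_t>(item.second.size()));
    buffer += item.second;
  }
  return buffer;
}

// Decode the layout above; a null pointer is treated as "no metadata".
Metadata DecodeMetadata(const char* metadata) {
  Metadata items;
  if (metadata == nullptr) return items;
  const char* cursor = metadata;
  int32_t n_pairs = ReadInt32(&cursor);
  for (int32_t i = 0; i < n_pairs; ++i) {
    int32_t key_size = ReadInt32(&cursor);
    std::string key(cursor, key_size);
    cursor += key_size;
    int32_t value_size = ReadInt32(&cursor);
    std::string value(cursor, value_size);
    cursor += value_size;
    items.emplace_back(std::move(key), std::move(value));
  }
  return items;
}

// Render pairs in the [{"key": ..., "value": ...}, ...] shape. Escaping is
// omitted, so keys and values must not contain characters that need it.
std::string MetadataToJson(const Metadata& items) {
  std::string json = "[";
  for (std::size_t i = 0; i < items.size(); ++i) {
    if (i > 0) json += ", ";
    json += "{\"key\": \"" + items[i].first + "\", \"value\": \"" +
            items[i].second + "\"}";
  }
  return json + "]";
}
```

In this sketch, a null pointer and a buffer with a pair count of zero both render as an empty list.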

inline ArrowErrorCode WriteBatch(std::ostream &out, const ArrowSchema *schema, const ArrowArrayView *value)

Write a “batch” to out.

Creates output like {"count": 123, "columns": [...]}.

inline ArrowErrorCode WriteColumn(std::ostream &out, const ArrowSchema *field, const ArrowArrayView *value)

Write a column to out.

Creates output like {"name": "col", "count": 123, "VALIDITY": [...], ...}.

class TestingJSONReader
#include <nanoarrow_testing.hpp>

Reader for the Arrow integration testing JSON format.

Public Functions

inline ArrowErrorCode ReadDataFile(const std::string &data_file_json, ArrowArrayStream *out, int num_batch = kNumBatchReadAll, ArrowError *error = nullptr)

Read JSON representing a data file object.

Reads a JSON object of the form {"schema": {...}, "batches": [...], ...}, populating out on success.

inline ArrowErrorCode ReadSchema(const std::string &schema_json, ArrowSchema *out, ArrowError *error = nullptr)

Read JSON representing a Schema.

Reads a JSON object of the form {"fields": [...], "metadata": [...]}, populating out on success.

inline ArrowErrorCode ReadField(const std::string &field_json, ArrowSchema *out, ArrowError *error = nullptr)

Read JSON representing a Field.

Reads a JSON object of the form {"name": "col", "type": {...}, ...}, populating out on success.

inline ArrowErrorCode ReadBatch(const std::string &batch_json, const ArrowSchema *schema, ArrowArray *out, ArrowError *error = nullptr)

Read JSON representing a RecordBatch.

Reads a JSON object of the form {"count": 123, "columns": [...]}, populating out on success.

inline ArrowErrorCode ReadColumn(const std::string &column_json, const ArrowSchema *schema, ArrowArray *out, ArrowError *error = nullptr)

Read JSON representing a Column.

Reads a JSON object of the form {"name": "col", "count": 123, "VALIDITY": [...], ...}, populating out on success.

class TestingJSONComparison
#include <nanoarrow_testing.hpp>

Integration testing comparison utility.

Utility to compare ArrowSchema, ArrowArray, and ArrowArrayStream instances. This should only be used in the context of integration testing as the comparison logic is specific to the integration testing JSON files and specification. Notably:

  • Map types are considered equal regardless of the child names “entries”, “key”, and “value”.

  • Float32 and Float64 values are compared according to their JSON serialization.

Public Functions

inline void set_compare_batch_flags(bool value)

Compare top-level RecordBatch flags (e.g., nullability).

Some Arrow implementations export batches as nullable, and some export them as non-nullable. Use false to consider these two types of batches as equivalent.

inline void set_compare_metadata_order(bool value)

Compare metadata order.

Some Arrow implementations store metadata using structures (e.g., hash map) that reorder metadata items. Use false to consider metadata whose keys/values have been reordered as equivalent.
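The effect of ignoring metadata order can be sketched with plain standard library containers. MetadataEqual and the Metadata alias are hypothetical stand-ins for illustration; they are not nanoarrow types, and the real comparison may be implemented differently.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using Metadata = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper: compare two metadata lists. When compare_order is
// false, both lists are sorted first so that reordered items still match.
// The arguments are taken by value so the caller's lists are left untouched.
bool MetadataEqual(Metadata actual, Metadata expected, bool compare_order) {
  if (!compare_order) {
    std::sort(actual.begin(), actual.end());
    std::sort(expected.begin(), expected.end());
  }
  return actual == expected;
}
```

Two lists holding the same pairs in a different order compare as different when compare_order is true and as equal when it is false. Lists with different keys or values never compare equal in this sketch.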

inline void set_compare_float_precision(int value)

Set float precision.

The Arrow Integration Testing JSON document states that values should be compared to 3 decimal places to avoid floating point serialization issues. Use -1 to specify that all decimal places should be used (the default).
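A minimal sketch of comparing floating point values through their serialized text follows. SerializeFloat and FloatsEqual are hypothetical helpers, and treating a negative precision as "use enough digits to tell any two doubles apart" is an assumption made for this example rather than a statement about nanoarrow's internals.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: serialize a double with a fixed number of decimal
// places or, for a negative precision, with max_digits10 significant digits.
std::string SerializeFloat(double value, int precision) {
  std::ostringstream out;
  if (precision >= 0) {
    out << std::fixed << std::setprecision(precision);
  } else {
    out << std::setprecision(std::numeric_limits<double>::max_digits10);
  }
  out << value;
  return out.str();
}

// Two values are treated as equal when their serialized text matches.
bool FloatsEqual(double actual, double expected, int precision) {
  return SerializeFloat(actual, precision) == SerializeFloat(expected, precision);
}
```

Under these assumptions, 0.1 + 0.2 and 0.3 compare as equal at a precision of 3 and as different at a precision of -1.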

inline int64_t num_differences() const

Returns the number of differences found by the previous call.

inline void WriteDifferences(std::ostream &out)

Dump a human-readable summary of differences to out.

inline void ClearDifferences()

Clear any existing differences.

inline ArrowErrorCode CompareArrayStream(ArrowArrayStream *actual, ArrowArrayStream *expected, ArrowError *error = nullptr)

Compare a stream of record batches.

Compares actual against expected using the following strategy:

  • Compares schemas for equality, returning early if differences are found.

  • Compares pairs of record batches, returning if one stream finished before another.

Returns NANOARROW_OK if the comparison ran without error. Callers must query num_differences() to obtain the result of the comparison on success.
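The strategy above can be sketched with simplified stand-ins. FakeStream and CompareStreams are hypothetical and use plain strings and integer vectors in place of ArrowSchema and ArrowArray; the sketch returns the list of differences directly rather than an error code plus a separate difference count.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a stream: a schema label plus a list of batches.
struct FakeStream {
  std::string schema;
  std::vector<std::vector<int>> batches;
};

// Compare schemas first, then batches pairwise, collecting differences.
std::vector<std::string> CompareStreams(const FakeStream& actual,
                                        const FakeStream& expected) {
  std::vector<std::string> differences;

  // Step 1: a schema mismatch ends the comparison immediately.
  if (actual.schema != expected.schema) {
    differences.push_back("schema differs");
    return differences;
  }

  // Step 2: walk both streams in lockstep.
  for (std::size_t i = 0;; ++i) {
    bool actual_done = i >= actual.batches.size();
    bool expected_done = i >= expected.batches.size();
    if (actual_done && expected_done) break;

    // One stream finished before the other: record it and stop.
    if (actual_done != expected_done) {
      differences.push_back("stream ended early at batch " + std::to_string(i));
      break;
    }

    if (actual.batches[i] != expected.batches[i]) {
      differences.push_back("batch " + std::to_string(i) + " differs");
    }
  }
  return differences;
}
```

An empty result here plays the role that num_differences() == 0 plays after a successful call to CompareArrayStream().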

inline ArrowErrorCode CompareSchema(const ArrowSchema *actual, const ArrowSchema *expected, ArrowError *error = nullptr, const std::string &path = "")

Compare a top-level ArrowSchema struct.

Returns NANOARROW_OK if the comparison ran without error. Callers must query num_differences() to obtain the result of the comparison on success.

inline ArrowErrorCode SetSchema(const ArrowSchema *schema, ArrowError *error = nullptr)

Set the ArrowSchema to be used for future calls to CompareBatch().

inline ArrowErrorCode CompareBatch(const ArrowArray *actual, const ArrowArray *expected, ArrowError *error = nullptr, const std::string &path = "")

Compare a top-level ArrowArray struct.

Returns NANOARROW_OK if the comparison ran without error. Callers must query num_differences() to obtain the result of the comparison on success.