Utilities

Decimal Numbers

class Decimal128 : public arrow::BasicDecimal128

Represents a signed 128-bit integer in two’s complement.

Calculations wrap around and overflow is ignored.

For a discussion of the algorithms, see Knuth’s The Art of Computer Programming, Volume 2: Seminumerical Algorithms, section 4.3.1.

Adapted from the Apache ORC C++ implementation.

The implementation is split into two parts:

  1. BasicDecimal128

    • can be safely compiled to IR without references to libstdc++.

  2. Decimal128

    • has additional functionality on top of BasicDecimal128 to deal with strings and streams.
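To make the wrap-around behaviour concrete, here is a minimal sketch of 128-bit two’s complement addition built from two 64-bit halves. The `Int128Parts` struct and the `AddWrapping` and `DemoWrapAround` functions are hypothetical names chosen for illustration; they are not part of the library and only approximate what the real class does internally.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit two's complement value,
// stored as a signed high half and an unsigned low half.
struct Int128Parts {
  int64_t high;
  uint64_t low;
};

// Add two values. A carry out of the top bit is silently dropped,
// so the result wraps around instead of reporting overflow.
Int128Parts AddWrapping(const Int128Parts& a, const Int128Parts& b) {
  const uint64_t low = a.low + b.low;            // unsigned addition wraps by definition
  const uint64_t carry = (low < a.low) ? 1 : 0;  // carry out of the low half
  // Add the high halves as unsigned values to avoid signed overflow,
  // which would be undefined behaviour in C++.
  const uint64_t high =
      static_cast<uint64_t>(a.high) + static_cast<uint64_t>(b.high) + carry;
  return Int128Parts{static_cast<int64_t>(high), low};
}

// Usage: the largest positive value plus one wraps to the most negative value.
void DemoWrapAround() {
  const Int128Parts max{INT64_MAX, UINT64_MAX};
  const Int128Parts one{0, 1};
  const Int128Parts sum = AddWrapping(max, one);
  assert(sum.high == INT64_MIN && sum.low == 0);
  (void)sum;
}
```

The conversion from `uint64_t` back to `int64_t` assumes a two’s complement platform, which holds for all mainstream compilers.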

Public Functions

constexpr Decimal128(const BasicDecimal128 &value)

Create a Decimal128 from a BasicDecimal128.

Decimal128(const std::string &value)

Parse the number from a base 10 string representation.

constexpr Decimal128()

Default constructor. Creates a Decimal128 with a value of 0.

Status Divide(const Decimal128 &divisor, Decimal128 *result, Decimal128 *remainder) const

Divide this number by divisor and return the quotient and remainder.

This operation is not destructive. The quotient is rounded toward zero, and the remainder takes the sign of the dividend:

  • 21 / 5 -> 4, 1

  • -21 / 5 -> -4, -1

  • 21 / -5 -> -4, 1

  • -21 / -5 -> 4, -1

Parameters
  • [in] divisor: the number to divide by

  • [out] result: the quotient

  • [out] remainder: the remainder after the division
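The sign rules above are those of truncating integer division. A minimal sketch, using `int64_t` as a stand-in for the 128-bit value and a `bool` return in place of `Status`; the helper name `DivideTruncating` is hypothetical, and treating a zero divisor as an error is an assumption of this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit analogue of the documented Divide().
// Returns false where the real function would return an error Status.
bool DivideTruncating(int64_t dividend, int64_t divisor,
                      int64_t* result, int64_t* remainder) {
  assert(result != nullptr && remainder != nullptr);
  if (divisor == 0) return false;  // assumed: division by zero is an error
  if (dividend == INT64_MIN && divisor == -1) return false;  // overflows in 64 bits
  // Since C++11, '/' truncates toward zero and '%' takes the sign of the dividend.
  *result = dividend / divisor;
  *remainder = dividend % divisor;
  return true;
}
```

For example, `DivideTruncating(-21, 5, &q, &r)` leaves `q == -4` and `r == -1`, so `q * divisor + r` always reproduces the dividend.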

std::string ToString(int32_t scale) const

Convert the Decimal128 value to a base 10 decimal string with the given scale.
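The scale is the number of digits that appear after the decimal point. The sketch below illustrates the idea on an `int64_t` unscaled value; `FormatWithScale` is a hypothetical helper, it assumes a non-negative scale, and the library’s own output may differ in edge cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: render an unscaled integer with `scale` fractional digits.
std::string FormatWithScale(int64_t unscaled, int32_t scale) {
  assert(scale >= 0);  // this sketch does not handle negative scales
  std::string digits = std::to_string(unscaled);
  const bool negative = digits[0] == '-';
  if (negative) digits.erase(0, 1);
  if (scale > 0) {
    const std::size_t s = static_cast<std::size_t>(scale);
    // Left-pad with zeros so at least one digit remains before the point.
    if (digits.size() <= s) digits.insert(0, s - digits.size() + 1, '0');
    digits.insert(digits.size() - s, 1, '.');
  }
  return negative ? "-" + digits : digits;
}
```

With this sketch, an unscaled value of 12345 and a scale of 2 yields "123.45", and -5 with a scale of 3 yields "-0.005".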

std::string ToIntegerString() const

Convert the value to an integer string.

operator int64_t() const

Cast this value to an int64_t.

Status Rescale(int32_t original_scale, int32_t new_scale, Decimal128 *out) const

Convert Decimal128 from one scale to another.
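Rescaling multiplies or divides the unscaled value by a power of ten. The sketch below uses `int64_t` in place of the 128-bit value and a `bool` return in place of `Status`. `RescaleSketch` is a hypothetical name, and reporting an error when digits would be lost or the value would overflow is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit analogue of Rescale(). Returns false on overflow
// or when a non-zero digit would have to be dropped.
bool RescaleSketch(int64_t value, int32_t original_scale, int32_t new_scale,
                   int64_t* out) {
  assert(out != nullptr);
  int32_t delta = new_scale - original_scale;
  int64_t v = value;
  for (; delta > 0; --delta) {  // increasing the scale: multiply by 10
    if (v > INT64_MAX / 10 || v < INT64_MIN / 10) return false;  // would overflow
    v *= 10;
  }
  for (; delta < 0; ++delta) {  // decreasing the scale: divide by 10
    if (v % 10 != 0) return false;  // would lose a non-zero digit
    v /= 10;
  }
  *out = v;
  return true;
}
```

For example, 123.45 stored as 12345 at scale 2 becomes 1234500 at scale 4, while rescaling it to scale 1 fails because the trailing 5 cannot be represented.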

template<typename T, typename = internal::EnableIfIsOneOf<T, int32_t, int64_t>>
Status ToInteger(T *out) const

Convert to a signed integer.

Public Static Functions

static Status FromString(const util::string_view &s, Decimal128 *out, int32_t *precision = NULLPTR, int32_t *scale = NULLPTR)

Convert a decimal string to a Decimal128 value. The precision and scale are also returned through the corresponding output parameters when those are passed in and not null.

static Status FromBigEndian(const uint8_t *data, int32_t length, Decimal128 *out)

Convert from a big-endian byte representation.

The length must be between 1 and 16.

Return

An error status if the length is invalid.
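A big-endian input shorter than 16 bytes has to be sign-extended to the full width. The sketch below shows one way this could be done; the `Int128Parts` struct and the `FromBigEndianSketch` function are hypothetical, and a `bool` return stands in for `Status`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit two's complement value.
struct Int128Parts {
  int64_t high;
  uint64_t low;
};

// Decode 1 to 16 big-endian bytes into a sign-extended 128-bit value.
bool FromBigEndianSketch(const uint8_t* data, int32_t length, Int128Parts* out) {
  assert(data != nullptr && out != nullptr);
  if (length < 1 || length > 16) return false;  // invalid length
  // The top bit of the first byte is the sign bit.
  const bool negative = (data[0] & 0x80) != 0;
  // Start from all ones (negative) or all zeros so missing bytes are sign-extended.
  uint64_t high = negative ? ~uint64_t{0} : uint64_t{0};
  uint64_t low = high;
  for (int32_t i = 0; i < length; ++i) {
    // Shift the whole 128-bit value left by one byte and append the next byte.
    high = (high << 8) | (low >> 56);
    low = (low << 8) | data[i];
  }
  out->high = static_cast<int64_t>(high);  // assumes a two's complement platform
  out->low = low;
  return true;
}
```

With this sketch, the two bytes `0xFF 0x80` decode to -128: the high half is -1 and the low half is `0xFFFFFFFFFFFFFF80`.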

Abstract Sequences

template<typename T>
class Iterator

A generic Iterator that can return errors.

Public Functions

template<typename Wrapped>
Iterator(Wrapped has_next)

Iterator may be constructed from any type which has a member function with signature Status Next(T*).

The argument is moved or copied to the heap and kept in a unique_ptr<void>. Only its destructor and its Next method (which are stored in function pointers) are referenced after construction.

This approach is used to dodge MSVC linkage hell (ARROW-6244, ARROW-6558) when using an abstract template base class: instead of being inlined as usual for a template function, the base’s virtual destructor will be exported, leading to multiple definition errors when linking to any other translation unit (TU) where the base is instantiated.
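The type-erasure technique described above can be sketched as follows. This is an illustration under assumptions, not the library’s code: `Status` is a minimal stand-in, and `ErasedIterator`, `CountDown` and `DemoErasedIterator` are hypothetical names. The `-1` value plays the role of the end marker.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical minimal stand-in for arrow::Status.
struct Status {
  bool ok_value;
  static Status OK() { return Status{true}; }
  bool ok() const { return ok_value; }
};

template <typename T>
class ErasedIterator {
 public:
  // Move the wrapped object to the heap and remember only how to
  // destroy it and how to call its Next() method.
  template <typename Wrapped>
  explicit ErasedIterator(Wrapped has_next)
      : ptr_(new Wrapped(std::move(has_next)), &Destroy<Wrapped>),
        next_(&CallNext<Wrapped>) {}

  Status Next(T* out) { return next_(ptr_.get(), out); }

 private:
  template <typename Wrapped>
  static void Destroy(void* ptr) {
    delete static_cast<Wrapped*>(ptr);
  }

  template <typename Wrapped>
  static Status CallNext(void* ptr, T* out) {
    return static_cast<Wrapped*>(ptr)->Next(out);
  }

  std::unique_ptr<void, void (*)(void*)> ptr_;  // type-erased owner
  Status (*next_)(void*, T*);                   // type-erased Next()
};

// Hypothetical source: yields remaining, remaining - 1, ..., 1, then -1.
struct CountDown {
  int remaining;
  Status Next(int* out) {
    *out = (remaining > 0) ? remaining-- : -1;
    return Status::OK();
  }
};

void DemoErasedIterator() {
  ErasedIterator<int> it(CountDown{2});
  int value = 0;
  Status st = it.Next(&value);
  assert(st.ok() && value == 2);
  (void)st;
}
```

No virtual functions or abstract base class are involved, so nothing template-dependent has to be exported from the library.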

Status Next(T *out)

Return the next element of the sequence, or IterationTraits<T>::End() when the iteration is completed.

Calling this on a default constructed Iterator will result in undefined behavior.

template<typename Visitor>
Status Visit(Visitor &&visitor)

Pass each element of the sequence to a visitor.

Will return any error status returned by the visitor, terminating iteration.
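The visiting loop can be pictured as repeated calls to Next() until the end marker appears. The sketch below is written as a free function over a hypothetical `VectorSource`; `Status`, `kEnd` (standing in for IterationTraits<int>::End()) and `VisitAll` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal stand-in for arrow::Status.
struct Status {
  bool ok_value;
  static Status OK() { return Status{true}; }
  static Status Error() { return Status{false}; }
  bool ok() const { return ok_value; }
};

const int kEnd = -1;  // stand-in for IterationTraits<int>::End()

// Hypothetical source yielding the elements of a vector, then kEnd.
struct VectorSource {
  std::vector<int> values;
  std::size_t pos;
  Status Next(int* out) {
    *out = pos < values.size() ? values[pos++] : kEnd;
    return Status::OK();
  }
};

// Pass each element to the visitor; stop at the end marker or at the
// first error, whether it comes from the source or from the visitor.
template <typename Source, typename Visitor>
Status VisitAll(Source* source, Visitor&& visitor) {
  assert(source != nullptr);
  for (;;) {
    int value = kEnd;
    Status st = source->Next(&value);
    if (!st.ok()) return st;
    if (value == kEnd) return Status::OK();
    st = visitor(value);
    if (!st.ok()) return st;
  }
}
```

A visitor that returns an error for a given element stops the loop immediately, so later elements are never seen.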

bool operator==(const Iterator &other) const

Iterators will only compare equal if they are both null.

template<typename T>
class VectorIterator

Simple iterator which yields the elements of a std::vector.

Compression

enum arrow::Compression::type

Compression algorithm.

Values:

UNCOMPRESSED
SNAPPY
GZIP
BROTLI
ZSTD
LZ4
LZO
BZ2

class Codec

Compression codec.

Subclassed by arrow::util::BrotliCodec, arrow::util::BZ2Codec, arrow::util::GZipCodec, arrow::util::Lz4Codec, arrow::util::SnappyCodec, arrow::util::ZSTDCodec

Public Functions

virtual Status Decompress(int64_t input_len, const uint8_t *input, int64_t output_buffer_len, uint8_t *output_buffer) = 0

One-shot decompression function.

output_buffer_len must be correct, so the decompressed size has to be obtained in advance.

Note

One-shot decompression is not always compatible with streaming compression. Depending on the codec (e.g. LZ4), different formats may be used.

virtual Status Decompress(int64_t input_len, const uint8_t *input, int64_t output_buffer_len, uint8_t *output_buffer, int64_t *output_len) = 0

One-shot decompression function that also returns the actual decompressed size.

Note

One-shot decompression is not always compatible with streaming compression. Depending on the codec (e.g. LZ4), different formats may be used.

Parameters
  • [in] input_len: the number of bytes of compressed data.

  • [in] input: the compressed data.

  • [in] output_buffer_len: the number of bytes of buffer for decompressed data.

  • [in] output_buffer: the buffer for decompressed data.

  • [out] output_len: the actual decompressed size.

virtual Status Compress(int64_t input_len, const uint8_t *input, int64_t output_buffer_len, uint8_t *output_buffer, int64_t *output_len) = 0

One-shot compression function.

output_buffer_len must first have been computed using MaxCompressedLen().

Note

One-shot compression is not always compatible with streaming decompression. Depending on the codec (e.g. LZ4), different formats may be used.
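The call pattern for one-shot compression is: ask for the worst-case output size, allocate that much, compress, then shrink to the size actually written. The sketch below demonstrates the pattern with a toy run-length codec. `ToyRleCodec` and `CompressOneShot` are hypothetical, the `MaxCompressedLen` signature shown here is an assumption, and `bool` stands in for `Status`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy codec: each run becomes a (count, byte) pair.
struct ToyRleCodec {
  // Worst case: every byte starts a new run, giving 2 output bytes per input byte.
  int64_t MaxCompressedLen(int64_t input_len) const { return 2 * input_len; }

  bool Compress(int64_t input_len, const uint8_t* input, int64_t output_buffer_len,
                uint8_t* output_buffer, int64_t* output_len) const {
    int64_t written = 0;
    for (int64_t i = 0; i < input_len;) {
      int64_t run = 1;
      while (i + run < input_len && input[i + run] == input[i] && run < 255) ++run;
      if (written + 2 > output_buffer_len) return false;  // buffer too small
      output_buffer[written++] = static_cast<uint8_t>(run);
      output_buffer[written++] = input[i];
      i += run;
    }
    *output_len = written;
    return true;
  }
};

std::vector<uint8_t> CompressOneShot(const ToyRleCodec& codec,
                                     const std::vector<uint8_t>& input) {
  const int64_t input_len = static_cast<int64_t>(input.size());
  // Size the output buffer for the worst case before compressing.
  std::vector<uint8_t> out(static_cast<std::size_t>(codec.MaxCompressedLen(input_len)));
  int64_t output_len = 0;
  const bool ok = codec.Compress(input_len, input.data(),
                                 static_cast<int64_t>(out.size()), out.data(),
                                 &output_len);
  assert(ok);  // cannot fail here: the buffer has the worst-case size
  (void)ok;
  out.resize(static_cast<std::size_t>(output_len));  // keep only the bytes written
  return out;
}
```

Because the buffer is sized from the worst case, the compression call never runs out of space, and the final resize discards the unused tail.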

virtual Status MakeCompressor(std::shared_ptr<Compressor> *out) = 0

Create a streaming compressor instance.

virtual Status MakeDecompressor(std::shared_ptr<Decompressor> *out) = 0

Create a streaming decompressor instance.

Public Static Functions

static int UseDefaultCompressionLevel()

Return special value to indicate that a codec implementation should use its default compression level.

static std::string GetCodecAsString(Compression::type t)

Return a string name for compression type.

static Status Create(Compression::type codec, std::unique_ptr<Codec> *out)

Create a codec for the given compression algorithm.

static Status Create(Compression::type codec, int compression_level, std::unique_ptr<Codec> *out)

Create a codec for the given compression algorithm and level.

class Compressor

Streaming compressor interface.

Public Functions

virtual Status Compress(int64_t input_len, const uint8_t *input, int64_t output_len, uint8_t *output, int64_t *bytes_read, int64_t *bytes_written) = 0

Compress some input.

If bytes_read is 0 on return, then a larger output buffer should be supplied.

virtual Status Flush(int64_t output_len, uint8_t *output, int64_t *bytes_written, bool *should_retry) = 0

Flush part of the compressed output.

If should_retry is true on return, Flush() should be called again with a larger buffer.

virtual Status End(int64_t output_len, uint8_t *output, int64_t *bytes_written, bool *should_retry) = 0

End compressing, doing whatever is necessary to end the stream.

If should_retry is true on return, End() should be called again with a larger buffer. Otherwise, the Compressor should not be used anymore.

End() implies Flush().
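A caller of the streaming interface typically loops over Compress() until all input is consumed, then calls End() until should_retry is false, growing the output buffer whenever asked. The sketch below exercises that loop against a toy pass-through compressor. `ToyCompressor` and `CompressAll` are hypothetical, `bool` stands in for `Status`, and the four-byte trailer exists only to force an End() retry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical compressor: copies input unchanged and writes a 4-byte trailer at End().
struct ToyCompressor {
  bool Compress(int64_t input_len, const uint8_t* input, int64_t output_len,
                uint8_t* output, int64_t* bytes_read, int64_t* bytes_written) {
    const int64_t n = std::min(input_len, output_len);
    std::copy(input, input + n, output);
    *bytes_read = n;
    *bytes_written = n;
    return true;
  }

  bool End(int64_t output_len, uint8_t* output, int64_t* bytes_written,
           bool* should_retry) {
    static const uint8_t kTrailer[4] = {'E', 'N', 'D', '!'};
    if (output_len < 4) {  // not enough room: ask the caller for a larger buffer
      *bytes_written = 0;
      *should_retry = true;
      return true;
    }
    std::copy(kTrailer, kTrailer + 4, output);
    *bytes_written = 4;
    *should_retry = false;
    return true;
  }
};

template <typename CompressorT>
bool CompressAll(CompressorT* c, const std::vector<uint8_t>& input,
                 std::vector<uint8_t>* out) {
  assert(c != nullptr && out != nullptr);
  std::vector<uint8_t> buf(2);  // deliberately tiny so that growth is exercised
  const int64_t n = static_cast<int64_t>(input.size());
  int64_t pos = 0;
  while (pos < n) {
    int64_t bytes_read = 0, bytes_written = 0;
    if (!c->Compress(n - pos, input.data() + pos, static_cast<int64_t>(buf.size()),
                     buf.data(), &bytes_read, &bytes_written)) {
      return false;
    }
    out->insert(out->end(), buf.begin(), buf.begin() + bytes_written);
    pos += bytes_read;
    if (bytes_read == 0) buf.resize(buf.size() * 2);  // no progress: grow the buffer
  }
  bool should_retry = true;
  while (should_retry) {
    int64_t bytes_written = 0;
    if (!c->End(static_cast<int64_t>(buf.size()), buf.data(), &bytes_written,
                &should_retry)) {
      return false;
    }
    out->insert(out->end(), buf.begin(), buf.begin() + bytes_written);
    if (should_retry) buf.resize(buf.size() * 2);
  }
  return true;
}
```

Starting with a two-byte buffer, the first End() call asks for a retry; after the buffer is doubled, the second call writes the trailer and finishes the stream.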

class Decompressor

Streaming decompressor interface.

Public Functions

virtual Status Decompress(int64_t input_len, const uint8_t *input, int64_t output_len, uint8_t *output, int64_t *bytes_read, int64_t *bytes_written, bool *need_more_output) = 0

Decompress some input.

If need_more_output is true on return, a larger output buffer needs to be supplied.

virtual bool IsFinished() = 0

Return whether the compressed stream is finished.

This is a heuristic. If true is returned, then it is guaranteed that the stream is finished. If false is returned, however, it may simply be that the underlying library isn’t able to provide the information.

virtual Status Reset() = 0

Reinitialize decompressor, making it ready for a new compressed stream.
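The streaming decompression loop mirrors the compression one: feed input, collect output, grow the buffer when need_more_output is set, and stop once the stream is finished. The sketch below uses a toy format in which the first byte gives the payload length. `ToyDecompressor` and `DecompressAll` are hypothetical and `bool` stands in for `Status`. Unlike the documented heuristic, this toy IsFinished() is exact.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical decompressor for a toy stream: [N][N payload bytes].
class ToyDecompressor {
 public:
  bool Decompress(int64_t input_len, const uint8_t* input, int64_t output_len,
                  uint8_t* output, int64_t* bytes_read, int64_t* bytes_written,
                  bool* need_more_output) {
    int64_t in = 0;
    if (!have_header_ && input_len > 0) {  // the first byte is the payload length
      remaining_ = input[in++];
      have_header_ = true;
    }
    const int64_t n = std::min(std::min(input_len - in, output_len), remaining_);
    std::copy(input + in, input + in + n, output);
    remaining_ -= n;
    *bytes_read = in + n;
    *bytes_written = n;
    // Payload input is still pending but the output buffer is full.
    *need_more_output = remaining_ > 0 && in + n < input_len;
    return true;
  }

  bool IsFinished() const { return have_header_ && remaining_ == 0; }

  // Make the object ready for a new stream.
  void Reset() {
    have_header_ = false;
    remaining_ = 0;
  }

 private:
  bool have_header_ = false;
  int64_t remaining_ = 0;
};

template <typename DecompressorT>
bool DecompressAll(DecompressorT* d, const std::vector<uint8_t>& input,
                   std::vector<uint8_t>* out) {
  assert(d != nullptr && out != nullptr);
  std::vector<uint8_t> buf(2);  // deliberately tiny so that growth is exercised
  const int64_t n = static_cast<int64_t>(input.size());
  int64_t pos = 0;
  while (!d->IsFinished() && pos < n) {
    int64_t bytes_read = 0, bytes_written = 0;
    bool need_more_output = false;
    if (!d->Decompress(n - pos, input.data() + pos, static_cast<int64_t>(buf.size()),
                       buf.data(), &bytes_read, &bytes_written, &need_more_output)) {
      return false;
    }
    out->insert(out->end(), buf.begin(), buf.begin() + bytes_written);
    pos += bytes_read;
    if (need_more_output) buf.resize(buf.size() * 2);
  }
  return d->IsFinished();  // false if the input ended before the stream did
}
```

After a stream has been fully decoded, calling Reset() returns the toy object to its initial state so it can decode another stream.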