Compute Functions
Datum class
-
class
arrow
::
Datum
¶ Variant type for various Arrow C++ data structures.
Public Functions
-
ValueDescr
descr
() const¶ Return the shape (array or scalar) and type for supported kinds (ARRAY, CHUNKED_ARRAY, and SCALAR).
Debug asserts otherwise
-
ValueDescr::Shape
shape
() const¶ Return the shape (array or scalar) for supported kinds (ARRAY, CHUNKED_ARRAY, and SCALAR).
Debug asserts otherwise
-
const std::shared_ptr<DataType> &
type
() const¶ The value type of the variant, if any.
- Returns
nullptr if no type
-
const std::shared_ptr<Schema> &
schema
() const¶ The schema of the variant, if any.
- Returns
nullptr if no schema
-
int64_t
length
() const¶ The value length of the variant, if any.
- Returns
kUnknownLength if the variant has no defined length
-
ArrayVector
chunks
() const¶ The array chunks of the variant, if any.
- Returns
empty if not arraylike
-
struct
Empty
¶
Abstract Function classes
-
void
PrintTo
(const FunctionOptions&, std::ostream*)¶
-
class
FunctionOptionsType
¶ - #include <arrow/compute/function.h>
Extension point for defining options outside libarrow (but still within this project).
-
class
arrow::compute
::
FunctionOptions
: public arrow::util::EqualityComparable<FunctionOptions>¶ - #include <arrow/compute/function.h>
Base class for specifying options configuring a function’s behavior, such as error handling.
Subclassed by arrow::compute::ArithmeticOptions, arrow::compute::ArraySortOptions, arrow::compute::AssumeTimezoneOptions, arrow::compute::CastOptions, arrow::compute::CountOptions, arrow::compute::DayOfWeekOptions, arrow::compute::DictionaryEncodeOptions, arrow::compute::ElementWiseAggregateOptions, arrow::compute::ExtractRegexOptions, arrow::compute::FilterOptions, arrow::compute::IndexOptions, arrow::compute::JoinOptions, arrow::compute::MakeStructOptions, arrow::compute::MatchSubstringOptions, arrow::compute::ModeOptions, arrow::compute::NullOptions, arrow::compute::PadOptions, arrow::compute::PartitionNthOptions, arrow::compute::QuantileOptions, arrow::compute::ReplaceSliceOptions, arrow::compute::ReplaceSubstringOptions, arrow::compute::RoundOptions, arrow::compute::RoundToMultipleOptions, arrow::compute::ScalarAggregateOptions, arrow::compute::SelectKOptions, arrow::compute::SetLookupOptions, arrow::compute::SliceOptions, arrow::compute::SortOptions, arrow::compute::SplitOptions, arrow::compute::SplitPatternOptions, arrow::compute::StrftimeOptions, arrow::compute::StrptimeOptions, arrow::compute::TakeOptions, arrow::compute::TDigestOptions, arrow::compute::TrimOptions, arrow::compute::VarianceOptions, arrow::compute::WeekOptions
Public Functions
Public Static Functions
-
static Result<std::unique_ptr<FunctionOptions>>
Deserialize
(const std::string &type_name, const Buffer &buffer)¶ Deserialize an options struct from a buffer.
Note: this will only look for type_name in the default FunctionRegistry; to use a custom FunctionRegistry, look up the FunctionOptionsType, then call FunctionOptionsType::Deserialize().
-
struct
arrow::compute
::
Arity
¶ - #include <arrow/compute/function.h>
Contains the number of required arguments for the function.
Naming conventions taken from https://en.wikipedia.org/wiki/Arity.
Public Members
-
int
num_args
¶ The number of required arguments (or the minimum number for varargs functions).
-
bool
is_varargs
= false¶ If true, then the num_args is the minimum number of required arguments.
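As an illustration of how the two Arity members interact, here is a minimal sketch of an argument-count check. `AritySketch` and `AcceptsArgCount` are hypothetical names introduced for this example only; they mirror the documented members but are not part of Arrow.

```cpp
// Hypothetical stand-in mirroring the two documented members of Arity.
struct AritySketch {
  int num_args;
  bool is_varargs = false;
};

// Returns true if a call passing `passed` arguments satisfies the arity.
inline bool AcceptsArgCount(const AritySketch& arity, int passed) {
  if (arity.is_varargs) {
    return passed >= arity.num_args;  // num_args is only a lower bound
  }
  return passed == arity.num_args;  // fixed arity: exact match required
}
```

Under this reading, a binary function accepts exactly two arguments, while a varargs function with num_args = 1 accepts one or more.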
-
struct
arrow::compute
::
FunctionDoc
¶ - #include <arrow/compute/function.h>
Public Members
-
std::string
summary
¶ A one-line summary of the function, using a verb.
For example, “Add two numeric arrays or scalars”.
-
std::string
description
¶ A detailed description of the function, meant to follow the summary.
-
std::vector<std::string>
arg_names
¶ Symbolic names (identifiers) for the function arguments.
Some bindings may use this to generate nicer function signatures.
-
std::string
options_class
¶ Name of the options class, if any.
-
class
arrow::compute
::
Function
¶ - #include <arrow/compute/function.h>
Base class for compute functions.
Function implementations contain a collection of “kernels” which are implementations of the function for specific argument types. Selecting a viable kernel for executing a function is referred to as “dispatching”.
Subclassed by arrow::compute::detail::FunctionImpl< KernelType >, arrow::compute::MetaFunction, arrow::compute::detail::FunctionImpl< HashAggregateKernel >, arrow::compute::detail::FunctionImpl< ScalarAggregateKernel >, arrow::compute::detail::FunctionImpl< ScalarKernel >, arrow::compute::detail::FunctionImpl< VectorKernel >
Public Types
-
enum
Kind
¶ The kind of function, which indicates in what contexts it is valid for use.
Values:
-
enumerator
SCALAR
¶ A function that performs scalar data operations on whole arrays of data.
Can generally process Array or Scalar values. The size of the output will be the same as the size (or broadcast size, when Array and Scalar inputs are mixed) of the input.
-
enumerator
VECTOR
¶ A function with array input and output whose behavior depends on the values of the entire arrays passed, rather than on each value independently.
-
enumerator
SCALAR_AGGREGATE
¶ A function that computes scalar summary statistics from array input.
-
enumerator
HASH_AGGREGATE
¶ A function that computes grouped summary statistics from array input and an array of group identifiers.
-
enumerator
META
¶ A function that dispatches to other functions and does not contain its own kernels.
Public Functions
-
inline const std::string &
name
() const¶ The name of the function. The registry enforces uniqueness of names.
-
inline Function::Kind
kind
() const¶ The kind of function, which indicates in what contexts it is valid for use.
-
inline const Arity &
arity
() const¶ Contains the number of arguments the function requires, and whether the function accepts a variable number of arguments.
-
inline const FunctionDoc &
doc
() const¶ Return the function documentation.
-
virtual int
num_kernels
() const = 0¶ Returns the number of registered kernels for this function.
-
virtual Result<const Kernel*>
DispatchExact
(const std::vector<ValueDescr> &values) const¶ Return a kernel that can execute the function given the exact argument types (without implicit type casts or scalar->array promotions).
NB: This function is overridden in CastFunction.
-
virtual Result<const Kernel*>
DispatchBest
(std::vector<ValueDescr> *values) const¶ Return a best-match kernel that can execute the function given the argument types, after implicit casts are applied.
- Parameters
[inout] values – Argument types. An element may be modified to indicate that the returned kernel only approximately matches the input value descriptors; callers are responsible for casting inputs to the type and shape required by the kernel.
-
virtual Result<Datum>
Execute
(const std::vector<Datum> &args, const FunctionOptions *options, ExecContext *ctx) const¶ Execute the function eagerly with the passed input arguments with kernel dispatch, batch iteration, and memory allocation details taken care of.
If the options pointer is null, then default_options() will be used.
This function can be overridden in subclasses.
-
inline const FunctionOptions *
default_options
() const¶ Returns the default options for this function.
Whatever option semantics a Function has, implementations must guarantee that default_options() is valid to pass to Execute as options.
-
class
arrow::compute
::
ScalarFunction
: public arrow::compute::detail::FunctionImpl<ScalarKernel>¶ - #include <arrow/compute/function.h>
A function that executes elementwise operations on arrays or scalars, and whose results therefore generally do not depend on the order of the values in the arguments.
Accepts and returns arrays that are all of the same size. These functions roughly correspond to the functions used in SQL expressions.
Subclassed by arrow::compute::CastFunction
Public Functions
-
Status
AddKernel
(std::vector<InputType> in_types, OutputType out_type, ArrayKernelExec exec, KernelInit init = NULLPTR)¶ Add a kernel with given input/output types, no required state initialization, preallocation for fixed-width types, and default null handling (intersect validity bitmaps of inputs).
-
class
arrow::compute
::
VectorFunction
: public arrow::compute::detail::FunctionImpl<VectorKernel>¶ - #include <arrow/compute/function.h>
A function that executes general array operations that may yield outputs of different sizes or have results that depend on the whole array contents.
These functions roughly correspond to the functions found in non-SQL array languages like APL and its derivatives.
-
class
arrow::compute
::
ScalarAggregateFunction
: public arrow::compute::detail::FunctionImpl<ScalarAggregateKernel>¶ - #include <arrow/compute/function.h>
-
class
arrow::compute
::
HashAggregateFunction
: public arrow::compute::detail::FunctionImpl<HashAggregateKernel>¶ - #include <arrow/compute/function.h>
-
class
arrow::compute
::
MetaFunction
: public arrow::compute::Function¶ - #include <arrow/compute/function.h>
A function that dispatches to other functions.
Must implement MetaFunction::ExecuteImpl.
For Array, ChunkedArray, and Scalar Datum kinds, may rely on the execution of concrete Function types, but must handle other Datum kinds on its own.
Public Functions
-
inline virtual int
num_kernels
() const override¶ Returns the number of registered kernels for this function.
-
virtual Result<Datum>
Execute
(const std::vector<Datum> &args, const FunctionOptions *options, ExecContext *ctx) const override¶ Execute the function eagerly with the passed input arguments with kernel dispatch, batch iteration, and memory allocation details taken care of.
If the options pointer is null, then default_options() will be used.
This function can be overridden in subclasses.
Function registry
-
class
arrow::compute
::
FunctionRegistry
¶ A mutable central function registry for built-in functions as well as user-defined functions.
Functions are implementations of arrow::compute::Function.
Generally, each function contains kernels which are implementations of a function for a specific argument signature. After looking up a function in the registry, one can either execute it eagerly with Function::Execute or use one of the function’s dispatch methods to pick a suitable kernel for lower-level function execution.
Public Functions
Add a new function to the registry.
Returns Status::KeyError if a function with the same name is already registered
-
Status
AddAlias
(const std::string &target_name, const std::string &source_name)¶ Add aliases for the given function name.
Returns Status::KeyError if the function with the given name is not registered
-
Status
AddFunctionOptionsType
(const FunctionOptionsType *options_type, bool allow_overwrite = false)¶ Add a new function options type to the registry.
Returns Status::KeyError if a function options type with the same name is already registered
-
Result<std::shared_ptr<Function>>
GetFunction
(const std::string &name) const¶ Retrieve a function by name from the registry.
-
std::vector<std::string>
GetFunctionNames
() const¶ Return vector of all entry names in the registry.
Helpful for displaying a manifest of available functions
-
Result<const FunctionOptionsType*>
GetFunctionOptionsType
(const std::string &name) const¶ Retrieve a function options type by name from the registry.
-
int
num_functions
() const¶ The number of currently registered functions.
Public Static Functions
-
static std::unique_ptr<FunctionRegistry>
Make
()¶ Construct a new registry.
Most users only need to use the global registry
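The error behavior documented for FunctionRegistry (KeyError on duplicate names, KeyError when aliasing an unknown function) can be modeled with a small standalone sketch. Everything below is hypothetical: `RegistrySketch`, `RegistryStatus`, and the integer function ids are stand-ins, and the `allow_overwrite` parameter on `AddFunction` is an assumption borrowed from the AddFunctionOptionsType signature. It is not Arrow's implementation.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical status type standing in for arrow::Status.
enum class RegistryStatus { Ok, KeyError };

// Simplified registry: maps names to opaque integer function ids.
class RegistrySketch {
 public:
  // Fails with KeyError if the name is taken, unless overwriting is allowed.
  RegistryStatus AddFunction(const std::string& name, int function_id,
                             bool allow_overwrite = false) {
    if (!allow_overwrite && functions_.count(name) > 0) {
      return RegistryStatus::KeyError;
    }
    functions_[name] = function_id;
    return RegistryStatus::Ok;
  }

  // Registers `target_name` as another name for `source_name`.
  // Fails with KeyError if `source_name` is not registered.
  RegistryStatus AddAlias(const std::string& target_name,
                          const std::string& source_name) {
    const auto it = functions_.find(source_name);
    if (it == functions_.end()) return RegistryStatus::KeyError;
    const int function_id = it->second;
    functions_[target_name] = function_id;
    return RegistryStatus::Ok;
  }

  // Names come back sorted because std::map keeps its keys ordered.
  std::vector<std::string> GetFunctionNames() const {
    std::vector<std::string> names;
    for (const auto& entry : functions_) names.push_back(entry.first);
    return names;
  }

  // In this sketch an alias counts as a registered entry.
  int num_functions() const { return static_cast<int>(functions_.size()); }

 private:
  std::map<std::string, int> functions_;
};
```

Registering "add" twice fails the second time, while aliasing "plus" to "add" succeeds and raises the entry count to two.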
-
FunctionRegistry *
arrow::compute
::
GetFunctionRegistry
()¶ Return the process-global function registry.
Convenience functions
-
Result<Datum>
CallFunction
(const std::string &func_name, const std::vector<Datum> &args, const FunctionOptions *options, ExecContext *ctx = NULLPTR)¶ One-shot invoker for all types of functions.
Does kernel dispatch, argument checking, iteration of ChunkedArray inputs, and wrapping of outputs.
-
Result<Datum>
CallFunction
(const std::string &func_name, const std::vector<Datum> &args, ExecContext *ctx = NULLPTR)¶ Variant of CallFunction which uses a function’s default options.
NB: Some functions require FunctionOptions be provided.
Concrete options classes
-
enum
RoundMode
¶ Rounding and tie-breaking modes for round compute functions.
Additional details and examples are provided in compute.rst.
Values:
-
enumerator
DOWN
¶ Round to the nearest integer less than or equal to the value (aka “floor”)
-
enumerator
UP
¶ Round to the nearest integer greater than or equal to the value (aka “ceil”)
-
enumerator
TOWARDS_ZERO
¶ Get the integral part without fractional digits (aka “trunc”)
-
enumerator
TOWARDS_INFINITY
¶ Round negative values with DOWN rule and positive values with UP rule.
-
enumerator
HALF_DOWN
¶ Round ties with DOWN rule.
-
enumerator
HALF_UP
¶ Round ties with UP rule.
-
enumerator
HALF_TOWARDS_ZERO
¶ Round ties with TOWARDS_ZERO rule.
-
enumerator
HALF_TOWARDS_INFINITY
¶ Round ties with TOWARDS_INFINITY rule.
-
enumerator
HALF_TO_EVEN
¶ Round ties to nearest even integer.
-
enumerator
HALF_TO_ODD
¶ Round ties to nearest odd integer.
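To make the differences between the rounding modes concrete, the following sketch rounds a finite double to an integral value under each documented rule. `RoundModeSketch` and `RoundToInteger` are hypothetical names for this example; the code follows the enumerator descriptions above (DOWN as floor, UP as ceil) and is not Arrow's kernel implementation.

```cpp
#include <cmath>

// Hypothetical mirror of the enumerators documented above.
enum class RoundModeSketch {
  DOWN, UP, TOWARDS_ZERO, TOWARDS_INFINITY,
  HALF_DOWN, HALF_UP, HALF_TOWARDS_ZERO, HALF_TOWARDS_INFINITY,
  HALF_TO_EVEN, HALF_TO_ODD
};

// Rounds a finite `x` to an integral value according to `mode`.
inline double RoundToInteger(double x, RoundModeSketch mode) {
  const double lower = std::floor(x);
  const double upper = std::ceil(x);
  const bool negative = std::signbit(x);
  switch (mode) {
    case RoundModeSketch::DOWN: return lower;
    case RoundModeSketch::UP: return upper;
    case RoundModeSketch::TOWARDS_ZERO: return negative ? upper : lower;
    case RoundModeSketch::TOWARDS_INFINITY: return negative ? lower : upper;
    default: break;  // HALF_* modes are handled below
  }
  // HALF_* modes round to the nearest integer and differ only on exact ties.
  const double frac = x - lower;
  if (frac < 0.5) return lower;  // also covers values that are already integral
  if (frac > 0.5) return upper;
  const bool lower_is_even = std::fmod(lower, 2.0) == 0.0;
  switch (mode) {
    case RoundModeSketch::HALF_DOWN: return lower;
    case RoundModeSketch::HALF_UP: return upper;
    case RoundModeSketch::HALF_TOWARDS_ZERO: return negative ? upper : lower;
    case RoundModeSketch::HALF_TOWARDS_INFINITY: return negative ? lower : upper;
    case RoundModeSketch::HALF_TO_EVEN: return lower_is_even ? lower : upper;
    default: return lower_is_even ? upper : lower;  // HALF_TO_ODD
  }
}
```

For example, -2.5 becomes -3 under HALF_DOWN and HALF_TOWARDS_INFINITY, but -2 under HALF_UP, HALF_TOWARDS_ZERO, and HALF_TO_EVEN.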
-
enum
CompareOperator
¶ Values:
-
enumerator
EQUAL
¶
-
enumerator
NOT_EQUAL
¶
-
enumerator
GREATER
¶
-
enumerator
GREATER_EQUAL
¶
-
enumerator
LESS
¶
-
enumerator
LESS_EQUAL
¶
-
enum
SortOrder
¶ Values:
-
enumerator
Ascending
¶ Arrange values in increasing order.
-
enumerator
Descending
¶ Arrange values in decreasing order.
-
enum
NullPlacement
¶ Values:
-
enumerator
AtStart
¶ Place nulls and NaNs before any non-null values.
NaNs will come after nulls.
-
enumerator
AtEnd
¶ Place nulls and NaNs after any non-null values.
NaNs will come before nulls.
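The ordering rules for NullPlacement can be sketched as a comparator over optional values, where an empty `std::optional` plays the role of a null. The names `NullPlacementSketch`, `NullRank`, and `SortAscending` are hypothetical and exist only in this example (C++17 is assumed for `std::optional`).

```cpp
#include <algorithm>
#include <cmath>
#include <optional>
#include <vector>

// Hypothetical mirror of the documented enumerators.
enum class NullPlacementSketch { AtStart, AtEnd };

// Rank: 0 = ordinary value, 1 = NaN, 2 = null (empty optional).
inline int NullRank(const std::optional<double>& v) {
  if (!v.has_value()) return 2;
  return std::isnan(*v) ? 1 : 0;
}

// Sorts ascending, placing NaNs and nulls according to `placement`.
inline void SortAscending(std::vector<std::optional<double>>* values,
                          NullPlacementSketch placement) {
  std::stable_sort(
      values->begin(), values->end(),
      [placement](const auto& a, const auto& b) {
        const int ra = NullRank(a);
        const int rb = NullRank(b);
        if (ra != rb) {
          // AtEnd:   values < NaN < null.  AtStart: null < NaN < values.
          return placement == NullPlacementSketch::AtEnd ? ra < rb : ra > rb;
        }
        if (ra != 0) return false;  // two NaNs or two nulls compare equal
        return *a < *b;
      });
}
```

Sorting {3, null, NaN, 1} yields {1, 3, NaN, null} with AtEnd and {null, NaN, 1, 3} with AtStart.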
-
class
arrow::compute
::
ScalarAggregateOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_aggregate.h>
Control general scalar aggregate kernel behavior.
By default, null values are ignored (skip_nulls = true).
Public Functions
-
explicit
ScalarAggregateOptions
(bool skip_nulls = true, uint32_t min_count = 1)¶
Public Members
-
bool
skip_nulls
¶ If true (the default), null values are ignored.
Otherwise, if any value is null, emit null.
-
uint32_t
min_count
¶ If fewer than this many non-null values are observed, emit null.
Public Static Functions
-
static inline ScalarAggregateOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "ScalarAggregateOptions"
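The interplay of skip_nulls and min_count in ScalarAggregateOptions can be illustrated with a sum over optional integers, where an empty `std::optional` stands for a null input or a null result. `SumSketch` is a hypothetical helper written for this example (C++17 assumed); it only follows the member descriptions above.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Sums `values`; returns an empty optional where a null would be emitted.
inline std::optional<std::int64_t> SumSketch(
    const std::vector<std::optional<std::int64_t>>& values,
    bool skip_nulls = true, std::uint32_t min_count = 1) {
  std::int64_t sum = 0;
  std::uint32_t valid = 0;
  for (const auto& v : values) {
    if (!v.has_value()) {
      if (!skip_nulls) return std::nullopt;  // any null makes the result null
      continue;                              // otherwise nulls are ignored
    }
    sum += *v;
    ++valid;
  }
  // Too few non-null observations: emit null.
  if (valid < min_count) return std::nullopt;
  return sum;
}
```

With the defaults, {1, null, 3} sums to 4 and an empty input yields null; setting min_count to 0 makes the empty input sum to 0 instead.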
-
class
arrow::compute
::
CountOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_aggregate.h>
Control count aggregate kernel behavior.
By default, only non-null values are counted.
Public Types
Public Functions
-
explicit
CountOptions
(CountMode mode = CountMode::ONLY_VALID)¶
Public Static Functions
-
static inline CountOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "CountOptions"
-
class
arrow::compute
::
ModeOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_aggregate.h>
Control Mode kernel behavior.
Returns the n most common values and their counts. By default, returns the single most common value and its count.
Public Functions
-
explicit
ModeOptions
(int64_t n = 1, bool skip_nulls = true, uint32_t min_count = 0)¶
Public Members
-
int64_t
n
= 1¶
-
bool
skip_nulls
¶ If true (the default), null values are ignored.
Otherwise, if any value is null, emit null.
-
uint32_t
min_count
¶ If fewer than this many non-null values are observed, emit null.
Public Static Functions
-
static inline ModeOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "ModeOptions"
-
class
arrow::compute
::
VarianceOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_aggregate.h>
Control Delta Degrees of Freedom (ddof) of Variance and Stddev kernel.
The divisor used in calculations is N - ddof, where N is the number of elements. By default, ddof is zero, and population variance or stddev is returned.
Public Functions
-
explicit
VarianceOptions
(int ddof = 0, bool skip_nulls = true, uint32_t min_count = 0)¶
Public Members
-
int
ddof
= 0¶
-
bool
skip_nulls
¶ If true (the default), null values are ignored.
Otherwise, if any value is null, emit null.
-
uint32_t
min_count
¶ If fewer than this many non-null values are observed, emit null.
Public Static Functions
-
static inline VarianceOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "VarianceOptions"
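The role of ddof in VarianceOptions (divisor N - ddof) can be checked with a small two-pass computation. `VarianceSketch` and `StddevSketch` are hypothetical helpers for this example; the two-pass algorithm is chosen for readability and is not claimed to be what the Arrow kernels use.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Two-pass variance with divisor N - ddof.
inline double VarianceSketch(const std::vector<double>& values, int ddof = 0) {
  const double n = static_cast<double>(values.size());
  assert(n - ddof > 0 && "need more than ddof values");
  double mean = 0.0;
  for (double v : values) mean += v;
  mean /= n;
  double sum_sq = 0.0;  // sum of squared deviations from the mean
  for (double v : values) sum_sq += (v - mean) * (v - mean);
  return sum_sq / (n - ddof);
}

// Standard deviation is the square root of the variance.
inline double StddevSketch(const std::vector<double>& values, int ddof = 0) {
  return std::sqrt(VarianceSketch(values, ddof));
}
```

For {1, 2, 3, 4} the squared deviations sum to 5, so ddof = 0 gives the population variance 5/4 and ddof = 1 gives the sample variance 5/3.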
-
class
arrow::compute
::
QuantileOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_aggregate.h>
Control Quantile kernel behavior.
By default, returns the median value.
Public Types
Public Functions
-
explicit
QuantileOptions
(double q = 0.5, enum Interpolation interpolation = LINEAR, bool skip_nulls = true, uint32_t min_count = 0)¶
-
explicit
QuantileOptions
(std::vector<double> q, enum Interpolation interpolation = LINEAR, bool skip_nulls = true, uint32_t min_count = 0)¶
Public Members
-
std::vector<double>
q
¶ Quantile(s) to compute; each must be between 0 and 1 inclusive.
-
enum Interpolation
interpolation
¶
-
bool
skip_nulls
¶ If true (the default), null values are ignored.
Otherwise, if any value is null, emit null.
-
uint32_t
min_count
¶ If fewer than this many non-null values are observed, emit null.
Public Static Functions
-
static inline QuantileOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "QuantileOptions"
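The QuantileOptions constructors default to LINEAR interpolation, but the Interpolation enum itself is not described on this page. The sketch below therefore assumes the common definition of linear interpolation between the two nearest order statistics; `QuantileLinearSketch` is a hypothetical helper, not an Arrow function.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Computes the q-quantile of `values` with linear interpolation (assumed
// definition: interpolate between the order statistics around q * (N - 1)).
inline double QuantileLinearSketch(std::vector<double> values, double q) {
  assert(!values.empty());
  assert(q >= 0.0 && q <= 1.0);
  std::sort(values.begin(), values.end());
  const double position = q * static_cast<double>(values.size() - 1);
  const std::size_t lower = static_cast<std::size_t>(std::floor(position));
  const std::size_t upper = std::min(lower + 1, values.size() - 1);
  const double fraction = position - static_cast<double>(lower);
  return values[lower] + (values[upper] - values[lower]) * fraction;
}
```

With q = 0.5 this returns the median, which matches the documented default: for {4, 1, 3, 2} the result is 2.5.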
-
class
arrow::compute
::
TDigestOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_aggregate.h>
Control TDigest approximate quantile kernel behavior.
By default, returns the median value.
Public Functions
-
explicit
TDigestOptions
(double q = 0.5, uint32_t delta = 100, uint32_t buffer_size = 500, bool skip_nulls = true, uint32_t min_count = 0)¶
-
explicit
TDigestOptions
(std::vector<double> q, uint32_t delta = 100, uint32_t buffer_size = 500, bool skip_nulls = true, uint32_t min_count = 0)¶
Public Members
-
std::vector<double>
q
¶ Quantile(s) to compute; each must be between 0 and 1 inclusive.
-
uint32_t
delta
¶ Compression parameter (default 100).
-
uint32_t
buffer_size
¶ Input buffer size (default 500).
-
bool
skip_nulls
¶ If true (the default), null values are ignored.
Otherwise, if any value is null, emit null.
-
uint32_t
min_count
¶ If fewer than this many non-null values are observed, emit null.
Public Static Functions
-
static inline TDigestOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "TDigestOptions"
-
class
arrow::compute
::
IndexOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_aggregate.h>
Control Index kernel behavior.
Public Static Attributes
-
static constexpr const char kTypeName[] = "IndexOptions"
-
class
arrow::compute
::
ArithmeticOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
ArithmeticOptions
(bool check_overflow = false)¶
Public Members
-
bool
check_overflow
¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "ArithmeticOptions"
-
class
arrow::compute
::
ElementWiseAggregateOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
ElementWiseAggregateOptions
(bool skip_nulls = true)¶
Public Members
-
bool
skip_nulls
¶
Public Static Functions
-
static inline ElementWiseAggregateOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "ElementWiseAggregateOptions"
-
class
arrow::compute
::
RoundOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
RoundOptions
(int64_t ndigits = 0, RoundMode round_mode = RoundMode::HALF_TO_EVEN)¶
Public Members
-
int64_t
ndigits
¶ Rounding precision (number of digits to round to)
Public Static Functions
-
static inline RoundOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "RoundOptions"
-
class
arrow::compute
::
RoundToMultipleOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
RoundToMultipleOptions
(double multiple = 1.0, RoundMode round_mode = RoundMode::HALF_TO_EVEN)¶
Public Members
-
std::shared_ptr<Scalar>
multiple
¶ Rounding scale (multiple to round to).
Should be a scalar of a type compatible with the argument to be rounded. For example, rounding a decimal value means a decimal multiple is required. Rounding a floating point or integer value means a floating point scalar is required.
Public Static Functions
-
static inline RoundToMultipleOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "RoundToMultipleOptions"
-
class
arrow::compute
::
JoinOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Options for var_args_join.
Public Types
-
enum
NullHandlingBehavior
¶ How to handle null values. (A null separator always results in a null output.)
Values:
-
enumerator
EMIT_NULL
¶ A null in any input results in a null in the output.
-
enumerator
SKIP
¶ Nulls in inputs are skipped.
-
enumerator
REPLACE
¶ Nulls in inputs are replaced with the replacement string.
Public Functions
-
explicit
JoinOptions
(NullHandlingBehavior null_handling = EMIT_NULL, std::string null_replacement = "")¶
Public Static Functions
-
static inline JoinOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "JoinOptions"
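The three NullHandlingBehavior values of JoinOptions can be compared side by side with a small join routine over optional strings, where an empty `std::optional` stands for null. `NullHandlingSketch` and `JoinSketch` are hypothetical names for this example (C++17 assumed); the logic follows the enumerator descriptions above, including the rule that a null separator always gives a null output.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical mirror of JoinOptions::NullHandlingBehavior.
enum class NullHandlingSketch { EMIT_NULL, SKIP, REPLACE };

// Joins `parts` with `separator`; an empty optional result means null.
inline std::optional<std::string> JoinSketch(
    const std::vector<std::optional<std::string>>& parts,
    const std::optional<std::string>& separator,
    NullHandlingSketch null_handling = NullHandlingSketch::EMIT_NULL,
    const std::string& null_replacement = "") {
  if (!separator.has_value()) return std::nullopt;  // null separator
  std::string out;
  bool first = true;
  for (const auto& part : parts) {
    if (!part.has_value()) {
      if (null_handling == NullHandlingSketch::EMIT_NULL) return std::nullopt;
      if (null_handling == NullHandlingSketch::SKIP) continue;
      // REPLACE falls through and uses null_replacement below.
    }
    if (!first) out += *separator;
    out += part.has_value() ? *part : null_replacement;
    first = false;
  }
  return out;
}
```

Joining {"a", null, "c"} with "-" gives null under EMIT_NULL, "a-c" under SKIP, and "a-?-c" under REPLACE with a replacement of "?".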
-
class
arrow::compute
::
MatchSubstringOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
MatchSubstringOptions
(std::string pattern, bool ignore_case = false)¶
-
MatchSubstringOptions
()¶
Public Members
-
std::string
pattern
¶ The exact substring (or regex, depending on kernel) to look for inside input values.
-
bool
ignore_case
¶ Whether to perform a case-insensitive match.
Public Static Attributes
-
static constexpr const char kTypeName[] = "MatchSubstringOptions"
-
class
arrow::compute
::
SplitOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
SplitOptions
(int64_t max_splits = -1, bool reverse = false)¶
Public Members
-
int64_t
max_splits
¶ Maximum number of splits allowed, or unlimited when -1.
-
bool
reverse
¶ Start splitting from the end of the string (only relevant when max_splits != -1)
Public Static Attributes
-
static constexpr const char kTypeName[] = "SplitOptions"
-
class
arrow::compute
::
SplitPatternOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
SplitPatternOptions
(std::string pattern, int64_t max_splits = -1, bool reverse = false)¶
-
SplitPatternOptions
()¶
Public Members
-
std::string
pattern
¶ The exact substring to split on.
-
int64_t
max_splits
¶ Maximum number of splits allowed, or unlimited when -1.
-
bool
reverse
¶ Start splitting from the end of the string (only relevant when max_splits != -1)
Public Static Attributes
-
static constexpr const char kTypeName[] = "SplitPatternOptions"
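The effect of max_splits and reverse in SplitPatternOptions is easiest to see on a short input. The following sketch splits on an exact, non-empty substring; `SplitPatternSketch` is a hypothetical helper that operates on bytes and only follows the member descriptions above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Splits `input` on the exact substring `pattern`.
// max_splits == -1 means unlimited; `reverse` starts splitting from the end.
inline std::vector<std::string> SplitPatternSketch(const std::string& input,
                                                   const std::string& pattern,
                                                   std::int64_t max_splits = -1,
                                                   bool reverse = false) {
  assert(!pattern.empty());
  std::vector<std::string> parts;
  std::int64_t splits = 0;
  if (!reverse) {
    std::size_t start = 0;
    while (max_splits < 0 || splits < max_splits) {
      const std::size_t pos = input.find(pattern, start);
      if (pos == std::string::npos) break;
      parts.push_back(input.substr(start, pos - start));
      start = pos + pattern.size();
      ++splits;
    }
    parts.push_back(input.substr(start));  // remainder after the last split
    return parts;
  }
  std::size_t end = input.size();  // exclusive end of the unsplit prefix
  while ((max_splits < 0 || splits < max_splits) && end >= pattern.size()) {
    const std::size_t pos = input.rfind(pattern, end - pattern.size());
    if (pos == std::string::npos) break;
    const std::size_t piece_start = pos + pattern.size();
    parts.push_back(input.substr(piece_start, end - piece_start));
    end = pos;
    ++splits;
  }
  parts.push_back(input.substr(0, end));  // remainder before the first split
  std::reverse(parts.begin(), parts.end());
  return parts;
}
```

Splitting "a,b,c" on "," with max_splits = 1 yields {"a", "b,c"} normally and {"a,b", "c"} with reverse = true; with unlimited splits both directions yield {"a", "b", "c"}.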
-
class
arrow::compute
::
ReplaceSliceOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
ReplaceSliceOptions
(int64_t start, int64_t stop, std::string replacement)¶
-
ReplaceSliceOptions
()¶
Public Members
-
int64_t
start
¶ Index to start slicing at.
-
int64_t
stop
¶ Index to stop slicing at.
-
std::string
replacement
¶ String to replace the slice with.
Public Static Attributes
-
static constexpr const char kTypeName[] = "ReplaceSliceOptions"
-
class
arrow::compute
::
ReplaceSubstringOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
ReplaceSubstringOptions
(std::string pattern, std::string replacement, int64_t max_replacements = -1)¶
-
ReplaceSubstringOptions
()¶
Public Members
-
std::string
pattern
¶ Pattern to match: a literal or a regular expression, depending on which kernel is used.
-
std::string
replacement
¶ String to replace the pattern with.
-
int64_t
max_replacements
¶ Max number of substrings to replace (-1 means unbounded)
Public Static Attributes
-
static constexpr const char kTypeName[] = "ReplaceSubstringOptions"
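For the literal-pattern case of ReplaceSubstringOptions, the max_replacements limit can be sketched as a left-to-right scan. `ReplaceSubstringSketch` is a hypothetical helper for this example; it handles literal patterns only (no regular expressions) and assumes a non-empty pattern.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Replaces up to `max_replacements` occurrences of the literal `pattern`
// (scanning left to right); -1 means unbounded.
inline std::string ReplaceSubstringSketch(const std::string& input,
                                          const std::string& pattern,
                                          const std::string& replacement,
                                          std::int64_t max_replacements = -1) {
  assert(!pattern.empty());
  std::string out;
  std::size_t start = 0;
  std::int64_t done = 0;
  while (max_replacements < 0 || done < max_replacements) {
    const std::size_t pos = input.find(pattern, start);
    if (pos == std::string::npos) break;
    out.append(input, start, pos - start);  // text before the match
    out += replacement;
    start = pos + pattern.size();
    ++done;
  }
  out.append(input, start, std::string::npos);  // untouched remainder
  return out;
}
```

Replacing "a" with "b" in "aaa" gives "bbb" when unbounded and "bba" when max_replacements is 2.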
-
class
arrow::compute
::
ExtractRegexOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Members
-
std::string
pattern
¶ Regular expression with named capture fields.
Public Static Attributes
-
static constexpr const char kTypeName[] = "ExtractRegexOptions"
-
class
arrow::compute
::
SetLookupOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Options for IsIn and IndexIn functions.
Public Functions
-
SetLookupOptions
()¶
Public Members
-
bool
skip_nulls
¶ Whether nulls in value_set count for lookup.
If true, any null in value_set is ignored and nulls in the input produce null (IndexIn) or false (IsIn) values in the output. If false, any null in value_set is successfully matched in the input.
Public Static Attributes
-
static constexpr const char kTypeName[] = "SetLookupOptions"
-
class
arrow::compute
::
StrptimeOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
StrptimeOptions
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "StrptimeOptions"
-
class
arrow::compute
::
StrftimeOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
StrftimeOptions
(std::string format, std::string locale = "C")¶
-
StrftimeOptions
()¶
Public Members
-
std::string
format
¶ The desired format string.
-
std::string
locale
¶ The desired output locale string.
Public Static Attributes
-
static constexpr const char kTypeName[] = "StrftimeOptions"
-
static constexpr const char* kDefaultFormat = "%Y-%m-%dT%H:%M:%S"
-
class
arrow::compute
::
PadOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Members
-
int64_t
width
¶ The desired string length.
-
std::string
padding
¶ What to pad the string with. Should be one codepoint (Unicode) or one byte (ASCII).
Public Static Attributes
-
static constexpr const char kTypeName[] = "PadOptions"
-
class
arrow::compute
::
TrimOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Members
-
std::string
characters
¶ The individual characters that can be trimmed from the string.
Public Static Attributes
-
static constexpr const char kTypeName[] = "TrimOptions"
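The `characters` member of TrimOptions is a set of individual characters, not a prefix or suffix string. A minimal byte-wise sketch of trimming both ends is shown below; `TrimSketch` is a hypothetical helper that treats each byte as one character, so it only models the ASCII case.

```cpp
#include <string>

// Removes any leading and trailing bytes that occur in `characters`.
inline std::string TrimSketch(const std::string& input,
                              const std::string& characters) {
  const std::size_t first = input.find_first_not_of(characters);
  if (first == std::string::npos) return "";  // nothing but trimmable bytes
  const std::size_t last = input.find_last_not_of(characters);
  return input.substr(first, last - first + 1);
}
```

With characters = "xy", the input "xxhixyx" is trimmed to "hi": every leading or trailing 'x' or 'y' is removed regardless of order.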
-
class
arrow::compute
::
SliceOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
SliceOptions
(int64_t start, int64_t stop = std::numeric_limits<int64_t>::max(), int64_t step = 1)¶
-
SliceOptions
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "SliceOptions"
-
class
arrow::compute
::
NullOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
NullOptions
(bool nan_is_null = false)¶
Public Members
-
bool
nan_is_null
¶
Public Static Functions
-
static inline NullOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "NullOptions"
-
struct
arrow::compute
::
CompareOptions
¶ - #include <arrow/compute/api_scalar.h>
Public Members
-
enum CompareOperator
op
¶
-
class
arrow::compute
::
MakeStructOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Members
-
std::vector<std::string>
field_names
¶ Names for wrapped columns.
-
std::vector<bool>
field_nullability
¶ Nullability bits for wrapped columns.
-
std::vector<std::shared_ptr<const KeyValueMetadata>>
field_metadata
¶ Metadata attached to wrapped columns.
Public Static Attributes
-
static constexpr const char kTypeName[] = "MakeStructOptions"
-
struct
arrow::compute
::
DayOfWeekOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
DayOfWeekOptions
(bool count_from_zero = true, uint32_t week_start = 1)¶
Public Members
-
bool
count_from_zero
¶ Number days from 0 if true and from 1 if false.
-
uint32_t
week_start
¶ The day the week starts with (Monday=1, Sunday=7).
The numbering is unaffected by the count_from_zero parameter.
Public Static Functions
-
static inline DayOfWeekOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "DayOfWeekOptions"
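One plausible reading of how count_from_zero and week_start in DayOfWeekOptions combine is sketched below. It assumes the input is an ISO weekday number (Monday=1 through Sunday=7); `DayOfWeekSketch` is a hypothetical helper, and the formula is an interpretation of the member descriptions rather than Arrow's code.

```cpp
#include <cassert>
#include <cstdint>

// Maps an ISO weekday (Monday=1 ... Sunday=7) to a day number relative to
// `week_start`, counting from 0 or from 1.
inline int DayOfWeekSketch(int iso_weekday, bool count_from_zero = true,
                           std::uint32_t week_start = 1) {
  assert(iso_weekday >= 1 && iso_weekday <= 7);
  assert(week_start >= 1 && week_start <= 7);
  // Days elapsed since the configured first day of the week (0..6).
  const int offset = (iso_weekday - static_cast<int>(week_start) + 7) % 7;
  return count_from_zero ? offset : offset + 1;
}
```

With the defaults, Monday maps to 0 and Sunday to 6. With week_start = 7 and count_from_zero = false, Sunday maps to 1 and Monday to 2.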
-
struct
arrow::compute
::
AssumeTimezoneOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Used to control timestamp timezone conversion and handling ambiguous/nonexistent times.
Public Types
-
enum
Ambiguous
¶ How to interpret ambiguous local times that can be interpreted as multiple instants (normally two) due to DST shifts.
AMBIGUOUS_EARLIEST emits the earliest instant amongst possible interpretations. AMBIGUOUS_LATEST emits the latest instant amongst possible interpretations.
Values:
-
enumerator
AMBIGUOUS_RAISE
¶
-
enumerator
AMBIGUOUS_EARLIEST
¶
-
enumerator
AMBIGUOUS_LATEST
¶
-
enum
Nonexistent
¶ How to handle local times that do not exist due to DST shifts.
NONEXISTENT_EARLIEST emits the instant “just before” the DST shift instant in the given timestamp precision (for example, for a nanoseconds precision timestamp, this is one nanosecond before the DST shift instant). NONEXISTENT_LATEST emits the DST shift instant.
Values:
-
enumerator
NONEXISTENT_RAISE
¶
-
enumerator
NONEXISTENT_EARLIEST
¶
-
enumerator
NONEXISTENT_LATEST
¶
Public Functions
-
explicit
AssumeTimezoneOptions
(std::string timezone, Ambiguous ambiguous = AMBIGUOUS_RAISE, Nonexistent nonexistent = NONEXISTENT_RAISE)¶
-
AssumeTimezoneOptions
()¶
Public Members
-
std::string
timezone
¶ Timezone to convert timestamps from.
-
Nonexistent
nonexistent
¶ How to interpret non-existent local times (due to DST shifts)
Public Static Attributes
-
static constexpr const char kTypeName[] = "AssumeTimezoneOptions"
-
struct
arrow::compute
::
WeekOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_scalar.h>
Public Functions
-
explicit
WeekOptions
(bool week_starts_monday = true, bool count_from_zero = false, bool first_week_is_fully_in_year = false)¶
Public Members
-
bool
week_starts_monday
¶ The day the week starts with (Monday=true, Sunday=false)
-
bool
count_from_zero
¶ Dates from current year that fall into last ISO week of the previous year return 0 if true and 52 or 53 if false.
-
bool
first_week_is_fully_in_year
¶ Must the first week be fully in January (true), or is a week that begins on December 29, 30, or 31 considered to be the first week of the new year (false)?
Public Static Functions
-
static inline WeekOptions
Defaults
()¶
-
static inline WeekOptions
ISODefaults
()¶
-
static inline WeekOptions
USDefaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "WeekOptions"
-
class
arrow::compute
::
FilterOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_vector.h>
Public Types
Public Functions
-
explicit
FilterOptions
(NullSelectionBehavior null_selection = DROP)¶
Public Members
-
NullSelectionBehavior
null_selection_behavior
= DROP¶
Public Static Functions
-
static inline FilterOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "FilterOptions"
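The null-selection behavior can be sketched with plain standard-library types. This is an illustration, not Arrow's filter kernel: `FilterSketch` and `NullSelection` are hypothetical names, `std::nullopt` stands in for a null slot, and the second policy is modeled on the EMIT_NULL value of NullSelectionBehavior, which this section does not list.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Local stand-in for NullSelectionBehavior (names are hypothetical).
enum class NullSelection { kDrop, kEmitNull };

// Keeps values[i] where mask[i] is true. A null mask slot is either skipped
// (kDrop) or turned into a null output slot (kEmitNull).
std::vector<std::optional<int>> FilterSketch(
    const std::vector<std::optional<int>>& values,
    const std::vector<std::optional<bool>>& mask, NullSelection behavior) {
  std::vector<std::optional<int>> out;
  for (std::size_t i = 0; i < values.size() && i < mask.size(); ++i) {
    if (!mask[i].has_value()) {
      if (behavior == NullSelection::kEmitNull) out.push_back(std::nullopt);
      continue;
    }
    if (*mask[i]) out.push_back(values[i]);
  }
  return out;
}
```

With values {1, 2, 3} and mask {true, null, false}, the drop policy yields {1} and the emit-null policy yields {1, null}.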
-
class
arrow::compute
::
TakeOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_vector.h>
Public Functions
-
explicit
TakeOptions
(bool boundscheck = true)¶
Public Members
-
bool
boundscheck
= true¶
Public Static Functions
-
static inline TakeOptions
BoundsCheck
()¶
-
static inline TakeOptions
NoBoundsCheck
()¶
-
static inline TakeOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "TakeOptions"
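What the boundscheck flag trades off can be shown with a small stdlib-only sketch. `TakeSketch` is a hypothetical helper, not Arrow's take kernel, and `std::nullopt` stands in for the error an out-of-bounds index would produce.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Gathers values[indices[i]]. With boundscheck, every index is validated
// first and std::nullopt is returned on the first invalid one. Without it,
// validation is skipped and the caller must guarantee all indices are valid.
std::optional<std::vector<int>> TakeSketch(const std::vector<int>& values,
                                           const std::vector<int64_t>& indices,
                                           bool boundscheck) {
  if (boundscheck) {
    for (int64_t index : indices) {
      if (index < 0 || index >= static_cast<int64_t>(values.size())) {
        return std::nullopt;
      }
    }
  }
  std::vector<int> out;
  out.reserve(indices.size());
  for (int64_t index : indices) {
    out.push_back(values[static_cast<std::size_t>(index)]);
  }
  return out;
}
```

Skipping the check saves one pass over the indices, which is only safe when the indices are known to be valid, for instance because another kernel produced them.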
-
class
arrow::compute
::
DictionaryEncodeOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_vector.h>
Options for the dictionary encode function.
Public Types
Public Functions
-
explicit
DictionaryEncodeOptions
(NullEncodingBehavior null_encoding = MASK)¶
Public Members
-
NullEncodingBehavior
null_encoding_behavior
= MASK¶
Public Static Functions
-
static inline DictionaryEncodeOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "DictionaryEncodeOptions"
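The sketch below shows one way the null-encoding choice can change a dictionary-encoded result. It is a stdlib-only illustration, not Arrow's kernel: `DictionaryEncodeSketch`, `DictionaryEncoded`, and `NullEncoding` are hypothetical names, and the kEncode policy is modeled on the ENCODE value of NullEncodingBehavior, which this section does not list.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Local stand-in for NullEncodingBehavior (names are hypothetical).
enum class NullEncoding { kEncode, kMask };

struct DictionaryEncoded {
  std::vector<std::optional<std::string>> dictionary;  // first-seen order
  std::vector<std::optional<int32_t>> indices;         // one per input slot
};

// kMask: a null input stays null in the indices and gets no dictionary entry.
// kEncode: null is treated as a value with its own dictionary entry.
DictionaryEncoded DictionaryEncodeSketch(
    const std::vector<std::optional<std::string>>& input,
    NullEncoding behavior) {
  DictionaryEncoded result;
  std::map<std::optional<std::string>, int32_t> seen;
  for (const auto& value : input) {
    if (!value.has_value() && behavior == NullEncoding::kMask) {
      result.indices.push_back(std::nullopt);
      continue;
    }
    auto it = seen.find(value);
    if (it == seen.end()) {
      it = seen.emplace(value, static_cast<int32_t>(result.dictionary.size()))
               .first;
      result.dictionary.push_back(value);
    }
    result.indices.push_back(it->second);
  }
  return result;
}
```

For the input {"a", null, "a", "b"}, masking gives dictionary {"a", "b"} with indices {0, null, 0, 1}, while encoding gives dictionary {"a", null, "b"} with indices {0, 1, 0, 2}.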
-
class
arrow::compute
::
SortKey
: public arrow::util::EqualityComparable<SortKey>¶ - #include <arrow/compute/api_vector.h>
One sort key for PartitionNthIndices (TODO) and SortIndices.
Public Functions
-
std::string
ToString
() const¶
-
class
arrow::compute
::
ArraySortOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_vector.h>
Public Functions
-
explicit
ArraySortOptions
(SortOrder order = SortOrder::Ascending, NullPlacement null_placement = NullPlacement::AtEnd)¶
Public Members
-
NullPlacement
null_placement
¶ Whether nulls and NaNs are placed at the start or at the end.
Public Static Functions
-
static inline ArraySortOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "ArraySortOptions"
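How sort order and null placement combine can be sketched with the standard library alone. This is not Arrow's sort kernel: `SortIndicesSketch`, `Order`, and `Placement` are hypothetical names, `std::nullopt` stands in for a null slot, and NaN handling is left out because the example uses integers.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <optional>
#include <vector>

// Local stand-ins for SortOrder and NullPlacement (names are hypothetical).
enum class Order { kAscending, kDescending };
enum class Placement { kAtStart, kAtEnd };

// Returns the indices that would sort `values`. Nulls are grouped at the
// start or the end regardless of the sort order of the non-null values.
std::vector<std::size_t> SortIndicesSketch(
    const std::vector<std::optional<int>>& values, Order order,
    Placement null_placement) {
  std::vector<std::size_t> indices(values.size());
  std::iota(indices.begin(), indices.end(), std::size_t{0});
  std::stable_sort(
      indices.begin(), indices.end(), [&](std::size_t a, std::size_t b) {
        const auto& lhs = values[a];
        const auto& rhs = values[b];
        if (lhs.has_value() != rhs.has_value()) {
          // Exactly one side is null: placement alone decides.
          return null_placement == Placement::kAtStart ? !lhs.has_value()
                                                       : lhs.has_value();
        }
        if (!lhs.has_value()) return false;  // both null: keep input order
        return order == Order::kAscending ? *lhs < *rhs : *rhs < *lhs;
      });
  return indices;
}
```

For values {3, null, 1}, ascending order with nulls at the end gives indices {2, 0, 1}; descending order with nulls at the start gives {1, 0, 2}.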
-
class
arrow::compute
::
SortOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_vector.h>
Public Functions
-
explicit
SortOptions
(std::vector<SortKey> sort_keys = {}, NullPlacement null_placement = NullPlacement::AtEnd)¶
Public Members
-
NullPlacement
null_placement
¶ Whether nulls and NaNs are placed at the start or at the end.
Public Static Functions
-
static inline SortOptions
Defaults
()¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "SortOptions"
-
class
arrow::compute
::
SelectKOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_vector.h>
SelectK options.
Public Members
-
int64_t
k
¶ The number of elements (k) to keep.
Public Static Functions
-
static inline SelectKOptions
Defaults
()¶
-
static inline SelectKOptions
TopKDefault
(int64_t k, std::vector<std::string> key_names = {})¶
-
static inline SelectKOptions
BottomKDefault
(int64_t k, std::vector<std::string> key_names = {})¶
Public Static Attributes
-
static constexpr const char kTypeName[] = "SelectKOptions"
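Keeping only k elements can be done without a full sort. The sketch below uses `std::partial_sort` on a single column of integers and assumes that a top-k selection keeps the largest values and a bottom-k selection the smallest, as the names TopKDefault and BottomKDefault suggest. `SelectKSketch` is a hypothetical helper, not Arrow's implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Returns the k largest (descending = true) or k smallest (descending =
// false) values in sorted order. k is clamped to [0, values.size()].
std::vector<int> SelectKSketch(std::vector<int> values, int64_t k,
                               bool descending) {
  const std::size_t keep = std::min<std::size_t>(
      values.size(), static_cast<std::size_t>(std::max<int64_t>(k, 0)));
  const auto middle = values.begin() + static_cast<std::ptrdiff_t>(keep);
  if (descending) {
    std::partial_sort(values.begin(), middle, values.end(),
                      std::greater<int>());
  } else {
    std::partial_sort(values.begin(), middle, values.end());
  }
  values.resize(keep);
  return values;
}
```

For the input {5, 1, 4, 2} and k = 2, the descending variant returns {5, 4} and the ascending variant returns {1, 2}.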
-
class
arrow::compute
::
PartitionNthOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/api_vector.h>
Partitioning options for NthToIndices.
Public Functions
-
explicit
PartitionNthOptions
(int64_t pivot, NullPlacement null_placement = NullPlacement::AtEnd)¶
-
inline
PartitionNthOptions
()¶
Public Members
-
int64_t
pivot
¶ The index into the equivalent sorted array of the partition pivot element.
-
NullPlacement
null_placement
¶ Whether nulls and NaNs are partitioned at the start or at the end.
Public Static Attributes
-
static constexpr const char kTypeName[] = "PartitionNthOptions"
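The meaning of pivot as "the index into the equivalent sorted array" can be illustrated with `std::nth_element` applied to an index vector. This is a stdlib-only sketch for non-null integers. `NthToIndicesSketch` is a hypothetical helper, and null placement is not modeled.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Returns indices such that values[indices[pivot]] is the element a full
// sort would place at position `pivot`; everything before it compares
// less-or-equal and everything after it greater-or-equal. An out-of-range
// pivot leaves the indices in their original order.
std::vector<std::size_t> NthToIndicesSketch(const std::vector<int>& values,
                                            int64_t pivot) {
  std::vector<std::size_t> indices(values.size());
  std::iota(indices.begin(), indices.end(), std::size_t{0});
  if (pivot < 0 || pivot >= static_cast<int64_t>(values.size())) {
    return indices;
  }
  std::nth_element(
      indices.begin(), indices.begin() + static_cast<std::ptrdiff_t>(pivot),
      indices.end(),
      [&](std::size_t a, std::size_t b) { return values[a] < values[b]; });
  return indices;
}
```

The two partitions around the pivot are not themselves sorted, which is what makes this cheaper than a full sort.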
-
class
arrow::compute
::
CastOptions
: public arrow::compute::FunctionOptions¶ - #include <arrow/compute/cast.h>
Public Functions
-
explicit
CastOptions
(bool safe = true)¶
Public Members
-
bool
allow_int_overflow
¶
-
bool
allow_time_truncate
¶
-
bool
allow_time_overflow
¶
-
bool
allow_decimal_truncate
¶
-
bool
allow_float_truncate
¶
-
bool
allow_invalid_utf8
¶
Public Static Functions
Public Static Attributes
-
static constexpr const char kTypeName[] = "CastOptions"
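As an illustration of what a flag such as allow_int_overflow controls, here is a stdlib-only sketch of a checked narrowing cast. It assumes that a safe cast rejects values that do not fit the target type. `CastInt64ToInt32Sketch` is a hypothetical helper, and `std::nullopt` stands in for the error a real cast would report.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Narrows an int64_t to int32_t. When allow_int_overflow is false, values
// outside the int32_t range are rejected. When it is true, the value is
// narrowed regardless (modular since C++20, implementation-defined before).
std::optional<int32_t> CastInt64ToInt32Sketch(int64_t value,
                                              bool allow_int_overflow) {
  const bool in_range = value >= std::numeric_limits<int32_t>::min() &&
                        value <= std::numeric_limits<int32_t>::max();
  if (!in_range && !allow_int_overflow) return std::nullopt;
  return static_cast<int32_t>(value);
}
```

The other allow_* members can be read the same way: each one relaxes a single class of check (time truncation, float truncation, invalid UTF-8, and so on) that would otherwise make the cast fail.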