Compute Functions

Aggregations

count(array, *[, memory_pool, options, …])

Count the number of null / non-null values.

index(data, value[, start, end, memory_pool])

Find the index of the first occurrence of a given value.

mean(array, *[, memory_pool, options, …])

Compute the mean of a numeric array.

min_max(array, *[, memory_pool, options, …])

Compute the minimum and maximum values of a numeric array.

mode(array[, n])

Return the top-n most common values and the number of times they occur in a numeric (chunked) array, in descending order of occurrence.

stddev(array, *[, memory_pool, options, ddof])

Calculate the standard deviation of a numeric array.

sum(array, *[, memory_pool, options, …])

Compute the sum of a numeric array.

variance(array, *[, memory_pool, options, ddof])

Calculate the variance of a numeric array.

Arithmetic Functions

By default these functions do not detect overflow. Most of them are also available in an overflow-checking variant, suffixed _checked, which raises an ArrowInvalid exception when overflow is detected.

abs(x, *[, memory_pool])

Calculate the absolute value of the argument element-wise.

abs_checked(x, *[, memory_pool])

Calculate the absolute value of the argument element-wise.

add(x, y, *[, memory_pool])

Add the arguments element-wise.

add_checked(x, y, *[, memory_pool])

Add the arguments element-wise.

divide(dividend, divisor, *[, memory_pool])

Divide the arguments element-wise.

divide_checked(dividend, divisor, *[, …])

Divide the arguments element-wise.

multiply(x, y, *[, memory_pool])

Multiply the arguments element-wise.

multiply_checked(x, y, *[, memory_pool])

Multiply the arguments element-wise.

power(base, exponent, *[, memory_pool])

Raise arguments to power element-wise.

power_checked(base, exponent, *[, memory_pool])

Raise arguments to power element-wise.

shift_left(x, y, *[, memory_pool])

Left shift x by y.

shift_left_checked(x, y, *[, memory_pool])

Left shift x by y with invalid shift check.

shift_right(x, y, *[, memory_pool])

Right shift x by y.

shift_right_checked(x, y, *[, memory_pool])

Right shift x by y with invalid shift check.

sign(x, *[, memory_pool])

Get the signedness of the arguments element-wise.

subtract(x, y, *[, memory_pool])

Subtract the arguments element-wise.

subtract_checked(x, y, *[, memory_pool])

Subtract the arguments element-wise.

Bit-wise operations do not offer (or need) a checked variant.

bit_wise_and(x, y, *[, memory_pool])

Bit-wise AND the arguments element-wise.

bit_wise_not(x, *[, memory_pool])

Bit-wise negate the arguments element-wise.

bit_wise_or(x, y, *[, memory_pool])

Bit-wise OR the arguments element-wise.

bit_wise_xor(x, y, *[, memory_pool])

Bit-wise XOR the arguments element-wise.

Rounding Functions

Rounding functions convert a numeric input into an approximate value with a simpler representation based on the rounding strategy.

ceil(x, *[, memory_pool])

Round up to the nearest integer.

floor(x, *[, memory_pool])

Round down to the nearest integer.

trunc(x, *[, memory_pool])

Get the integral part without fractional digits.

Logarithmic Functions

Logarithmic functions are also supported and offer _checked variants that detect domain errors.

ln(x, *[, memory_pool])

Compute natural log of arguments element-wise.

ln_checked(x, *[, memory_pool])

Compute natural log of arguments element-wise.

log10(x, *[, memory_pool])

Compute log base 10 of arguments element-wise.

log10_checked(x, *[, memory_pool])

Compute log base 10 of arguments element-wise.

log1p(x, *[, memory_pool])

Compute natural log of (1+x) element-wise.

log1p_checked(x, *[, memory_pool])

Compute natural log of (1+x) element-wise.

log2(x, *[, memory_pool])

Compute log base 2 of arguments element-wise.

log2_checked(x, *[, memory_pool])

Compute log base 2 of arguments element-wise.

Trigonometric Functions

Trigonometric functions are also supported and, where appropriate, offer _checked variants that detect domain errors.

acos(x, *[, memory_pool])

Compute the inverse cosine of the arguments element-wise.

acos_checked(x, *[, memory_pool])

Compute the inverse cosine of the arguments element-wise.

asin(x, *[, memory_pool])

Compute the inverse sine of the arguments element-wise.

asin_checked(x, *[, memory_pool])

Compute the inverse sine of the arguments element-wise.

atan(x, *[, memory_pool])

Compute the principal value of the inverse tangent.

atan2(y, x, *[, memory_pool])

Compute the inverse tangent using argument signs to determine the quadrant.

cos(x, *[, memory_pool])

Compute the cosine of the arguments element-wise.

cos_checked(x, *[, memory_pool])

Compute the cosine of the arguments element-wise.

sin(x, *[, memory_pool])

Compute the sine of the arguments element-wise.

sin_checked(x, *[, memory_pool])

Compute the sine of the arguments element-wise.

tan(x, *[, memory_pool])

Compute the tangent of the arguments element-wise.

tan_checked(x, *[, memory_pool])

Compute the tangent of the arguments element-wise.

Comparisons

These functions expect two inputs of the same type. If either input is null, they return null.

equal(x, y, *[, memory_pool])

Compare values for equality (x == y).

greater(x, y, *[, memory_pool])

Compare values for ordered inequality (x > y).

greater_equal(x, y, *[, memory_pool])

Compare values for ordered inequality (x >= y).

less(x, y, *[, memory_pool])

Compare values for ordered inequality (x < y).

less_equal(x, y, *[, memory_pool])

Compare values for ordered inequality (x <= y).

not_equal(x, y, *[, memory_pool])

Compare values for inequality (x != y).

These functions take any number of arguments of a numeric or temporal type.

max_element_wise(*args[, memory_pool, …])

Find the element-wise maximum value.

min_element_wise(*args[, memory_pool, …])

Find the element-wise minimum value.

Logical Functions

These functions normally emit a null when one of the inputs is null. However, Kleene logic variants are provided (suffixed _kleene). See User Guide for details.

and_(x, y, *[, memory_pool])

Logical ‘and’ boolean values.

and_kleene(x, y, *[, memory_pool])

Logical ‘and’ boolean values (Kleene logic).

all(array, *[, memory_pool, options, …])

Test whether all elements in a boolean array evaluate to true.

any(array, *[, memory_pool, options, …])

Test whether any element in a boolean array evaluates to true.

invert(values, *[, memory_pool])

Invert boolean values.

or_(x, y, *[, memory_pool])

Logical ‘or’ boolean values.

or_kleene(x, y, *[, memory_pool])

Logical ‘or’ boolean values (Kleene logic).

xor(x, y, *[, memory_pool])

Logical ‘xor’ boolean values.

String Predicates

In these functions an empty string emits false in the output. For ASCII variants (prefixed ascii_) a string element with non-ASCII characters emits false in the output.

The first set of functions emits true if the input contains only characters of a given class.

ascii_is_alnum(strings, *[, memory_pool])

Classify strings as ASCII alphanumeric.

ascii_is_alpha(strings, *[, memory_pool])

Classify strings as ASCII alphabetic.

ascii_is_decimal(strings, *[, memory_pool])

Classify strings as ASCII decimal.

ascii_is_lower(strings, *[, memory_pool])

Classify strings as ASCII lowercase.

ascii_is_printable(strings, *[, memory_pool])

Classify strings as ASCII printable.

ascii_is_space(strings, *[, memory_pool])

Classify strings as ASCII whitespace.

ascii_is_upper(strings, *[, memory_pool])

Classify strings as ASCII uppercase.

utf8_is_alnum(strings, *[, memory_pool])

Classify strings as alphanumeric.

utf8_is_alpha(strings, *[, memory_pool])

Classify strings as alphabetic.

utf8_is_decimal(strings, *[, memory_pool])

Classify strings as decimal.

utf8_is_digit(strings, *[, memory_pool])

Classify strings as digits.

utf8_is_lower(strings, *[, memory_pool])

Classify strings as lowercase.

utf8_is_numeric(strings, *[, memory_pool])

Classify strings as numeric.

utf8_is_printable(strings, *[, memory_pool])

Classify strings as printable.

utf8_is_space(strings, *[, memory_pool])

Classify strings as whitespace.

utf8_is_upper(strings, *[, memory_pool])

Classify strings as uppercase.

The second set of functions also considers the order of characters in the string element.

ascii_is_title(strings, *[, memory_pool])

Classify strings as ASCII titlecase.

utf8_is_title(strings, *[, memory_pool])

Classify strings as titlecase.

The third set of functions examines string elements on a byte-by-byte basis.

string_is_ascii(strings, *[, memory_pool])

Classify strings as ASCII.

String Splitting

split_pattern(strings, *[, memory_pool, …])

Split string according to separator.

split_pattern_regex(strings, *[, …])

Split string according to regex pattern.

ascii_split_whitespace(strings, *[, …])

Split string according to any ASCII whitespace.

utf8_split_whitespace(strings, *[, …])

Split string according to any Unicode whitespace.

String Component Extraction

extract_regex(strings, *[, memory_pool, options])

Extract substrings captured by a regex pattern.

String Joining

binary_join(list, separator, *[, memory_pool])

Join a list of strings together with a separator to form a single string.

binary_join_element_wise(*strings[, …])

Join string arguments into one, using the last argument as the separator.

String Transforms

ascii_center(strings, *[, memory_pool, …])

For each string in strings, emit a centered string by padding both sides with the given UTF8 codeunit.

ascii_lpad(strings, *[, memory_pool, …])

For each string in strings, emit a right-aligned string by prepending the given UTF8 codeunit.

ascii_ltrim(strings, *[, memory_pool, options])

Trim leading characters present in the characters argument.

ascii_ltrim_whitespace(strings, *[, memory_pool])

Trim leading ASCII whitespace characters.

ascii_lower(strings, *[, memory_pool])

Transform ASCII input to lowercase.

ascii_reverse(strings, *[, memory_pool])

Reverse ASCII input.

ascii_rpad(strings, *[, memory_pool, …])

For each string in strings, emit a left-aligned string by appending the given UTF8 codeunit.

ascii_rtrim(strings, *[, memory_pool, options])

Trim trailing characters present in the characters argument.

ascii_rtrim_whitespace(strings, *[, memory_pool])

Trim trailing ASCII whitespace characters.

ascii_trim(strings, *[, memory_pool, options])

Trim leading and trailing characters present in the characters argument.

ascii_upper(strings, *[, memory_pool])

Transform ASCII input to uppercase.

binary_length(strings, *[, memory_pool])

Compute string lengths.

binary_replace_slice(strings, *[, …])

Replace a slice of a binary string with replacement.

replace_substring(strings, *[, memory_pool, …])

Replace non-overlapping substrings that match pattern by replacement.

replace_substring_regex(strings, *[, …])

Replace non-overlapping substrings that match regex pattern by replacement.

utf8_center(strings, *[, memory_pool, …])

Center strings by padding with a given character.

utf8_length(strings, *[, memory_pool])

Compute UTF8 string lengths.

utf8_lower(strings, *[, memory_pool])

Transform input to lowercase.

utf8_lpad(strings, *[, memory_pool, …])

Right-align strings by padding with a given character.

utf8_ltrim(strings, *[, memory_pool, options])

Trim leading characters present in the characters argument.

utf8_ltrim_whitespace(strings, *[, memory_pool])

Trim leading whitespace characters.

utf8_replace_slice(strings, *[, …])

Replace a slice of a string with replacement.

utf8_reverse(strings, *[, memory_pool])

Reverse UTF8 input.

utf8_rpad(strings, *[, memory_pool, …])

Left-align strings by padding with a given character.

utf8_rtrim(strings, *[, memory_pool, options])

Trim trailing characters present in the characters argument.

utf8_rtrim_whitespace(strings, *[, memory_pool])

Trim trailing whitespace characters.

utf8_trim(strings, *[, memory_pool, options])

Trim leading and trailing characters present in the characters argument.

utf8_upper(strings, *[, memory_pool])

Transform input to uppercase.

Containment tests

count_substring(array, pattern, *[, ignore_case])

Count the occurrences of substring pattern in each value of a string array.

count_substring_regex(array, pattern, *[, …])

Count the non-overlapping matches of regex pattern in each value of a string array.

ends_with(strings, *[, memory_pool, …])

Check whether strings end with a literal pattern.

find_substring(array, pattern, *[, ignore_case])

Find the index of the first occurrence of substring pattern in each value of a string array.

find_substring_regex(array, pattern, *[, …])

Find the index of the first match of regex pattern in each value of a string array.

index_in(values, *[, memory_pool, options, …])

Return index of each element in a set of values.

is_in(values, *[, memory_pool, options, …])

Find each element in a set of values.

match_like(array, pattern, *[, ignore_case])

Test if the SQL-style LIKE pattern given by pattern matches each value of a string array.

match_substring(array, pattern, *[, ignore_case])

Test if substring pattern is contained within a value of a string array.

match_substring_regex(array, pattern, *[, …])

Test if regex pattern matches at any position in each value of a string array.

starts_with(strings, *[, memory_pool, …])

Check whether strings start with a literal pattern.

Conversions

cast(arr, target_type[, safe])

Cast array values to another data type.

strptime(strings, *[, memory_pool, options])

Parse timestamps.

Replacements

replace_with_mask(values, mask, replacements, *)

Replace items using a mask and replacement values.

Selections

filter(data, mask[, null_selection_behavior])

Select values (or records) from array- or table-like data given boolean filter, where true values are selected.

take(data, indices, *[, boundscheck, …])

Select values (or records) from array- or table-like data given integer selection indices.

Associative transforms

dictionary_encode(array, *[, memory_pool, …])

Dictionary-encode array.

unique(array, *[, memory_pool])

Compute unique elements.

value_counts(array, *[, memory_pool])

Compute counts of unique elements.

Sorts and partitions

partition_nth_indices(array, *[, …])

Return the indices that would partition an array around a pivot.

sort_indices(input, *[, memory_pool, …])

Return the indices that would sort an array, record batch or table.

Structural Transforms

binary_length(strings, *[, memory_pool])

Compute string lengths.

case_when(cond, *cases[, memory_pool])

Choose values based on multiple conditions.

coalesce(*values[, memory_pool])

Select the first non-null value in each slot.

fill_null(values, fill_value)

Replace each null element in values with fill_value.

if_else(cond, left, right, *[, memory_pool])

Choose values based on a condition.

is_finite(values, *[, memory_pool])

Return true if value is finite.

is_inf(values, *[, memory_pool])

Return true if infinity.

is_nan(values, *[, memory_pool])

Return true if NaN.

is_null(values, *[, memory_pool])

Return true if null.

is_valid(values, *[, memory_pool])

Return true if non-null.

list_value_length(lists, *[, memory_pool])

Compute list lengths.

list_flatten(lists, *[, memory_pool])

Flatten list values.

list_parent_indices(lists, *[, memory_pool])

Compute parent indices of nested list values.