Scalar Functions¶
Math Functions¶
abs
¶
Returns the absolute value of a number.
abs(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
acos
¶
Returns the arc cosine or inverse cosine of a number.
acos(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
acosh
¶
Returns the area hyperbolic cosine or inverse hyperbolic cosine of a number.
acosh(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
asin
¶
Returns the arc sine or inverse sine of a number.
asin(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
asinh
¶
Returns the area hyperbolic sine or inverse hyperbolic sine of a number.
asinh(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
atan
¶
Returns the arc tangent or inverse tangent of a number.
atan(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
atanh
¶
Returns the area hyperbolic tangent or inverse hyperbolic tangent of a number.
atanh(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
atan2
¶
Returns the arc tangent or inverse tangent of expression_y / expression_x
.
atan2(expression_y, expression_x)
Arguments¶
expression_y: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
expression_x: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
cbrt
¶
Returns the cube root of a number.
cbrt(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
ceil
¶
Returns the nearest integer greater than or equal to a number.
ceil(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
cos
¶
Returns the cosine of a number.
cos(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
cosh
¶
Returns the hyperbolic cosine of a number.
cosh(numeric_expression)
degrees
¶
Converts radians to degrees.
degrees(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
exp
¶
Returns the base-e exponential of a number.
exp(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to use as the exponent. Can be a constant, column, or function, and any combination of arithmetic operators.
factorial
¶
Factorial. Returns 1 if value is less than 2.
factorial(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
floor
¶
Returns the nearest integer less than or equal to a number.
floor(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
gcd
¶
Returns the greatest common divisor of expression_x
and expression_y
. Returns 0 if both inputs are zero.
gcd(expression_x, expression_y)
Arguments¶
expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
isnan
¶
Returns true if a given number is +NaN or -NaN otherwise returns false.
isnan(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
iszero
¶
Returns true if a given number is +0.0 or -0.0 otherwise returns false.
iszero(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
lcm
¶
Returns the least common multiple of expression_x
and expression_y
. Returns 0 if either input is zero.
lcm(expression_x, expression_y)
Arguments¶
expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
ln
¶
Returns the natural logarithm of a number.
ln(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
log
¶
Returns the base-x logarithm of a number. Can either provide a specified base, or if omitted then takes the base-10 of a number.
log(base, numeric_expression)
log(numeric_expression)
Arguments¶
base: Base numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
log10
¶
Returns the base-10 logarithm of a number.
log10(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
log2
¶
Returns the base-2 logarithm of a number.
log2(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
nanvl
¶
Returns the first argument if it’s not NaN. Returns the second argument otherwise.
nanvl(expression_x, expression_y)
Arguments¶
expression_x: Numeric expression to return if it’s not NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
expression_y: Numeric expression to return if the first expression is NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
pi
¶
Returns an approximate value of π.
pi()
power
¶
Returns a base expression raised to the power of an exponent.
power(base, exponent)
Arguments¶
base: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
exponent: Exponent numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
Aliases¶
pow
pow
¶
Alias of power.
radians
¶
Converts degrees to radians.
radians(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
random
¶
Returns a random float value in the range [0, 1). The random seed is unique to each row.
random()
round
¶
Rounds a number to the nearest integer.
round(numeric_expression[, decimal_places])
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
decimal_places: Optional. The number of decimal places to round to. Defaults to 0.
signum
¶
Returns the sign of a number.
Negative numbers return -1
.
Zero and positive numbers return 1
.
signum(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
sin
¶
Returns the sine of a number.
sin(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
sinh
¶
Returns the hyperbolic sine of a number.
sinh(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
sqrt
¶
Returns the square root of a number.
sqrt(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
tan
¶
Returns the tangent of a number.
tan(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
tanh
¶
Returns the hyperbolic tangent of a number.
tanh(numeric_expression)
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
trunc
¶
Truncates a number to a whole number or truncated to the specified decimal places.
trunc(numeric_expression[, decimal_places])
Arguments¶
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
decimal_places: Optional. The number of decimal places to truncate to. Defaults to 0 (truncate to a whole number). If
decimal_places
is a positive integer, truncates digits to the right of the decimal point. Ifdecimal_places
is a negative integer, replaces digits to the left of the decimal point with0
.
Conditional Functions¶
coalesce
¶
Returns the first of its arguments that is not null. Returns null if all arguments are null. This function is often used to substitute a default value for null values.
coalesce(expression1[, ..., expression_n])
Arguments¶
expression1, expression_n: Expression to use if previous expressions are null. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.
nullif
¶
Returns null if expression1 equals expression2; otherwise it returns expression1.
This can be used to perform the inverse operation of coalesce
.
nullif(expression1, expression2)
Arguments¶
expression1: Expression to compare and return if equal to expression2. Can be a constant, column, or function, and any combination of arithmetic operators.
expression2: Expression to compare to expression1. Can be a constant, column, or function, and any combination of arithmetic operators.
String Functions¶
ascii
¶
Returns the ASCII value of the first character in a string.
ascii(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: chr
bit_length
¶
Returns the bit length of a string.
bit_length(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: length, octet_length
btrim
¶
Trims the specified trim string from the start and end of a string. If no trim string is provided, all whitespace is removed from the start and end of the input string.
btrim(str[, trim_str])
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
trim_str: String expression to trim from the beginning and end of the input string. Can be a constant, column, or function, and any combination of arithmetic operators. Default is whitespace characters.
char_length
¶
Alias of length.
character_length
¶
Alias of length.
concat
¶
Concatenates multiple strings together.
concat(str[, ..., str_n])
Arguments¶
str: String expression to concatenate. Can be a constant, column, or function, and any combination of string operators.
str_n: Subsequent string column or literal string to concatenate.
Related functions: contcat_ws
concat_ws
¶
Concatenates multiple strings together with a specified separator.
concat(separator, str[, ..., str_n])
Arguments¶
separator: Separator to insert between concatenated strings.
str: String expression to concatenate. Can be a constant, column, or function, and any combination of string operators.
str_n: Subsequent string column or literal string to concatenate.
Related functions: concat
chr
¶
Returns the character with the specified ASCII or Unicode code value.
chr(expression)
Arguments¶
expression: Expression containing the ASCII or Unicode code value to operate on. Can be a constant, column, or function, and any combination of arithmetic or string operators.
Related functions: ascii
initcap
¶
Capitalizes the first character in each word in the input string. Words are delimited by non-alphanumeric characters.
initcap(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
left
¶
Returns a specified number of characters from the left side of a string.
left(str, n)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
n: Number of characters to return.
Related functions: right
length
¶
Returns the number of characters in a string.
length(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Aliases¶
char_length
character_length
Related functions: bit_length, octet_length
lower
¶
Converts a string to lower-case.
lower(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
lpad
¶
Pads the left side of a string with another string to a specified string length.
lpad(str, n[, padding_str])
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
n: String length to pad to.
padding_str: String expression to pad with. Can be a constant, column, or function, and any combination of string operators. Default is a space.
Related functions: rpad
ltrim
¶
Removes leading spaces from a string.
ltrim(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
octet_length
¶
Returns the length of a string in bytes.
octet_length(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: bit_length, length
repeat
¶
Returns a string with an input string repeated a specified number.
repeat(str, n)
Arguments¶
str: String expression to repeat. Can be a constant, column, or function, and any combination of string operators.
n: Number of times to repeat the input string.
replace
¶
Replaces all occurrences of a specified substring in a string with a new substring.
replace(str, substr, replacement)
Arguments¶
str: String expression to repeat. Can be a constant, column, or function, and any combination of string operators.
substr: Substring expression to replace in the input string. Can be a constant, column, or function, and any combination of string operators.
replacement: Replacement substring expression. Can be a constant, column, or function, and any combination of string operators.
reverse
¶
Reverses the character order of a string.
reverse(str)
Arguments¶
str: String expression to repeat. Can be a constant, column, or function, and any combination of string operators.
right
¶
Returns a specified number of characters from the right side of a string.
right(str, n)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
n: Number of characters to return.
Related functions: left
rpad
¶
Pads the right side of a string with another string to a specified string length.
rpad(str, n[, padding_str])
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
n: String length to pad to.
padding_str: String expression to pad with. Can be a constant, column, or function, and any combination of string operators. Default is a space.
Related functions: lpad
rtrim
¶
Removes trailing spaces from a string.
rtrim(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
split_part
¶
Splits a string based on a specified delimiter and returns the substring in the specified position.
split_part(str, delimiter, pos)
Arguments¶
str: String expression to spit. Can be a constant, column, or function, and any combination of string operators.
delimiter: String or character to split on.
pos: Position of the part to return.
starts_with
¶
Tests if a string starts with a substring.
starts_with(str, substr)
Arguments¶
str: String expression to test. Can be a constant, column, or function, and any combination of string operators.
substr: Substring to test for.
strpos
¶
Returns the starting position of a specified substring in a string. Positions begin at 1. If the substring does not exist in the string, the function returns 0.
strpos(str, substr)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
substr: Substring expression to search for. Can be a constant, column, or function, and any combination of string operators.
substr
¶
Extracts a substring of a specified number of characters from a specific starting position in a string.
substr(str, start_pos[, length])
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
start_pos: Character position to start the substring at. The first character in the string has a position of 1.
length: Number of characters to extract. If not specified, returns the rest of the string after the start position.
translate
¶
Translates characters in a string to specified translation characters.
translate(str, chars, translation)
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
chars: Characters to translate.
translation: Translation characters. Translation characters replace only characters at the same position in the chars string.
to_hex
¶
Converts an integer to a hexadecimal string.
to_hex(int)
Arguments¶
int: Integer expression to convert. Can be a constant, column, or function, and any combination of arithmetic operators.
trim
¶
Removes leading and trailing spaces from a string.
trim(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
upper
¶
Converts a string to upper-case.
upper(str)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
uuid
¶
Returns UUID v4 string value which is unique per row.
uuid()
overlay
¶
Returns the string which is replaced by another string from the specified position and specified count length.
For example, overlay('Txxxxas' placing 'hom' from 2 for 4) → Thomas
overlay(str PLACING substr FROM pos [FOR count])
Arguments¶
str: String expression to operate on.
substr: the string to replace part of str.
pos: the start position to replace of str.
count: the count of characters to be replaced from start position of str. If not specified, will use substr length instead.
levenshtein
¶
Returns the Levenshtein distance between the two given strings.
For example, levenshtein('kitten', 'sitting') = 3
levenshtein(str1, str2)
Arguments¶
str1: String expression to compute Levenshtein distance with str2.
str2: String expression to compute Levenshtein distance with str1.
substr_index
¶
Returns the substring from str before count occurrences of the delimiter delim.
If count is positive, everything to the left of the final delimiter (counting from the left) is returned.
If count is negative, everything to the right of the final delimiter (counting from the right) is returned.
For example, substr_index('www.apache.org', '.', 1) = www
, substr_index('www.apache.org', '.', -1) = org
substr_index(str, delim, count)
Arguments¶
str: String expression to operate on.
delim: the string to find in str to split str.
count: The number of times to search for the delimiter. Can be both a positive or negative number.
find_in_set
¶
Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings.
For example, find_in_set('b', 'a,b,c,d') = 2
find_in_set(str, strlist)
Arguments¶
str: String expression to find in strlist.
strlist: A string list is a string composed of substrings separated by , characters.
Binary String Functions¶
encode
¶
Encode binary data into a textual representation.
encode(expression, format)
Arguments¶
expression: Expression containing string or binary data
format: Supported formats are:
base64
,hex
Related functions: decode
decode
¶
Decode binary data from textual representation in string.
decode(expression, format)
Arguments¶
expression: Expression containing encoded string data
format: Same arguments as encode
Related functions: encode
Regular Expression Functions¶
Apache DataFusion uses the POSIX regular expression syntax and supports the following regular expression functions:
regexp_match
¶
Returns a list of regular expression matches in a string.
regexp_match(str, regexp)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
regexp: Regular expression to match against. Can be a constant, column, or function.
regexp_replace
¶
Replaces substrings in a string that match a regular expression.
regexp_replace(str, regexp, replacement, flags)
Arguments¶
str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
regexp: Regular expression to match against. Can be a constant, column, or function.
replacement: Replacement string expression. Can be a constant, column, or function, and any combination of string operators.
flags: Regular expression flags that control the behavior of the regular expression. The following flags are supported.
g: (global) Search globally and don’t return after the first match.
i: (insensitive) Ignore case when matching.
Time and Date Functions¶
now
¶
Returns the current UTC timestamp.
The now()
return value is determined at query time and will return the same timestamp,
no matter when in the query plan the function executes.
now()
current_date
¶
Returns the current UTC date.
The current_date()
return value is determined at query time and will return the same date,
no matter when in the query plan the function executes.
current_date()
current_time
¶
Returns the current UTC time.
The current_time()
return value is determined at query time and will return the same time,
no matter when in the query plan the function executes.
current_time()
date_bin
¶
Calculates time intervals and returns the start of the interval nearest to the specified timestamp.
Use date_bin
to downsample time series data by grouping rows into time-based “bins” or “windows”
and applying an aggregate or selector function to each window.
For example, if you “bin” or “window” data into 15 minute intervals, an input
timestamp of 2023-01-01T18:18:18Z
will be updated to the start time of the 15
minute bin it is in: 2023-01-01T18:15:00Z
.
date_bin(interval, expression, origin-timestamp)
Arguments¶
interval: Bin interval.
expression: Time expression to operate on. Can be a constant, column, or function.
timestamp: Starting point used to determine bin boundaries.
The following intervals are supported:
nanoseconds
microseconds
milliseconds
seconds
minutes
hours
days
weeks
months
years
century
date_trunc
¶
Truncates a timestamp value to a specified precision.
date_trunc(precision, expression)
Arguments¶
precision: Time precision to truncate to. The following precisions are supported:
year / YEAR
quarter / QUARTER
month / MONTH
week / WEEK
day / DAY
hour / HOUR
minute / MINUTE
second / SECOND
expression: Time expression to operate on. Can be a constant, column, or function.
Aliases¶
datetrunc
datetrunc
¶
Alias of date_trunc.
date_part
¶
Returns the specified part of the date as an integer.
date_part(part, expression)
Arguments¶
part: Part of the date to return. The following date parts are supported:
year
month
week (week of the year)
day (day of the month)
hour
minute
second
millisecond
microsecond
nanosecond
dow (day of the week)
doy (day of the year)
expression: Time expression to operate on. Can be a constant, column, or function.
Aliases¶
datepart
datepart
¶
Alias of date_part.
extract
¶
Returns a sub-field from a time value as an integer.
Similar to date_part
, but with different arguments.
extract(field FROM source)
Arguments¶
field: Part or field of the date to return. The following date fields are supported:
year
month
week (week of the year)
day (day of the month)
hour
minute
second
millisecond
microsecond
nanosecond
dow (day of the week)
doy (day of the year)
source: Source time expression to operate on. Can be a constant, column, or function.
to_timestamp
¶
Converts a value to a timestamp (YYYY-MM-DDT00:00:00Z
).
Supports strings, integer, unsigned integer, and double types as input.
Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’)
Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z
)
return the corresponding timestamp.
Note: to_timestamp
returns Timestamp(Nanosecond)
. The supported range for integer input is between -9223372037
and 9223372036
.
Supported range for string input is between 1677-09-21T00:12:44.0
and 2262-04-11T23:47:16.0
. Please use to_timestamp_seconds
for the input outside of supported bounds.
to_timestamp(expression)
Arguments¶
expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
to_timestamp_millis
¶
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z
).
Supports strings, integer, and unsigned integer types as input.
Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’)
Integers and unsigned integers are interpreted as milliseconds since the unix epoch (1970-01-01T00:00:00Z
)
return the corresponding timestamp.
to_timestamp_millis(expression)
Arguments¶
expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
to_timestamp_micros
¶
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000Z
).
Supports strings, integer, and unsigned integer types as input.
Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’)
Integers and unsigned integers are interpreted as microseconds since the unix epoch (1970-01-01T00:00:00Z
)
return the corresponding timestamp.
to_timestamp_nanos(expression)
to_timestamp_nanos
¶
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000000Z
).
Supports strings, integer, and unsigned integer types as input.
Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’)
Integers and unsigned integers are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z
)
return the corresponding timestamp.
to_timestamp_nanos(expression)
Arguments¶
expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
to_timestamp_seconds
¶
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z
).
Supports strings, integer, and unsigned integer types as input.
Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’)
Integers and unsigned integers are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z
)
return the corresponding timestamp.
to_timestamp_seconds(expression)
Arguments¶
expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
from_unixtime
¶
Converts an integer to RFC3339 timestamp format (YYYY-MM-DDT00:00:00.000000000Z
).
Integers and unsigned integers are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z
)
return the corresponding timestamp.
from_unixtime(expression)
Arguments¶
expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
Array Functions¶
array_append
¶
Appends an element to the end of an array.
array_append(array, element)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to append to the array.
Example¶
❯ select array_append([1, 2, 3], 4);
+--------------------------------------+
| array_append(List([1,2,3]),Int64(4)) |
+--------------------------------------+
| [1, 2, 3, 4] |
+--------------------------------------+
Aliases¶
array_push_back
list_append
list_push_back
array_cat
¶
Alias of array_concat.
array_concat
¶
Concatenates arrays.
array_concat(array[, ..., array_n])
Arguments¶
array: Array expression to concatenate. Can be a constant, column, or function, and any combination of array operators.
array_n: Subsequent array column or literal array to concatenate.
Example¶
❯ select array_concat([1, 2], [3, 4], [5, 6]);
+---------------------------------------------------+
| array_concat(List([1,2]),List([3,4]),List([5,6])) |
+---------------------------------------------------+
| [1, 2, 3, 4, 5, 6] |
+---------------------------------------------------+
Aliases¶
array_cat
list_cat
list_concat
array_has
¶
Returns true if the array contains the element
array_has(array, element)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Scalar or Array expression. Can be a constant, column, or function, and any combination of array operators.
array_has_all
¶
Returns true if all elements of sub-array exist in array
array_has_all(array, sub-array)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
sub-array: Array expression. Can be a constant, column, or function, and any combination of array operators.
array_has_any
¶
Returns true if any elements exist in both arrays
array_has_any(array, sub-array)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
sub-array: Array expression. Can be a constant, column, or function, and any combination of array operators.
array_dims
¶
Returns an array of the array’s dimensions.
array_dims(array)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select array_dims([[1, 2, 3], [4, 5, 6]]);
+---------------------------------+
| array_dims(List([1,2,3,4,5,6])) |
+---------------------------------+
| [2, 3] |
+---------------------------------+
Aliases¶
list_dims
array_element
¶
Extracts the element with the index n from the array.
array_element(array, index)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
index: Index to extract the element from the array.
Example¶
❯ select array_element([1, 2, 3, 4], 3);
+-----------------------------------------+
| array_element(List([1,2,3,4]),Int64(3)) |
+-----------------------------------------+
| 3 |
+-----------------------------------------+
Aliases¶
array_extract
list_element
list_extract
array_empty
¶
array_extract
¶
Alias of array_element.
array_fill
¶
Returns an array filled with copies of the given value.
DEPRECATED: use array_repeat
instead!
array_fill(element, array)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to copy to the array.
flatten
¶
Converts an array of arrays to a flat array
Applies to any depth of nested arrays
Does not change arrays that are already flat
The flattened array contains all the elements from all source arrays.
Arguments¶
array: Array expression Can be a constant, column, or function, and any combination of array operators.
flatten(array)
array_indexof
¶
Alias of array_position.
array_join
¶
Alias of array_to_string.
array_length
¶
Returns the length of the array dimension.
array_length(array, dimension)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
dimension: Array dimension.
Example¶
❯ select array_length([1, 2, 3, 4, 5]);
+---------------------------------+
| array_length(List([1,2,3,4,5])) |
+---------------------------------+
| 5 |
+---------------------------------+
Aliases¶
list_length
array_ndims
¶
Returns the number of dimensions of the array.
array_ndims(array, element)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select array_ndims([[1, 2, 3], [4, 5, 6]]);
+----------------------------------+
| array_ndims(List([1,2,3,4,5,6])) |
+----------------------------------+
| 2 |
+----------------------------------+
Aliases¶
list_ndims
array_prepend
¶
Prepends an element to the beginning of an array.
array_prepend(element, array)
Arguments¶
element: Element to prepend to the array.
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select array_prepend(1, [2, 3, 4]);
+---------------------------------------+
| array_prepend(Int64(1),List([2,3,4])) |
+---------------------------------------+
| [1, 2, 3, 4] |
+---------------------------------------+
Aliases¶
array_push_front
list_prepend
list_push_front
array_pop_front
¶
Returns the array without the first element.
array_pop_first(array)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select array_pop_first([1, 2, 3]);
+-------------------------------+
| array_pop_first(List([1,2,3])) |
+-------------------------------+
| [2, 3] |
+-------------------------------+
array_pop_back
¶
Returns the array without the last element.
array_pop_back(array)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select array_pop_back([1, 2, 3]);
+-------------------------------+
| array_pop_back(List([1,2,3])) |
+-------------------------------+
| [1, 2] |
+-------------------------------+
array_position
¶
Returns a string with an input string repeated a specified number.
array_position(array, element)
array_position(array, element, index)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to search for position in the array.
index: Index at which to start searching.
Example¶
❯ select array_position([1, 2, 2, 3, 1, 4], 2);
+----------------------------------------------+
| array_position(List([1,2,2,3,1,4]),Int64(2)) |
+----------------------------------------------+
| 2 |
+----------------------------------------------+
Aliases¶
array_indexof
list_indexof
list_position
array_positions
¶
Searches for an element in the array, returns all occurrences.
array_positions(array, element)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to search for positions in the array.
Example¶
❯ select array_positions([1, 2, 2, 3, 1, 4], 2);
+-----------------------------------------------+
| array_positions(List([1,2,2,3,1,4]),Int64(2)) |
+-----------------------------------------------+
| [2, 3] |
+-----------------------------------------------+
Aliases¶
list_positions
array_push_back
¶
Alias of array_append.
array_push_front
¶
Alias of array_prepend.
array_repeat
¶
Returns an array containing element count
times.
array_repeat(element, count)
Arguments¶
element: Element expression. Can be a constant, column, or function, and any combination of array operators.
count: Value of how many times to repeat the element.
Example¶
❯ select array_repeat(1, 3);
+---------------------------------+
| array_repeat(Int64(1),Int64(3)) |
+---------------------------------+
| [1, 1, 1] |
+---------------------------------+
❯ select array_repeat([1, 2], 2);
+------------------------------------+
| array_repeat(List([1,2]),Int64(2)) |
+------------------------------------+
| [[1, 2], [1, 2]] |
+------------------------------------+
array_remove
¶
Removes the first element from the array equal to the given value.
array_remove(array, element)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to be removed from the array.
Example¶
❯ select array_remove([1, 2, 2, 3, 2, 1, 4], 2);
+----------------------------------------------+
| array_remove(List([1,2,2,3,2,1,4]),Int64(2)) |
+----------------------------------------------+
| [1, 2, 3, 2, 1, 4] |
+----------------------------------------------+
Aliases¶
list_remove
array_remove_n
¶
Removes the first max
elements from the array equal to the given value.
array_remove_n(array, element, max)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to be removed from the array.
max: Number of first occurrences to remove.
Example¶
❯ select array_remove_n([1, 2, 2, 3, 2, 1, 4], 2, 2);
+---------------------------------------------------------+
| array_remove_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(2)) |
+---------------------------------------------------------+
| [1, 3, 2, 1, 4] |
+---------------------------------------------------------+
Aliases¶
list_remove_n
array_remove_all
¶
Removes all elements from the array equal to the given value.
array_remove_all(array, element)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to be removed from the array.
Example¶
❯ select array_remove_all([1, 2, 2, 3, 2, 1, 4], 2);
+--------------------------------------------------+
| array_remove_all(List([1,2,2,3,2,1,4]),Int64(2)) |
+--------------------------------------------------+
| [1, 3, 1, 4] |
+--------------------------------------------------+
Aliases¶
list_remove_all
array_replace
¶
Replaces the first occurrence of the specified element with another specified element.
array_replace(array, from, to)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
from: Initial element.
to: Final element.
Example¶
❯ select array_replace([1, 2, 2, 3, 2, 1, 4], 2, 5);
+--------------------------------------------------------+
| array_replace(List([1,2,2,3,2,1,4]),Int64(2),Int64(5)) |
+--------------------------------------------------------+
| [1, 5, 2, 3, 2, 1, 4] |
+--------------------------------------------------------+
Aliases¶
list_replace
array_replace_n
¶
Replaces the first max
occurrences of the specified element with another specified element.
array_replace_n(array, from, to, max)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
from: Initial element.
to: Final element.
max: Number of first occurrences to replace.
Example¶
❯ select array_replace_n([1, 2, 2, 3, 2, 1, 4], 2, 5, 2);
+-------------------------------------------------------------------+
| array_replace_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(5),Int64(2)) |
+-------------------------------------------------------------------+
| [1, 5, 5, 3, 2, 1, 4] |
+-------------------------------------------------------------------+
Aliases¶
list_replace_n
array_replace_all
¶
Replaces all occurrences of the specified element with another specified element.
array_replace_all(array, from, to)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
from: Initial element.
to: Final element.
Example¶
❯ select array_replace_all([1, 2, 2, 3, 2, 1, 4], 2, 5);
+------------------------------------------------------------+
| array_replace_all(List([1,2,2,3,2,1,4]),Int64(2),Int64(5)) |
+------------------------------------------------------------+
| [1, 5, 5, 3, 5, 1, 4] |
+------------------------------------------------------------+
Aliases¶
list_replace_all
array_slice
¶
Returns a slice of the array.
array_slice(array, begin, end)
Example¶
❯ select array_slice([1, 2, 3, 4, 5, 6, 7, 8], 3, 6);
+--------------------------------------------------------+
| array_slice(List([1,2,3,4,5,6,7,8]),Int64(3),Int64(6)) |
+--------------------------------------------------------+
| [3, 4, 5, 6] |
+--------------------------------------------------------+
Aliases¶
list_slice
array_to_string
¶
Converts each element to its text representation.
array_to_string(array, delimiter)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
delimiter: Array element separator.
Example¶
❯ select array_to_string([[1, 2, 3, 4], [5, 6, 7, 8]], ',');
+----------------------------------------------------+
| array_to_string(List([1,2,3,4,5,6,7,8]),Utf8(",")) |
+----------------------------------------------------+
| 1,2,3,4,5,6,7,8 |
+----------------------------------------------------+
Aliases¶
array_join
list_join
list_to_string
array_union
¶
Returns an array of elements that are present in both arrays (all elements from both arrays) with out duplicates.
array_union(array1, array2)
Arguments¶
array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
array2: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select array_union([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_union([1, 2, 3, 4], [5, 6, 3, 4]); |
+----------------------------------------------------+
| [1, 2, 3, 4, 5, 6] |
+----------------------------------------------------+
❯ select array_union([1, 2, 3, 4], [5, 6, 7, 8]);
+----------------------------------------------------+
| array_union([1, 2, 3, 4], [5, 6, 7, 8]); |
+----------------------------------------------------+
| [1, 2, 3, 4, 5, 6, 7, 8] |
+----------------------------------------------------+
Aliases¶
list_union
array_except
¶
Returns an array of the elements that appear in the first array but not in the second.
array_except(array1, array2)
Arguments¶
array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
array2: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select array_except([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_except([1, 2, 3, 4], [5, 6, 3, 4]); |
+----------------------------------------------------+
| [1, 2] |
+----------------------------------------------------+
❯ select array_except([1, 2, 3, 4], [3, 4, 5, 6]);
+----------------------------------------------------+
| array_except([1, 2, 3, 4], [3, 4, 5, 6]); |
+----------------------------------------------------+
| [3, 4] |
+----------------------------------------------------+
Aliases¶
list_except
cardinality
¶
Returns the total number of elements in the array.
cardinality(array)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select cardinality([[1, 2, 3, 4], [5, 6, 7, 8]]);
+--------------------------------------+
| cardinality(List([1,2,3,4,5,6,7,8])) |
+--------------------------------------+
| 8 |
+--------------------------------------+
empty
¶
Returns 1 for an empty array or 0 for a non-empty array.
empty(array)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example¶
❯ select empty([1]);
+------------------+
| empty(List([1])) |
+------------------+
| 0 |
+------------------+
list_append
¶
Alias of array_append.
list_cat
¶
Alias of array_concat.
list_concat
¶
Alias of array_concat.
list_dims
¶
Alias of array_dims.
list_element
¶
Alias of array_element.
list_extract
¶
Alias of array_element.
list_indexof
¶
Alias of array_position.
list_join
¶
Alias of array_to_string.
list_length
¶
Alias of array_length.
list_ndims
¶
Alias of array_ndims.
list_prepend
¶
Alias of array_prepend.
list_position
¶
Alias of array_position.
list_positions
¶
Alias of array_positions.
list_push_back
¶
Alias of array_append.
list_push_front
¶
Alias of array_prepend.
list_repeat
¶
Alias of array_repeat.
list_remove
¶
Alias of array_remove.
list_remove_n
¶
Alias of array_remove_n.
list_remove_all
¶
Alias of array_remove_all.
list_replace
¶
Alias of array_replace.
list_replace_n
¶
Alias of array_replace_n.
list_replace_all
¶
Alias of array_replace_all.
list_slice
¶
Alias of array_slice.
list_to_string
¶
Alias of list_to_string.
make_array
¶
Returns an Arrow array using the specified input expressions.
make_array(expression1[, ..., expression_n])
Arguments¶
expression_n: Expression to include in the output array. Can be a constant, column, or function, and any combination of arithmetic or string operators.
Example¶
❯ select make_array(1, 2, 3, 4, 5);
+----------------------------------------------------------+
| make_array(Int64(1),Int64(2),Int64(3),Int64(4),Int64(5)) |
+----------------------------------------------------------+
| [1, 2, 3, 4, 5] |
+----------------------------------------------------------+
Aliases¶
make_list
make_list
¶
Alias of make_array.
string_to_array
¶
Splits a string in to an array of substrings based on a delimiter. Any substrings matching the optional null_str
argument are replaced with NULL.
starts_with(str, delimiter[, null_str])
Arguments¶
str: String expression to split.
delimiter: Delimiter string to split on.
null_str: Substring values to be replaced with
NULL
Aliases¶
string_to_list
string_to_list
¶
Alias of string_to_array.
trim_array
¶
Removes the last n elements from the array.
DEPRECATED: use array_slice
instead!
trim_array(array, n)
Arguments¶
array: Array expression. Can be a constant, column, or function, and any combination of array operators.
n: Element to trim the array.
range
¶
Returns an Arrow array between start and stop with step. SELECT range(2, 10, 3) -> [2, 5, 8]
The range start..end contains all values with start <= x < end. It is empty if start >= end.
Step can not be 0 (then the range will be nonsense.).
Arguments¶
start: start of the range
end: end of the range (not included)
step: increase by step (can not be 0)
Struct Functions¶
struct
¶
Returns an Arrow struct using the specified input expressions.
Fields in the returned struct use the cN
naming convention.
For example: c0
, c1
, c2
, etc.
struct(expression1[, ..., expression_n])
For example, this query converts two columns a
and b
to a single column with
a struct type of fields c0
and c1
:
❯ select * from t;
+---+---+
| a | b |
+---+---+
| 1 | 2 |
| 3 | 4 |
+---+---+
❯ select struct(a, b) from t;
+-----------------+
| struct(t.a,t.b) |
+-----------------+
| {c0: 1, c1: 2} |
| {c0: 3, c1: 4} |
+-----------------+
Arguments¶
expression_n: Expression to include in the output struct. Can be a constant, column, or function, and any combination of arithmetic or string operators.
Hashing Functions¶
digest
¶
Computes the binary hash of an expression using the specified algorithm.
digest(expression, algorithm)
Arguments¶
expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
algorithm: String expression specifying algorithm to use. Must be one of:
md5
sha224
sha256
sha384
sha512
blake2s
blake2b
blake3
md5
¶
Computes an MD5 128-bit checksum for a string expression.
md5(expression)
Arguments¶
expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
sha224
¶
Computes the SHA-224 hash of a binary string.
sha224(expression)
Arguments¶
expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
sha256
¶
Computes the SHA-256 hash of a binary string.
sha256(expression)
Arguments¶
expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
sha384
¶
Computes the SHA-384 hash of a binary string.
sha384(expression)
Arguments¶
expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
sha512
¶
Computes the SHA-512 hash of a binary string.
sha512(expression)
Arguments¶
expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Other Functions¶
arrow_cast
¶
Casts a value to a specific Arrow data type:
arrow_cast(expression, datatype)
Arguments¶
expression: Expression to cast. Can be a constant, column, or function, and any combination of arithmetic or string operators.
datatype: Arrow data type name to cast to, as a string. The format is the same as that returned by [
arrow_typeof
]
Example¶
❯ select arrow_cast(-5, 'Int8') as a,
arrow_cast('foo', 'Dictionary(Int32, Utf8)') as b,
arrow_cast('bar', 'LargeUtf8') as c,
arrow_cast('2023-01-02T12:53:02', 'Timestamp(Microsecond, Some("+08:00"))') as d
;
+----+-----+-----+---------------------------+
| a | b | c | d |
+----+-----+-----+---------------------------+
| -5 | foo | bar | 2023-01-02T12:53:02+08:00 |
+----+-----+-----+---------------------------+
1 row in set. Query took 0.001 seconds.
arrow_typeof
¶
Returns the name of the underlying Arrow data type of the expression:
arrow_typeof(expression)
Arguments¶
expression: Expression to evaluate. Can be a constant, column, or function, and any combination of arithmetic or string operators.
Example¶
❯ select arrow_typeof('foo'), arrow_typeof(1);
+---------------------------+------------------------+
| arrow_typeof(Utf8("foo")) | arrow_typeof(Int64(1)) |
+---------------------------+------------------------+
| Utf8 | Int64 |
+---------------------------+------------------------+
1 row in set. Query took 0.001 seconds.