
The register_binding() and register_binding_agg() functions are used to populate a list of functions that operate on (and return) Expressions. These are the basis for the .data mask inside dplyr methods.

Usage

register_binding(
  fun_name,
  fun,
  registry = nse_funcs,
  update_cache = FALSE,
  notes = character(0)
)

register_binding_agg(
  fun_name,
  agg_fun,
  registry = agg_funcs,
  notes = character(0)
)

Arguments

fun_name

A string containing a function name in the form "function" or "package::function". The package name is currently not used but may be used in the future to allow package-qualified function calls.

fun

A function or NULL to un-register a previous function. This function must accept Expression objects as arguments and return Expression objects instead of regular R objects.

registry

An environment in which the functions should be assigned.

update_cache

Update .cache$functions at the time of registration. The default is FALSE because most bindings are registered at package load, after which the cache is created once. Whether .cache$functions is needed in addition to nse_funcs for non-aggregate functions could be revisited. It is currently used as the data mask in mutate, filter, and aggregate (but not summarise) because the data mask has to be a list.

notes

A string for the docs that notes any limitations or differences in behavior between the Arrow version and the R function.

agg_fun

An aggregate function or NULL to un-register a previous aggregate function. This function must accept Expression objects as arguments and return a list() with components:

  • fun: string function name

  • data: list of 0 or more Expressions

  • options: list of function options, as passed to call_function
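As a rough illustration of that return shape, the sketch below defines an aggregate function for a mean. The name agg_mean is hypothetical, and the option names skip_nulls and min_count are assumptions about what the Arrow "mean" compute function accepts, so verify them before relying on this.

```r
# Hypothetical aggregate binding: it describes the aggregation to run
# rather than computing anything itself.
agg_mean <- function(x, na.rm = FALSE) {
  list(
    fun = "mean",      # name of the Arrow compute function
    data = list(x),    # the Expression(s) to aggregate
    options = list(skip_nulls = na.rm, min_count = 0L)  # assumed option names
  )
}

# Inspect the shape, with a placeholder string standing in for an Expression.
spec <- agg_mean("placeholder", na.rm = TRUE)
str(spec)
```

A function like this would be passed as agg_fun, for example register_binding_agg("base::mean", agg_mean).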

Value

The previously registered binding or NULL if no previously registered function existed.
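The return-value contract can be illustrated with a toy registry built on a plain environment. This is a simplified stand-in written for this page, not arrow's implementation; toy_registry and toy_register are hypothetical names.

```r
# Toy stand-in for a registry; this is NOT arrow's implementation.
toy_registry <- new.env(parent = emptyenv())

toy_register <- function(fun_name, fun, registry = toy_registry) {
  previous <- registry[[fun_name]]  # NULL if nothing was registered
  registry[[fun_name]] <- fun       # storing NULL acts as un-registering here
  invisible(previous)
}

first  <- function(x) x + 1
second <- function(x) x + 2

r1 <- toy_register("f", first)   # nothing registered before
r2 <- toy_register("f", second)  # replaces `first`
r3 <- toy_register("f", NULL)    # un-registers `second`

is.null(r1)
identical(r2, first)
identical(r3, second)
is.null(toy_registry[["f"]])
```

Returning the previous binding invisibly lets a caller save it and restore it later, which can be convenient in tests that temporarily override a binding.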

Writing bindings

  • Expression$create() will wrap any non-Expression inputs as Scalar Expressions. To coerce scalar inputs to match the type of the Expression(s) in the arguments, call cast_scalars_to_common_type(args) on the args. For example, Expression$create("add", args = list(int16_field, 1)) would result in a float64 type output because 1 is a double in R. To avoid casting all of the data in int16_field to float and to preserve it as int16, use Expression$create("add", args = cast_scalars_to_common_type(list(int16_field, 1))).

  • Inside your function, you can call any other binding with call_binding().
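The two points above might be combined as in the following sketch. It assumes the code lives inside the arrow package source, where register_binding(), call_binding(), cast_scalars_to_common_type(), and Expression are in scope; the binding names add_one and add_two and the package prefix examplepkg are hypothetical. The registrations are wrapped in a function so that they only run when that function is called, for instance at package load.

```r
# Sketch only: assumes this function is defined inside the arrow namespace.
# Nothing is registered until register_bindings_example() is called.
register_bindings_example <- function() {
  # Hypothetical binding: add 1 while keeping the column's own type.
  register_binding("examplepkg::add_one", function(x) {
    Expression$create(
      "add",
      # Cast the R double 1 to the type of `x` rather than promoting `x`
      args = cast_scalars_to_common_type(list(x, 1))
    )
  })

  # Hypothetical binding that reuses the one above through call_binding().
  register_binding("examplepkg::add_two", function(x) {
    call_binding("add_one", call_binding("add_one", x))
  })

  invisible(NULL)
}
```

Routing add_two through call_binding() rather than repeating the Expression$create() call keeps the type-preserving logic in one place.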