The register_binding() and register_binding_agg() functions are used to populate a list of functions that operate on (and return) Expressions. These are the basis for the .data mask inside dplyr methods.
register_binding(fun_name, fun, registry = nse_funcs, update_cache = FALSE)
fun_name: A string containing a function name in the form "function" or "package::function". The package name is currently not used but may be used in the future to allow these types of function calls.
fun: A function, or NULL to un-register a previous function. This function must accept Expression objects as arguments and return Expression objects instead of regular R objects.
registry: An environment in which the functions should be assigned.
update_cache: Whether to update .cache$functions at the time of registration. The default is FALSE because most bindings are registered at package load, after which the cache is created once. Why .cache$functions is needed in addition to nse_funcs for non-aggregate functions could be revisited; it is currently used as the data mask in mutate, filter, and aggregate (but not summarise) because the data mask has to be a list.
agg_fun: An aggregate function, or NULL to un-register a previous aggregate function. This function must accept Expression objects as arguments and return a list() with components:
- fun: string function name
- data: Expression (these are all currently a single field)
- options: list of function options, as passed to call_function
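As a rough illustration of the list shape described above, the following sketch builds such a return value in base R. The name toy_sum_binding is hypothetical, and a plain string stands in for the Expression so that the snippet runs without arrow installed; a real aggregate binding would pass an Expression in the data component. The "sum" function name and its skip_nulls and min_count options are those of the Arrow C++ sum aggregation.

```r
# Hypothetical aggregate binding that follows the list(fun, data, options)
# shape. `x` stands in for an Expression; a plain string is used below so
# the sketch runs without arrow installed.
toy_sum_binding <- function(x, na.rm = FALSE) {
  list(
    fun = "sum",                      # name of the Arrow compute function
    data = x,                         # the (single) input Expression
    options = list(                   # options forwarded to the compute call
      skip_nulls = na.rm,
      min_count = 0L
    )
  )
}

spec <- toy_sum_binding("placeholder_expression", na.rm = TRUE)
str(spec)
```

The function only describes the aggregation to perform; it does not compute anything itself.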
The previously registered binding, or NULL if no previously registered function existed.
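The registration semantics described above (an environment as the registry, NULL to un-register, and the previous binding as the return value) can be sketched in base R. This is a simplified stand-in, not arrow's implementation: toy_registry and toy_register_binding are hypothetical names, and the cache handling is omitted.

```r
# Hypothetical stand-in for the registry environment; not arrow's nse_funcs.
toy_registry <- new.env(parent = emptyenv())

# Simplified sketch of the registration semantics described above.
toy_register_binding <- function(fun_name, fun, registry = toy_registry) {
  # Drop an optional "package::" prefix; only the bare name is used as the key.
  name <- sub("^.*::", "", fun_name)

  # Look up the binding being replaced (NULL if there is none).
  previous <- if (exists(name, envir = registry, inherits = FALSE)) {
    get(name, envir = registry, inherits = FALSE)
  } else {
    NULL
  }

  if (is.null(fun)) {
    # NULL un-registers the binding, if one exists.
    if (!is.null(previous)) rm(list = name, envir = registry)
  } else {
    assign(name, fun, envir = registry)
  }

  invisible(previous)
}

# Register, replace, then un-register a binding.
first  <- toy_register_binding("base::nchar", function(x) paste("v1", x))
second <- toy_register_binding("nchar", function(x) paste("v2", x))
third  <- toy_register_binding("nchar", NULL)
```

Here first is NULL because nothing was registered before, while second and third hold the bindings that were replaced and removed, respectively.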
When to use build_expr() vs. Expression$create()?
Use build_expr() if you need to:
- map R function names to Arrow C++ functions
- wrap R inputs (vectors) as Array/Scalar
Expression$create() is lower level. Most of the bindings use it because they manage the preparation of the user-provided inputs themselves and don't need or don't want the automatic conversion of R objects to Scalar.