Apply a function to a stream of RecordBatches
As an alternative to calling collect() on a Dataset query, you can use this
function to access the stream of RecordBatches in the Dataset.
This lets you do more complex operations in R that operate on chunks of data
without having to hold the entire Dataset in memory at once. You can include
map_batches() in a dplyr pipeline and apply additional dplyr methods to the
stream of data in Arrow after it.
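The pipeline described above might look roughly like the following sketch. The temporary dataset built from mtcars, the derived column kpl, and the conversion factor are illustrative assumptions, not part of the original documentation.

```r
library(arrow)
library(dplyr)

# Illustrative data: write mtcars to a temporary dataset directory.
tmp <- tempfile()
write_dataset(mtcars, tmp)
ds <- open_dataset(tmp)

result <- ds %>%
  filter(mpg > 15) %>%                  # evaluated by Arrow
  map_batches(function(batch) {
    # `batch` is a RecordBatch; work on it in R one chunk at a time.
    df <- as.data.frame(batch)
    df$kpl <- df$mpg * 0.425144         # miles per gallon to km per litre
    df                                  # a data.frame is coercible to a RecordBatch
  }) %>%
  filter(kpl > 8) %>%                   # further dplyr verbs after map_batches()
  collect()
```

Only one batch at a time is converted to a data.frame inside the function, which is the point of using map_batches() rather than collect() up front.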
A Dataset or arrow_dplyr_query object, as returned by the dplyr methods on Dataset.
A function or purrr-style lambda expression to apply to each batch. It must return a RecordBatch or something coercible to one via as_record_batch().
Additional arguments passed to FUN.
An optional schema(). If NULL, the schema will be inferred from the first batch.
Use TRUE to evaluate FUN lazily as batches are read from the result; use
FALSE to evaluate FUN on all batches before returning the reader.
Deprecated argument, ignored
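As a hedged sketch of how these arguments fit together: the argument names .schema and .lazy are assumed from the map_batches() signature in recent arrow releases, while scale_col(), its by argument, and the toy data are hypothetical names introduced only for illustration.

```r
library(arrow)

# Illustrative data written to a temporary dataset directory.
tmp <- tempfile()
write_dataset(data.frame(x = 1:10, y = (1:10) / 2), tmp)
ds <- open_dataset(tmp)

# Hypothetical helper: multiply column y by `by`, which arrives via `...`.
scale_col <- function(batch, by) {
  df <- as.data.frame(batch)
  df$y <- df$y * by
  df
}

# Supply the output schema explicitly and evaluate lazily:
# FUN runs only as batches are pulled from the returned reader.
reader <- map_batches(
  ds,
  scale_col,
  by = 10,
  .schema = schema(x = int32(), y = float64()),
  .lazy = TRUE
)
scaled <- as.data.frame(reader$read_table())

# A purrr-style lambda, evaluated eagerly on all batches; the schema is
# inferred from the first batch because none is supplied.
firsts <- map_batches(ds, ~ head(as.data.frame(.x), 1), .lazy = FALSE)
first_rows <- as.data.frame(firsts$read_table())
```

Supplying a schema avoids having to evaluate the first batch just to learn the output types, which may matter when FUN is expensive.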