Function reference
-
open_dataset()
- Open a multi-file dataset
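For example, a directory of Parquet files can be opened lazily as a single Dataset (the path and partition column below are illustrative):

    library(arrow)
    # Open every Parquet file under data/sales as one lazily scanned Dataset,
    # treating the directory names as a "year" partition column
    ds <- open_dataset("data/sales", partitioning = "year")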
-
open_delim_dataset()
open_csv_dataset()
open_tsv_dataset()
- Open a multi-file dataset of CSV or other delimiter-separated format
-
csv_read_options()
- CSV Reading Options
-
csv_parse_options()
- CSV Parsing Options
-
csv_convert_options()
- CSV Convert Options
-
write_dataset()
- Write a dataset
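A minimal sketch, with an illustrative output path:

    # Write a data frame as a Parquet dataset partitioned by the cyl column
    write_dataset(mtcars, "data/mtcars_ds", partitioning = "cyl", format = "parquet")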
-
write_delim_dataset()
write_csv_dataset()
write_tsv_dataset()
- Write a dataset into partitioned flat files.
-
csv_write_options()
- CSV Writing Options
-
read_delim_arrow()
read_csv_arrow()
read_csv2_arrow()
read_tsv_arrow()
- Read a CSV or other delimited file with Arrow
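Typical usage (the file name is illustrative); set as_data_frame = FALSE to keep the result as an Arrow Table instead of converting to a data frame:

    df  <- read_csv_arrow("data/input.csv")
    tab <- read_csv_arrow("data/input.csv", as_data_frame = FALSE)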
-
read_parquet()
- Read a Parquet file
-
read_feather()
read_ipc_file()
- Read a Feather file (an Arrow IPC file)
-
read_ipc_stream()
- Read Arrow IPC stream format
-
read_json_arrow()
- Read a JSON file
-
write_csv_arrow()
- Write CSV file to disk
-
write_parquet()
- Write Parquet file to disk
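A short round-trip sketch; the file name, compression codec, and column selection are illustrative:

    write_parquet(mtcars, "mtcars.parquet", compression = "snappy")
    dat <- read_parquet("mtcars.parquet", col_select = c("mpg", "cyl"))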
-
write_feather()
write_ipc_file()
- Write a Feather file (an Arrow IPC file)
-
write_ipc_stream()
- Write Arrow IPC stream format
-
write_to_raw()
- Write Arrow data to a raw vector
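A sketch of serializing a data frame to the IPC stream format in memory and reading it back:

    raw_bytes <- write_to_raw(mtcars, format = "stream")
    dat <- read_ipc_stream(raw_bytes)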
-
scalar()
- Create an Arrow Scalar
-
arrow_array()
- Create an Arrow Array
-
chunked_array()
- Create a Chunked Array
-
record_batch()
- Create a RecordBatch
-
arrow_table()
- Create an Arrow Table
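A minimal sketch of building each container directly from R vectors:

    s   <- scalar(42L)
    arr <- arrow_array(c(1.5, 2.5, NA))
    ca  <- chunked_array(1:3, 4:6)                 # one ChunkedArray with two chunks
    rb  <- record_batch(x = 1:3, y = c("a", "b", "c"))
    tab <- arrow_table(x = 1:6, y = letters[1:6])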
-
buffer()
- Create a Buffer
-
vctrs_extension_array()
vctrs_extension_type()
- Extension type for generic typed vectors
Working with Arrow data containers
Functions for converting R objects to Arrow data containers and combining Arrow data containers.
-
as_arrow_array()
- Convert an object to an Arrow Array
-
as_chunked_array()
- Convert an object to an Arrow ChunkedArray
-
as_record_batch()
- Convert an object to an Arrow RecordBatch
-
as_arrow_table()
- Convert an object to an Arrow Table
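For example, converting common R objects:

    as_arrow_array(1:5)
    as_chunked_array(c(TRUE, NA, FALSE))
    as_record_batch(mtcars)
    as_arrow_table(mtcars)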
-
concat_arrays()
c(<Array>)
- Concatenate zero or more Arrays
-
concat_tables()
- Concatenate one or more Tables
-
int8()
int16()
int32()
int64()
uint8()
uint16()
uint32()
uint64()
float16()
halffloat()
float32()
float()
float64()
boolean()
bool()
utf8()
large_utf8()
binary()
large_binary()
fixed_size_binary()
string()
date32()
date64()
time32()
time64()
duration()
null()
timestamp()
decimal()
decimal128()
decimal256()
struct()
list_of()
large_list_of()
fixed_size_list_of()
map_of()
- Create Arrow data types
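These constructors are typically used when declaring schemas or casting, for example:

    int32()
    timestamp(unit = "ms", timezone = "UTC")
    list_of(float64())
    struct(id = int64(), name = utf8())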
-
dictionary()
- Create a dictionary type
-
new_extension_type()
new_extension_array()
register_extension_type()
reregister_extension_type()
unregister_extension_type()
- Extension types
-
vctrs_extension_array()
vctrs_extension_type()
- Extension type for generic typed vectors
-
as_data_type()
- Convert an object to an Arrow DataType
-
infer_type()
type()
- Infer the arrow Array type from an R object
-
field()
- Create a Field
-
schema()
- Create a schema or extract one from an object.
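A minimal sketch of building a schema from fields, or directly from named types:

    f  <- field("x", int32())
    sc <- schema(f, field("y", utf8()))
    # equivalently, with named arguments
    sc <- schema(x = int32(), y = utf8())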
-
unify_schemas()
- Combine and harmonize schemas
-
as_schema()
- Convert an object to an Arrow Schema
-
infer_schema()
- Extract a schema from an object
-
read_schema()
- Read a Schema from a stream
-
acero
arrow-functions
arrow-verbs
arrow-dplyr
- Functions available in Arrow dplyr queries
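A sketch of a typical query, assuming dplyr is attached; the pipeline is built lazily and only evaluated by collect():

    library(dplyr)
    arrow_table(mtcars) |>
      filter(cyl == 6) |>
      group_by(gear) |>
      summarise(avg_mpg = mean(mpg)) |>
      collect()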
-
call_function()
- Call an Arrow compute function
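For example, calling compute functions by name on an Array:

    a <- arrow_array(c(2, 30, NA, 4))
    call_function("sum", a)
    call_function("min_max", a)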
-
match_arrow()
is_in()
- Value matching for Arrow objects
-
value_counts()
- table() for Arrow objects
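A short sketch of each:

    haystack <- arrow_array(c("a", "b", "c"))
    needles  <- arrow_array(c("b", "d"))
    is_in(needles, haystack)
    match_arrow(needles, haystack)
    value_counts(arrow_array(c("a", "a", "b")))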
-
list_compute_functions()
- List available Arrow C++ compute functions
-
register_scalar_function()
- Register user-defined functions
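A minimal sketch; the function name and the unit conversion are illustrative. Once registered, the UDF should be callable like any other compute function (and, with auto_convert = TRUE, usable inside dplyr verbs on Arrow data):

    register_scalar_function(
      "to_fahrenheit",
      function(context, x) x * 9 / 5 + 32,
      in_type = float64(),
      out_type = float64(),
      auto_convert = TRUE
    )
    call_function("to_fahrenheit", arrow_array(c(0, 100)))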
-
show_exec_plan()
- Show the details of an Arrow Execution Plan
-
to_arrow()
- Create an Arrow object from a DuckDB connection
-
to_duckdb()
- Create a (virtual) DuckDB table from an Arrow object
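A sketch of a round trip, assuming the duckdb and dbplyr packages are installed; data is handed over via the Arrow C interface rather than copied through R:

    arrow_table(mtcars) |>
      to_duckdb() |>                  # now a DuckDB lazy table
      dplyr::filter(cyl == 6) |>
      to_arrow() |>                   # back to an Arrow RecordBatchReader
      dplyr::collect()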
-
s3_bucket()
- Connect to an AWS S3 bucket
-
gs_bucket()
- Connect to a Google Cloud Storage (GCS) bucket
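Bucket names, regions, and paths below are illustrative; both helpers return a SubTreeFileSystem that can be passed to the dataset and file-reading functions:

    bucket <- s3_bucket("my-example-bucket", region = "us-east-1")
    ds <- open_dataset(bucket$path("datasets/sales"))
    gcs <- gs_bucket("my-example-bucket", anonymous = TRUE)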
-
copy_files()
- Copy files between FileSystems
-
load_flight_server()
- Load a Python Flight server
-
flight_connect()
- Connect to a Flight server
-
flight_disconnect()
- Explicitly close a Flight client
-
flight_get()
- Get data from a Flight server
-
flight_put()
- Send data to a Flight server
-
list_flights()
flight_path_exists()
- See available resources on a Flight server
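A sketch of a round trip, assuming pyarrow is available via reticulate and a Flight server is listening on the (illustrative) host, port, and path:

    client <- flight_connect(host = "localhost", port = 8089)
    flight_put(client, mtcars, path = "uploaded/mtcars")
    list_flights(client)
    dat <- flight_get(client, "uploaded/mtcars")
    flight_disconnect(client)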
-
arrow_info()
arrow_available()
arrow_with_acero()
arrow_with_dataset()
arrow_with_substrait()
arrow_with_parquet()
arrow_with_s3()
arrow_with_gcs()
arrow_with_json()
- Report information on the package's capabilities
-
cpu_count()
set_cpu_count()
- Manage the global CPU thread pool in libarrow
-
io_thread_count()
set_io_thread_count()
- Manage the global I/O thread pool in libarrow
-
install_arrow()
- Install or upgrade the Arrow library
-
install_pyarrow()
- Install pyarrow for use with reticulate
-
create_package_with_all_dependencies()
- Create a source bundle that includes all thirdparty dependencies
-
InputStream
RandomAccessFile
MemoryMappedFile
ReadableFile
BufferReader
- InputStream classes
-
read_message()
- Read a Message from a stream
-
mmap_open()
- Open a memory mapped file
-
mmap_create()
- Create a new read/write memory mapped file of a given size
-
OutputStream
FileOutputStream
BufferOutputStream
- OutputStream classes
-
Message
- Message class
-
MessageReader
- MessageReader class
-
compression
CompressedOutputStream
CompressedInputStream
- Compressed stream classes
-
Codec
- Compression Codec class
-
codec_is_available()
- Check whether a compression codec is available
-
ParquetFileReader
- ParquetFileReader class
-
ParquetReaderProperties
- ParquetReaderProperties class
-
ParquetArrowReaderProperties
- ParquetArrowReaderProperties class
-
ParquetFileWriter
- ParquetFileWriter class
-
ParquetWriterProperties
- ParquetWriterProperties class
-
FeatherReader
- FeatherReader class
-
CsvTableReader
JsonTableReader
- Arrow CSV and JSON table reader classes
-
CsvReadOptions
CsvWriteOptions
CsvParseOptions
TimestampParser
CsvConvertOptions
JsonReadOptions
JsonParseOptions
- File reader options
-
RecordBatchReader
RecordBatchStreamReader
RecordBatchFileReader
- RecordBatchReader classes
-
RecordBatchWriter
RecordBatchStreamWriter
RecordBatchFileWriter
- RecordBatchWriter classes
-
as_record_batch_reader()
as_record_batch_reader.RecordBatchReader
as_record_batch_reader.Table
as_record_batch_reader.RecordBatch
as_record_batch_reader.data.frame
as_record_batch_reader.Dataset
as_record_batch_reader.function
as_record_batch_reader.arrow_dplyr_query
as_record_batch_reader.Scanner
- Convert an object to an Arrow RecordBatchReader
Low-level C++ wrappers
Low-level R6 class representations of Arrow C++ objects intended for advanced users.
-
Buffer
- Buffer class
-
Scalar
- Arrow scalars
-
Array
DictionaryArray
StructArray
ListArray
LargeListArray
FixedSizeListArray
MapArray
- Array Classes
-
ChunkedArray
- ChunkedArray class
-
RecordBatch
- RecordBatch class
-
Schema
- Schema class
-
Field
- Field class
-
Table
- Table class
-
DataType
- DataType class
-
ArrayData
- ArrayData class
-
DictionaryType
- DictionaryType class
-
FixedWidthType
- FixedWidthType class
-
ExtensionType
- ExtensionType class
-
ExtensionArray
- ExtensionArray class
Dataset and Filesystem R6 classes and helper functions
R6 classes and helper functions useful when working with multi-file datasets in Arrow.
-
Dataset
FileSystemDataset
UnionDataset
InMemoryDataset
DatasetFactory
FileSystemDatasetFactory
- Multi-file datasets
-
dataset_factory()
- Create a DatasetFactory
-
Partitioning
DirectoryPartitioning
HivePartitioning
DirectoryPartitioningFactory
HivePartitioningFactory
- Define Partitioning for a Dataset
-
Expression
- Arrow expressions
-
Scanner
ScannerBuilder
- Scan the contents of a dataset
-
FileFormat
ParquetFileFormat
IpcFileFormat
- Dataset file formats
-
CsvFileFormat
- CSV dataset file format
-
JsonFileFormat
- JSON dataset file format
-
FileWriteOptions
- Format-specific write options
-
FragmentScanOptions
CsvFragmentScanOptions
ParquetFragmentScanOptions
JsonFragmentScanOptions
- Format-specific scan options
-
hive_partition()
- Construct Hive partitioning
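For example, declaring the types of Hive-style partition keys when opening a dataset (the path and key names are illustrative):

    ds <- open_dataset(
      "data/logs",
      partitioning = hive_partition(year = int16(), month = int8())
    )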
-
map_batches()
- Apply a function to a stream of RecordBatches
-
FileSystem
LocalFileSystem
S3FileSystem
GcsFileSystem
SubTreeFileSystem
- FileSystem classes
-
FileInfo
- FileSystem entry info
-
FileSelector
- FileSelector class