Function reference
Multi-file datasets
- open_dataset() - Open a multi-file dataset (example below)
- open_delim_dataset() open_csv_dataset() open_tsv_dataset() - Open a multi-file dataset of CSV or other delimiter-separated format
- csv_read_options() - CSV reading options
- csv_parse_options() - CSV parsing options
- csv_convert_options() - CSV convert options
- write_dataset() - Write a dataset
- write_delim_dataset() write_csv_dataset() write_tsv_dataset() - Write a dataset into partitioned flat files
- csv_write_options() - CSV writing options
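A minimal sketch of the dataset workflow (the "mtcars_ds" directory name is illustrative): write_dataset() splits a data frame into partitioned Parquet files, and open_dataset() reads them back as a lazily evaluated multi-file dataset.

```r
library(arrow)
library(dplyr)

# Write mtcars as a Parquet dataset partitioned by `cyl`,
# then reopen it and query it lazily.
write_dataset(mtcars, "mtcars_ds", format = "parquet", partitioning = "cyl")

ds <- open_dataset("mtcars_ds")
ds |>
  filter(mpg > 25) |>
  select(mpg, cyl, hp) |>
  collect()
```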
Reading and writing files
- read_delim_arrow() read_csv_arrow() read_csv2_arrow() read_tsv_arrow() - Read a CSV or other delimited file with Arrow
- read_parquet() - Read a Parquet file (example below)
- read_feather() read_ipc_file() - Read a Feather file (an Arrow IPC file)
- read_ipc_stream() - Read Arrow IPC stream format
- read_json_arrow() - Read a JSON file
- write_csv_arrow() - Write a CSV file to disk
- write_parquet() - Write a Parquet file to disk
- write_feather() write_ipc_file() - Write a Feather file (an Arrow IPC file)
- write_ipc_stream() - Write Arrow IPC stream format
- write_to_raw() - Write Arrow data to a raw vector
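For single files, the read_*() and write_*() pairs mirror each other. A minimal sketch using temporary files:

```r
library(arrow)

# Round-trip a data frame through Parquet and Feather (Arrow IPC) files.
pq <- tempfile(fileext = ".parquet")
write_parquet(mtcars, pq)
read_parquet(pq)

ft <- tempfile(fileext = ".arrow")
write_feather(mtcars, ft)
read_feather(ft)

# write_to_raw() serialises to an IPC buffer in memory instead of a file.
raw_ipc <- write_to_raw(mtcars)
read_ipc_stream(raw_ipc)
```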
Creating Arrow data containers
- scalar() - Create an Arrow Scalar
- arrow_array() - Create an Arrow Array (example below)
- chunked_array() - Create a Chunked Array
- record_batch() - Create a RecordBatch
- arrow_table() - Create an Arrow Table
- buffer() - Create a Buffer
- vctrs_extension_array() vctrs_extension_type() - Extension type for generic typed vectors
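A minimal sketch of the constructors listed above:

```r
library(arrow)

s   <- scalar(42L)                                  # Scalar
a   <- arrow_array(c(1.5, 2.5, NA))                 # Array (float64, with a null)
ca  <- chunked_array(1:3, 4:6)                      # ChunkedArray with two chunks
rb  <- record_batch(x = 1:3, y = c("a", "b", "c"))  # RecordBatch
tab <- arrow_table(x = 1:3, y = c("a", "b", "c"))   # Table
buf <- buffer(as.raw(c(1, 2, 3)))                   # Buffer
```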
Working with Arrow data containers
Functions for converting R objects to Arrow data containers and combining Arrow data containers.
- as_arrow_array() - Convert an object to an Arrow Array (example below)
- as_chunked_array() - Convert an object to an Arrow ChunkedArray
- as_record_batch() - Convert an object to an Arrow RecordBatch
- as_arrow_table() - Convert an object to an Arrow Table
- concat_arrays() c(<Array>) - Concatenate zero or more Arrays
- concat_tables() - Concatenate one or more Tables
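A minimal sketch of converting R objects and combining the results:

```r
library(arrow)

# Convert R objects to Arrow containers, then combine them.
a1 <- as_arrow_array(1:3)
a2 <- as_arrow_array(4:6)
concat_arrays(a1, a2)              # a single Array of length 6

t1 <- as_arrow_table(head(mtcars))
t2 <- as_arrow_table(tail(mtcars))
concat_tables(t1, t2)              # a Table containing the rows of both
```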
Arrow data types and schemas
- int8() int16() int32() int64() uint8() uint16() uint32() uint64() float16() halffloat() float32() float() float64() boolean() bool() utf8() large_utf8() binary() large_binary() fixed_size_binary() string() date32() date64() time32() time64() duration() null() timestamp() decimal() decimal128() decimal256() struct() list_of() large_list_of() fixed_size_list_of() map_of() - Create Arrow data types
- dictionary() - Create a dictionary type
- new_extension_type() new_extension_array() register_extension_type() reregister_extension_type() unregister_extension_type() - Extension types
- vctrs_extension_array() vctrs_extension_type() - Extension type for generic typed vectors
- as_data_type() - Convert an object to an Arrow DataType
- infer_type() type() - Infer the Arrow Array type from an R object
- field() - Create a Field
- schema() - Create a schema or extract one from an object (example below)
- unify_schemas() - Combine and harmonize schemas
- as_schema() - Convert an object to an Arrow Schema
- infer_schema() - Extract a schema from an object
- read_schema() - Read a Schema from a stream
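Types, fields, and schemas compose as in this minimal sketch:

```r
library(arrow)

# Build a schema from fields and data types, then attach it when creating a Table.
sch <- schema(
  field("id", int32()),
  field("price", float64()),
  field("label", utf8())
)

tab <- arrow_table(
  id = 1:3,
  price = c(9.99, 19.99, 0.50),
  label = c("a", "b", "c"),
  schema = sch
)

tab$schema                                  # the schema we supplied
infer_type(1:3)                             # int32, the default type for R integers
unify_schemas(sch, schema(extra = bool()))  # combine schemas with disjoint fields
```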
Compute functions and dplyr support
- acero arrow-functions arrow-verbs arrow-dplyr - Functions available in Arrow dplyr queries (example below)
- call_function() - Call an Arrow compute function
- match_arrow() is_in() - Value matching for Arrow objects
- value_counts() - table for Arrow objects
- list_compute_functions() - List available Arrow C++ compute functions
- register_scalar_function() - Register user-defined functions
- show_exec_plan() - Show the details of an Arrow Execution Plan
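A minimal sketch of the compute interface: dplyr verbs on Arrow objects build a lazy query, and call_function() invokes a C++ kernel directly.

```r
library(arrow)
library(dplyr)

tab <- arrow_table(x = c(1, 2, 2, 3), grp = c("a", "a", "b", "b"))

# dplyr verbs build a lazy Arrow query; collect() materialises the result.
tab |>
  group_by(grp) |>
  summarise(total = sum(x)) |>
  collect()

show_exec_plan(tab |> filter(x > 1))   # inspect the Acero execution plan

call_function("sum", tab$x)            # call a compute kernel directly
head(list_compute_functions("^min"))   # discover available kernels
```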
DuckDB integration
- to_arrow() - Create an Arrow object from a DuckDB connection
- to_duckdb() - Create a (virtual) DuckDB table from an Arrow object (example below)
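A minimal sketch of the DuckDB hand-off, assuming the duckdb package is installed:

```r
library(arrow)
library(dplyr)

tab <- arrow_table(x = 1:5)

# Register the Arrow Table as a virtual DuckDB table and query it there;
# to_arrow() would hand an (unevaluated) DuckDB result back to Arrow.
tab |>
  to_duckdb() |>
  filter(x > 2) |>
  collect()
```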
Cloud storage and file systems
- s3_bucket() - Connect to an AWS S3 bucket (example below)
- gs_bucket() - Connect to a Google Cloud Storage (GCS) bucket
- copy_files() - Copy files between FileSystems
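A minimal sketch; the bucket and object paths below are placeholders, and this assumes the build includes S3 support (see arrow_with_s3()) and that credentials and network access are available.

```r
library(arrow)

# Connect to a bucket and open a dataset stored under it.
bucket <- s3_bucket("my-example-bucket")
ds <- open_dataset(bucket$path("warehouse/events"))

# copy_files() moves data between any two file systems, e.g. S3 to local disk.
copy_files(bucket$path("warehouse/events"), "events-local")
```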
Flight
- load_flight_server() - Load a Python Flight server
- flight_connect() - Connect to a Flight server (example below)
- flight_disconnect() - Explicitly close a Flight client
- flight_get() - Get data from a Flight server
- flight_put() - Send data to a Flight server
- list_flights() flight_path_exists() - See available resources on a Flight server
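A sketch of a round trip through Flight, assuming a Flight server is already running locally and that pyarrow is available via reticulate; the port and path are illustrative.

```r
library(arrow)

# Connect to a locally running Flight server (port is illustrative).
client <- flight_connect(port = 8089)

flight_put(client, data = mtcars, path = "uploaded/mtcars")
list_flights(client)
flight_path_exists(client, "uploaded/mtcars")
tab <- flight_get(client, "uploaded/mtcars")

flight_disconnect(client)
```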
Package configuration and installation
- arrow_info() arrow_available() arrow_with_acero() arrow_with_dataset() arrow_with_substrait() arrow_with_parquet() arrow_with_s3() arrow_with_gcs() arrow_with_json() - Report information on the package's capabilities (example below)
- cpu_count() set_cpu_count() - Manage the global CPU thread pool in libarrow
- io_thread_count() set_io_thread_count() - Manage the global I/O thread pool in libarrow
- install_arrow() - Install or upgrade the Arrow library
- install_pyarrow() - Install pyarrow for use with reticulate
- create_package_with_all_dependencies() - Create a source bundle that includes all third-party dependencies
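A short sketch of the capability and threading helpers:

```r
library(arrow)

arrow_info()        # versions, compiled-in capabilities, memory allocator
arrow_with_s3()     # TRUE if the build includes S3 support

cpu_count()         # size of the compute thread pool
set_cpu_count(4)    # e.g. restrict compute to four threads
io_thread_count()
set_io_thread_count(8)
```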
Low-level readers, writers, and streams
- InputStream RandomAccessFile MemoryMappedFile ReadableFile BufferReader - InputStream classes
- read_message() - Read a Message from a stream
- mmap_open() - Open a memory mapped file
- mmap_create() - Create a new read/write memory mapped file of a given size
- OutputStream FileOutputStream BufferOutputStream - OutputStream classes
- Message - Message class
- MessageReader - MessageReader class
- compression CompressedOutputStream CompressedInputStream - Compressed stream classes
- Codec - Compression Codec class
- codec_is_available() - Check whether a compression codec is available
- ParquetFileReader - ParquetFileReader class
- ParquetReaderProperties - ParquetReaderProperties class
- ParquetArrowReaderProperties - ParquetArrowReaderProperties class
- ParquetFileWriter - ParquetFileWriter class
- ParquetWriterProperties - ParquetWriterProperties class
- FeatherReader - FeatherReader class
- CsvTableReader JsonTableReader - Arrow CSV and JSON table reader classes
- CsvReadOptions CsvWriteOptions CsvParseOptions TimestampParser CsvConvertOptions JsonReadOptions JsonParseOptions - File reader options
- RecordBatchReader RecordBatchStreamReader RecordBatchFileReader - RecordBatchReader classes (example below)
- RecordBatchWriter RecordBatchStreamWriter RecordBatchFileWriter - RecordBatchWriter classes
- as_record_batch_reader() - Convert an object to an Arrow RecordBatchReader
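A minimal sketch of the stream classes (the temporary file path is illustrative): write record batches to an IPC stream with the low-level writer, then read them back one batch at a time.

```r
library(arrow)

tf <- tempfile(fileext = ".arrows")

# Write two record batches to an IPC stream using the low-level classes.
sink   <- FileOutputStream$create(tf)
writer <- RecordBatchStreamWriter$create(sink, schema(x = int32()))
writer$write(record_batch(x = 1:5))
writer$write(record_batch(x = 6:10))
writer$close()
sink$close()

# Read them back batch by batch.
src    <- ReadableFile$create(tf)
reader <- RecordBatchStreamReader$create(src)
batch  <- reader$read_next_batch()
src$close()

codec_is_available("zstd")   # check whether a compression codec was built in
```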
Low-level C++ wrappers
Low-level R6 class representations of Arrow C++ objects intended for advanced users.
- Buffer - Buffer class
- Scalar - Arrow scalars
- Array DictionaryArray StructArray ListArray LargeListArray FixedSizeListArray MapArray - Array classes
- ChunkedArray - ChunkedArray class
- RecordBatch - RecordBatch class
- Schema - Schema class
- Field - Field class
- Table - Table class (example below)
- DataType - DataType class
- ArrayData - ArrayData class
- DictionaryType - DictionaryType class
- FixedWidthType - FixedWidthType class
- ExtensionType - ExtensionType class
- ExtensionArray - ExtensionArray class
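These R6 objects expose the underlying Arrow C++ methods and fields directly; a small sketch:

```r
library(arrow)

tab <- arrow_table(x = 1:3, y = c("a", "b", "c"))
tab$schema            # Schema
tab$num_rows
tab$column(0)         # ChunkedArray (columns are 0-indexed)

a <- arrow_array(c(1L, NA, 3L))
a$type                # DataType: int32
a$null_count
as.vector(a)          # back to an R vector
```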
Dataset and Filesystem R6 classes and helper functions
R6 classes and helper functions useful when working with multi-file datasets in Arrow.
- Dataset FileSystemDataset UnionDataset InMemoryDataset DatasetFactory FileSystemDatasetFactory - Multi-file datasets
- dataset_factory() - Create a DatasetFactory
- Partitioning DirectoryPartitioning HivePartitioning DirectoryPartitioningFactory HivePartitioningFactory - Define Partitioning for a Dataset
- Expression - Arrow expressions
- Scanner ScannerBuilder - Scan the contents of a dataset (example below)
- FileFormat ParquetFileFormat IpcFileFormat - Dataset file formats
- CsvFileFormat - CSV dataset file format
- JsonFileFormat - JSON dataset file format
- FileWriteOptions - Format-specific write options
- FragmentScanOptions CsvFragmentScanOptions ParquetFragmentScanOptions JsonFragmentScanOptions - Format-specific scan options
- hive_partition() - Construct Hive partitioning
- map_batches() - Apply a function to a stream of RecordBatches
- FileSystem LocalFileSystem S3FileSystem GcsFileSystem SubTreeFileSystem - FileSystem classes
- FileInfo - FileSystem entry info
- FileSelector - File selector
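A small sketch tying several of these classes together (directory names are illustrative): write a hive-partitioned dataset, then scan it with an explicit Scanner and a filter Expression.

```r
library(arrow)

# Write a hive-style partitioned dataset (directories like cyl=4/, cyl=6/, ...).
write_dataset(mtcars, "mtcars_hive", partitioning = "cyl")

ds <- open_dataset("mtcars_hive", partitioning = hive_partition(cyl = int32()))
ds$files                                         # one file per partition directory

# Build a Scanner with a filter Expression and materialise the result.
scan <- Scanner$create(ds, filter = Expression$field_ref("mpg") > 20)
scan$ToTable()
```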