Function reference
-
open_dataset()
- Open a multi-file dataset
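For example, a minimal sketch of opening a partitioned Parquet directory and querying it lazily with dplyr; the "data/" path and the year/month/id/value columns are hypothetical:

    library(arrow)
    library(dplyr)

    # Hypothetical directory of Hive-partitioned Parquet files,
    # e.g. data/year=2024/month=1/part-0.parquet
    ds <- open_dataset("data/", format = "parquet")

    ds %>%
      filter(year == 2024, month == 1) %>%
      select(id, value) %>%
      collect()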
-
open_delim_dataset()
open_csv_dataset()
open_tsv_dataset()
- Open a multi-file dataset of CSV or other delimiter-separated format
-
write_dataset()
- Write a dataset
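A minimal sketch using the built-in mtcars data; the "mtcars_ds/" output directory is arbitrary:

    library(arrow)

    # Write mtcars as a Parquet dataset partitioned by the 'cyl' column
    write_dataset(mtcars, "mtcars_ds", format = "parquet", partitioning = "cyl")

    # The result can be reopened lazily with open_dataset()
    open_dataset("mtcars_ds")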
-
dataset_factory()
- Create a DatasetFactory
-
hive_partition()
- Construct Hive partitioning
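For example, declaring typed year=/month= partitions; the column names, types, and "data/" path are assumptions for illustration:

    library(arrow)

    part <- hive_partition(year = int16(), month = int8())
    ds <- open_dataset("data/", partitioning = part)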
-
Dataset
FileSystemDataset
UnionDataset
InMemoryDataset
DatasetFactory
FileSystemDatasetFactory
- Multi-file datasets
-
Partitioning
DirectoryPartitioning
HivePartitioning
DirectoryPartitioningFactory
HivePartitioningFactory
- Define Partitioning for a Dataset
-
Expression
- Arrow expressions
-
Scanner
ScannerBuilder
- Scan the contents of a dataset
-
FileFormat
ParquetFileFormat
IpcFileFormat
- Dataset file formats
-
CsvFileFormat
- CSV dataset file format
-
FileWriteOptions
- Format-specific write options
-
FragmentScanOptions
CsvFragmentScanOptions
ParquetFragmentScanOptions
- Format-specific scan options
-
map_batches()
- Apply a function to a stream of RecordBatches
-
read_feather()
read_ipc_file()
- Read a Feather file (an Arrow IPC file)
-
read_ipc_stream()
- Read Arrow IPC stream format
-
read_parquet()
- Read a Parquet file
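A minimal sketch; "example.parquet" is a hypothetical file path:

    library(arrow)

    df  <- read_parquet("example.parquet")                         # tibble/data.frame
    tab <- read_parquet("example.parquet", as_data_frame = FALSE)  # Arrow Table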
-
read_delim_arrow()
read_csv_arrow()
read_tsv_arrow()
- Read a CSV or other delimited file with Arrow
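A minimal sketch; "example.csv" and the selected column names are hypothetical, and col_select/as_data_frame illustrate commonly used options:

    library(arrow)

    df  <- read_csv_arrow("example.csv")
    tab <- read_csv_arrow("example.csv",
                          col_select = c("id", "value"),
                          as_data_frame = FALSE)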
-
read_json_arrow()
- Read a JSON file
-
write_feather()
write_ipc_file()
- Write a Feather file (an Arrow IPC file)
-
write_ipc_stream()
- Write Arrow IPC stream format
-
write_to_raw()
- Write Arrow data to a raw vector
-
write_parquet()
- Write Parquet file to disk
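A minimal sketch; the output path is arbitrary, and "snappy" compression is only available if libarrow was built with Snappy support:

    library(arrow)

    write_parquet(mtcars, "mtcars.parquet", compression = "snappy")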
-
write_csv_arrow()
- Write CSV file to disk
-
ParquetFileReader
- ParquetFileReader class
-
ParquetArrowReaderProperties
- ParquetArrowReaderProperties class
-
ParquetFileWriter
- ParquetFileWriter class
-
ParquetWriterProperties
- ParquetWriterProperties class
-
FeatherReader
- FeatherReader class
-
CsvTableReader
JsonTableReader
- Arrow CSV and JSON table reader classes
-
RecordBatchReader
RecordBatchStreamReader
RecordBatchFileReader
- RecordBatchReader classes
-
RecordBatchWriter
RecordBatchStreamWriter
RecordBatchFileWriter
- RecordBatchWriter classes
-
CsvReadOptions
CsvWriteOptions
CsvParseOptions
TimestampParser
CsvConvertOptions
JsonReadOptions
JsonParseOptions
- File reader options
-
as_record_batch_reader
as_record_batch_reader.RecordBatchReader
as_record_batch_reader.Table
as_record_batch_reader.RecordBatch
as_record_batch_reader.data.frame
as_record_batch_reader.Dataset
as_record_batch_reader.function
as_record_batch_reader.arrow_dplyr_query
as_record_batch_reader.Scanner
- Convert an object to an Arrow RecordBatchReader
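For example, streaming over an in-memory data frame in batches (mtcars is used only for illustration):

    library(arrow)

    reader <- as_record_batch_reader(mtcars)
    batch <- reader$read_next_batch()  # a RecordBatch, or NULL when the stream is exhausted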
-
array
Array
DictionaryArray
StructArray
ListArray
LargeListArray
FixedSizeListArray
MapArray
StructScalar
- Arrow Arrays
-
chunked_array()
- ChunkedArray class
-
Scalar
- Arrow scalars
-
record_batch()
- RecordBatch class
-
arrow_table()
- Table class
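A minimal sketch of building a Table from R vectors; the column names are arbitrary:

    library(arrow)

    tab <- arrow_table(x = 1:5, y = letters[1:5])
    tab$schema
    as.data.frame(tab)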
-
ArrayData
- ArrayData class
-
buffer()
- Buffer class
-
read_message()
- Read a Message from a stream
-
concat_arrays()
c(<Array>)
- Concatenate zero or more Arrays
-
concat_tables()
- Concatenate one or more Tables
-
ExtensionArray
- class arrow::ExtensionArray
-
vctrs_extension_array()
vctrs_extension_type()
- Extension type for generic typed vectors
-
as_arrow_array()
- Convert an object to an Arrow Array
-
as_chunked_array()
- Convert an object to an Arrow ChunkedArray
-
as_record_batch()
- Convert an object to an Arrow RecordBatch
-
as_arrow_table()
- Convert an object to an Arrow Table
-
schema()
- Schema class
-
unify_schemas()
- Combine and harmonize schemas
-
infer_type()
type()
- Infer the arrow Array type from an R object
-
dictionary()
- Create a dictionary type
-
field()
- Field class
-
read_schema()
- Read a Schema from a stream
-
int8()
int16()
int32()
int64()
uint8()
uint16()
uint32()
uint64()
float16()
halffloat()
float32()
float()
float64()
boolean()
bool()
utf8()
large_utf8()
binary()
large_binary()
fixed_size_binary()
string()
date32()
date64()
time32()
time64()
duration()
null()
timestamp()
decimal()
decimal128()
decimal256()
struct()
list_of()
large_list_of()
fixed_size_list_of()
map_of()
- Apache Arrow data types
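For example, composing a schema from these type constructors (the field names are illustrative); such a schema can then be passed to readers like open_dataset() or read_csv_arrow():

    library(arrow)

    sch <- schema(
      id = int64(),
      name = utf8(),
      score = float64(),
      ts = timestamp("ms", timezone = "UTC")
    )

    # e.g. open_dataset("data/", schema = sch), with a hypothetical path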
-
DataType
- class arrow::DataType
-
DictionaryType
- class DictionaryType
-
FixedWidthType
- class arrow::FixedWidthType
-
new_extension_type()
new_extension_array()
register_extension_type()
reregister_extension_type()
unregister_extension_type()
- Extension types
-
ExtensionType
- class arrow::ExtensionType
-
as_data_type()
- Convert an object to an Arrow DataType
-
as_schema()
- Convert an object to an Arrow Schema
-
load_flight_server()
- Load a Python Flight server
-
flight_connect()
- Connect to a Flight server
-
flight_disconnect()
- Explicitly close a Flight client
-
flight_get()
- Get data from a Flight server
-
flight_put()
- Send data to a Flight server
-
list_flights()
flight_path_exists()
- See available resources on a Flight server
-
s3_bucket()
- Connect to an AWS S3 bucket
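A minimal sketch; "voltrondata-labs-datasets" is the public example bucket used in the arrow documentation, and the "nyc-taxi" path may change over time:

    library(arrow)

    bucket <- s3_bucket("voltrondata-labs-datasets", anonymous = TRUE)
    ds <- open_dataset(bucket$path("nyc-taxi"))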
-
gs_bucket()
- Connect to a Google Cloud Storage (GCS) bucket
-
FileSystem
LocalFileSystem
S3FileSystem
GcsFileSystem
SubTreeFileSystem
- FileSystem classes
-
FileInfo
- FileSystem entry info
-
FileSelector
- File selector
-
copy_files()
- Copy files between FileSystems
-
InputStream
RandomAccessFile
MemoryMappedFile
ReadableFile
BufferReader
- InputStream classes
-
mmap_open()
- Open a memory mapped file
-
mmap_create()
- Create a new read/write memory mapped file of a given size
-
OutputStream
FileOutputStream
BufferOutputStream
- OutputStream classes
-
Message
- class arrow::Message
-
MessageReader
- class arrow::MessageReader
-
compression
CompressedOutputStream
CompressedInputStream
- Compressed stream classes
-
Codec
- Compression Codec class
-
codec_is_available()
- Check whether a compression codec is available
-
acero
- Functions available in Arrow dplyr queries
-
call_function()
- Call an Arrow compute function
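For example, calling built-in compute kernels by name (the option values are illustrative):

    library(arrow)

    a <- Array$create(c(1L, 5L, 3L, NA))
    call_function("sum", a)
    call_function("min_max", a, options = list(skip_nulls = TRUE))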
-
match_arrow()
is_in()
- match and %in% for Arrow objects
-
value_counts()
- table for Arrow objects
-
list_compute_functions()
- List available Arrow C++ compute functions
-
register_scalar_function()
- Register user-defined functions
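A minimal sketch; the function name "my_is_positive" and the chosen types are illustrative:

    library(arrow)
    library(dplyr)

    register_scalar_function(
      "my_is_positive",
      function(context, x) x > 0,  # the first argument is the kernel context
      in_type = float64(),
      out_type = boolean(),
      auto_convert = TRUE
    )

    # The registered function can then be used in Arrow dplyr pipelines
    arrow_table(x = c(-1, 0, 2)) %>%
      mutate(pos = my_is_positive(x)) %>%
      collect()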
-
show_exec_plan()
- Show the details of an Arrow Execution Plan
-
to_arrow()
- Create an Arrow object from others
-
to_duckdb()
- Create a (virtual) DuckDB table from an Arrow object
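A minimal sketch of handing Arrow data to DuckDB and back without copying; it assumes the duckdb package is installed, and the "data/" path and 'year' column are hypothetical:

    library(arrow)
    library(dplyr)

    open_dataset("data/") %>%
      to_duckdb() %>%
      group_by(year) %>%
      summarise(n = n()) %>%
      to_arrow() %>%
      collect()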
-
arrow_info()
arrow_available()
arrow_with_acero()
arrow_with_dataset()
arrow_with_substrait()
arrow_with_parquet()
arrow_with_s3()
arrow_with_gcs()
arrow_with_json()
- Report information on the package's capabilities
-
cpu_count()
set_cpu_count()
- Manage the global CPU thread pool in libarrow
-
io_thread_count()
set_io_thread_count()
- Manage the global I/O thread pool in libarrow
-
install_arrow()
- Install or upgrade the Arrow library
-
install_pyarrow()
- Install pyarrow for use with reticulate
-
create_package_with_all_dependencies()
- Create a source bundle that includes all third-party dependencies