Feather provides binary columnar serialization for data frames.
It is designed to make reading and writing data frames efficient,
and to make sharing data across data analysis languages easy.
read_feather() can read both Feather Version 1 (V1), a legacy format available since 2016,
and Version 2 (V2), which is the Apache Arrow IPC file format.
read_ipc_file() is an alias of read_feather().
Usage
read_feather(file, col_select = NULL, as_data_frame = TRUE, mmap = TRUE)
read_ipc_file(file, col_select = NULL, as_data_frame = TRUE, mmap = TRUE)
Arguments
- file
A character file name or URI, raw vector, an Arrow input stream, or a FileSystem with path (SubTreeFileSystem). If a file name or URI, an Arrow InputStream will be opened and closed when finished. If an input stream is provided, it will be left open.
- col_select
A character vector of column names to keep, as in the "select" argument to data.table::fread(), or a tidy selection specification of columns, as used in dplyr::select().
- as_data_frame
Should the function return a data.frame (default) or an Arrow Table?
- mmap
Logical: whether to memory-map the file (default TRUE).
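The two forms of col_select described above can be sketched as follows. This is a minimal illustration, not part of the original examples; the object names tf_args, df_chr and df_tidy are hypothetical, and mmap = FALSE is passed only to show the argument in use.

```r
library(arrow)

# Hypothetical temporary file used only for this illustration.
tf_args <- tempfile(fileext = ".arrow")
write_feather(mtcars, tf_args)

# Character-vector selection, read without memory-mapping the file.
df_chr <- read_feather(tf_args, col_select = c("mpg", "cyl"), mmap = FALSE)
names(df_chr)

# Tidy selection, as in dplyr::select().
df_tidy <- read_feather(tf_args, col_select = starts_with("d"))
names(df_tidy)

unlink(tf_args)
```

Both calls return only the requested columns; the rest of the file is not converted to R objects.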
Value
A data.frame if as_data_frame is TRUE (the default), or an Arrow Table otherwise.
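As a rough sketch of the two return types, the snippet below reads the same file with as_data_frame = FALSE and then converts the result explicitly. The names tf_tbl, tbl and df_back are hypothetical, and the explicit as.data.frame() step is one possible way to materialize the Table, shown here as an assumption about typical usage.

```r
library(arrow)

# Hypothetical temporary file used only for this illustration.
tf_tbl <- tempfile(fileext = ".arrow")
write_feather(mtcars, tf_tbl)

# Keep the data as an Arrow Table instead of converting to a data.frame.
tbl <- read_feather(tf_tbl, as_data_frame = FALSE)
inherits(tbl, "Table")
dim(tbl)

# Convert to a data.frame later, once any column or row subsetting is done.
df_back <- as.data.frame(tbl)

unlink(tf_tbl)
```

Keeping the result as a Table can be useful when the data will be filtered or passed to other Arrow functions before it is needed in R.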
See also
FeatherReader and RecordBatchReader for lower-level access to reading Arrow IPC data.
Examples
# We recommend the ".arrow" extension for Arrow IPC files (Feather V2).
tf <- tempfile(fileext = ".arrow")
on.exit(unlink(tf))
write_feather(mtcars, tf)
df <- read_feather(tf)
dim(df)
#> [1] 32 11
# Can select columns
df <- read_feather(tf, col_select = starts_with("d"))
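The description states that both Feather V1 and V2 files can be read. The sketch below assumes that write_feather() accepts a version argument for producing a V1 file, and that the installed arrow version still supports writing V1. The names tf_v1, tf_v2, df_v1 and df_v2 are hypothetical, and the ".feather" extension for the V1 file is only a convention chosen for this illustration.

```r
library(arrow)

# Hypothetical temporary files used only for this illustration.
tf_v1 <- tempfile(fileext = ".feather")
tf_v2 <- tempfile(fileext = ".arrow")

# Write the same data in both formats (assumes `version = 1` is supported).
write_feather(mtcars, tf_v1, version = 1)
write_feather(mtcars, tf_v2)

# The reader is called the same way for either version.
df_v1 <- read_feather(tf_v1)
df_v2 <- read_ipc_file(tf_v2)  # alias of read_feather()

dim(df_v1)
dim(df_v2)

unlink(c(tf_v1, tf_v2))
```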