Skip to contents

'Parquet' is a columnar storage file format. This function enables you to read Parquet files into R.

Usage

read_parquet(
  file,
  col_select = NULL,
  as_data_frame = TRUE,
  props = ParquetArrowReaderProperties$create(),
  ...
)

Arguments

file

A character file name or URI, raw vector, an Arrow input stream, or a FileSystem with path (SubTreeFileSystem). If a file name or URI, an Arrow InputStream will be opened and closed when finished. If an input stream is provided, it will be left open.

col_select

A character vector of column names to keep, as in the "select" argument to data.table::fread(), or a tidy selection specification of columns, as used in dplyr::select().

as_data_frame

Should the function return a tibble (default) or an Arrow Table?

props

ParquetArrowReaderProperties

...

Additional arguments passed to ParquetFileReader$create()

Value

A tibble if as_data_frame is TRUE (the default), or an Arrow Table otherwise.

Examples

tf <- tempfile()
on.exit(unlink(tf))
write_parquet(mtcars, tf)
df <- read_parquet(tf, col_select = starts_with("d"))
head(df)
#> # A tibble: 6 x 2
#>    disp  drat
#>   <dbl> <dbl>
#> 1   160  3.9 
#> 2   160  3.9 
#> 3   108  3.85
#> 4   258  3.08
#> 5   360  3.15
#> 6   225  2.76