This class enables you to interact with Parquet files.

Factory

The ParquetFileReader$create() factory method instantiates the object and takes the following arguments:

  • file: A character file name, raw vector, or Arrow file connection object (e.g. RandomAccessFile).

  • props: Optional ParquetReaderProperties.

  • mmap: Logical; whether to memory-map the file (default TRUE).

  • ...: Additional arguments, currently ignored.
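
The arguments above can be combined in a few ways. The following is a minimal sketch, not taken from the package documentation, that opens the same file once by name with memory-mapping turned off and once from a raw vector of its bytes. It assumes the arrow package is installed and ships the v0.7.1.parquet sample file used in the Examples; the variable names (pq_plain, pq_raw, bytes) are illustrative only.

```r
library(arrow)

# Sample Parquet file bundled with the arrow package (same file as in the Examples).
f <- system.file("v0.7.1.parquet", package = "arrow")

# Open by file name, without memory-mapping the file.
pq_plain <- ParquetFileReader$create(f, mmap = FALSE)

# Open from a raw vector that holds the file's bytes.
bytes <- readBin(f, what = "raw", n = file.size(f))
pq_raw <- ParquetFileReader$create(bytes)

# Both readers should report the same column names.
identical(pq_plain$GetSchema()$names, pq_raw$GetSchema()$names)
```

Passing a raw vector can be handy when the Parquet bytes arrive from somewhere other than the local file system, for example a download held in memory.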

Methods

  • $ReadTable(col_select): get an arrow::Table from the file, optionally restricted to the columns selected by a character vector of column names or a tidyselect specification.

  • $GetSchema(): get the arrow::Schema of the data in the file
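
As a rough illustration of how the two methods relate, the sketch below reads the schema first and then the whole table, and checks that both report the same columns. It assumes the arrow package and its bundled v0.7.1.parquet sample file; calling $ReadTable() with no selection is assumed to return every column, and the names sch, tab, and df are illustrative only.

```r
library(arrow)

f <- system.file("v0.7.1.parquet", package = "arrow")
pq <- ParquetFileReader$create(f)

# Inspect the schema before reading any data.
sch <- pq$GetSchema()
sch$names

# The data columns in this file are snappy-compressed, so guard the read.
if (codec_is_available("snappy")) {
  tab <- pq$ReadTable()                     # no selection: all columns
  print(identical(tab$schema$names, sch$names))

  df <- as.data.frame(tab)                  # materialize as an R data frame
  print(dim(df))
}
```

Checking the schema first is a cheap way to decide which columns to request before reading a large file.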

Examples

# \donttest{
f <- system.file("v0.7.1.parquet", package="arrow")
pq <- ParquetFileReader$create(f)
pq$GetSchema()
#> Schema
#> carat: double
#> cut: string
#> color: string
#> clarity: string
#> depth: double
#> table: double
#> price: int64
#> x: double
#> y: double
#> z: double
#> __index_level_0__: int64
#>
#> See $metadata for additional Schema metadata
if (codec_is_available("snappy")) {
  # This file has compressed data columns
  tab <- pq$ReadTable(starts_with("c"))
  tab$schema
}
#> Schema
#> carat: double
#> cut: string
#> color: string
#> clarity: string
#>
#> See $metadata for additional Schema metadata
# }