A FragmentScanOptions holds options specific to a FileFormat and a scan operation.
FragmentScanOptions$create() takes the following arguments:
format
: A string identifier of the file format. Currently supported values:
  - "parquet"
  - "csv"/"text", aliases for the same format.
...
: Additional format-specific options
format = "parquet":
use_buffered_stream
: Read files through buffered input streams rather than
loading entire row groups at once. This may be enabled
to reduce memory overhead. Disabled by default.
buffer_size
: Size of the buffered stream, if enabled. Default is 8 KB.
pre_buffer
: Pre-buffer the raw Parquet data. This can improve performance
on high-latency filesystems. Disabled by default.
format = "text"
: See CsvConvertOptions. Options must be specified using the
Arrow C++ library names. "block_size" from
CsvReadOptions may also be given.
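
As a minimal sketch of how these arguments might be passed, the example below builds one options object per format. It assumes the arrow R package is installed with dataset support; the buffer size, null markers, and block size are arbitrary illustrative values, not recommendations.

```r
library(arrow)

# Parquet: read through a buffered input stream instead of loading
# whole row groups. 16384 bytes is an illustrative buffer size.
parquet_opts <- FragmentScanOptions$create(
  "parquet",
  use_buffered_stream = TRUE,
  buffer_size = 16384
)

# Text/CSV: convert options use the Arrow C++ names
# (null_values, strings_can_be_null); block_size comes from CsvReadOptions.
# All values here are illustrative.
csv_opts <- FragmentScanOptions$create(
  "text",
  null_values = c("", "NA"),
  strings_can_be_null = TRUE,
  block_size = 1048576L
)
```

Both calls go through the same factory; only the format string and the format-specific options differ.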
FragmentScanOptions$create() returns the appropriate subclass of FragmentScanOptions
(e.g. CsvFragmentScanOptions).
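
The dispatch on format can be checked with a short sketch like the one below. It assumes the arrow R package is installed with dataset support, and that the Parquet subclass is named ParquetFragmentScanOptions, which this page does not state explicitly.

```r
library(arrow)

# Create options with defaults for each supported format string.
parquet_opts <- FragmentScanOptions$create("parquet")
csv_opts <- FragmentScanOptions$create("csv")
text_opts <- FragmentScanOptions$create("text")

# Every result is a FragmentScanOptions.
inherits(parquet_opts, "FragmentScanOptions")

# "csv" and "text" are aliases, so both should yield the CSV subclass.
inherits(csv_opts, "CsvFragmentScanOptions")
inherits(text_opts, "CsvFragmentScanOptions")

# Assumed subclass name for the Parquet format.
inherits(parquet_opts, "ParquetFragmentScanOptions")
```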