dataset_factory(
  x,
  filesystem = NULL,
  format = c("parquet", "arrow", "ipc", "feather", "csv", "tsv", "text"),
  partitioning = NULL,
  ...
)
    Arguments
    
    
    
      | x | 
      A string path to a directory containing data files, or
a list of DatasetFactory objects whose datasets should be
combined. If a list is given, it will be used to construct a
UnionDatasetFactory and the other arguments will be ignored.  | 
    
    
      | filesystem | 
      A FileSystem object; if omitted, the FileSystem will
be detected from x  | 
    
    
      | format | 
      A FileFormat object, or a string identifier of the format of
the files in x. Currently supported values: 
"parquet"  
"ipc"/"arrow"/"feather", all aliases for each other; for Feather, note that
only version 2 files are supported  
"csv"/"text", aliases for the same thing (because comma is the default
delimiter for text files)  
"tsv", equivalent to passing format = "text", delimiter = "\t"  
 
Default is "parquet", unless a delimiter is also specified, in which case
it is assumed to be "text".  | 
    
    
      | partitioning | 
      One of 
A Schema, in which case the file paths relative to x will be
parsed, and path segments will be matched with the schema fields. For
example, schema(year = int16(), month = int8()) would create partitions
for file paths like "2019/01/file.parquet", "2019/02/file.parquet", etc.  
A character vector that defines the field names corresponding to those
path segments (that is, you're providing the names that would correspond
to a Schema but the types will be autodetected)  
A HivePartitioning or HivePartitioningFactory, as returned
by hive_partition(), which parses explicit or autodetected fields from
Hive-style path segments  
NULL for no partitioning
  
  | 
    
    
      | ... | 
      Additional format-specific options, passed to
FileFormat$create(). For CSV options, note that you can specify them either
with the Arrow C++ library naming ("delimiter", "quoting", etc.) or the
readr-style naming used in read_csv_arrow() ("delim", "quote", etc.)  | 
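
The Schema and character-vector forms of partitioning described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the temporary directory, its "2019/01" layout, and the value column are hypothetical and exist only to make the example self-contained.

```r
library(arrow)

# Hypothetical directory laid out as <year>/<month>/file.parquet
root <- tempfile("ds_")
for (month in c("01", "02")) {
  dir.create(file.path(root, "2019", month), recursive = TRUE)
  write_parquet(
    data.frame(value = c(1.5, 2.5)),
    file.path(root, "2019", month, "file.parquet")
  )
}

# Partitioning as a Schema: field names and types are given explicitly
typed <- dataset_factory(
  root,
  format = "parquet",
  partitioning = schema(year = int16(), month = int8())
)

# Partitioning as a character vector: names only, types are autodetected
named <- dataset_factory(root, partitioning = c("year", "month"))

# A factory is turned into a Dataset by passing it to open_dataset() in a list
ds <- open_dataset(list(typed))
```

The resulting Dataset should expose year and month as columns alongside value, with the types taken from the Schema in the first case and inferred from the path segments in the second.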
    
    
    Value
    A DatasetFactory object. Pass it to open_dataset() in a list,
potentially together with other DatasetFactory objects, to create
a Dataset.
    Details
    If you only need a single DatasetFactory (for example, you have a
single directory containing Parquet files), you can call open_dataset()
directly. Use dataset_factory() when you
want to combine different directories, file systems, or file formats.
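
As a rough sketch of the combining case, the example below builds one factory per directory, one for Parquet files and one for CSV files, and opens them as a single Dataset. The directories, file names, and columns are hypothetical. The two sources are deliberately written with matching column names and types (a string and a non-integer double), on the assumption that the combined factory needs compatible schemas.

```r
library(arrow)

# Two hypothetical directories holding the same columns in different formats
pq_dir <- tempfile("parquet_")
csv_dir <- tempfile("csv_")
dir.create(pq_dir)
dir.create(csv_dir)

write_parquet(
  data.frame(name = c("a", "b", "c"), value = c(1.5, 2.5, 3.5),
             stringsAsFactors = FALSE),
  file.path(pq_dir, "part.parquet")
)
write.csv(
  data.frame(name = c("d", "e", "f"), value = c(4.5, 5.5, 6.5)),
  file.path(csv_dir, "part.csv"),
  row.names = FALSE
)

# One factory per directory, each with its own file format
pq_factory <- dataset_factory(pq_dir, format = "parquet")
csv_factory <- dataset_factory(csv_dir, format = "csv")

# Passing the list of factories to open_dataset() yields one combined Dataset
combined <- open_dataset(list(pq_factory, csv_factory))
```

With only one of these directories, calling open_dataset() on the path directly would be the simpler choice, as noted above.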