pyarrow.dataset.FileSystemFactoryOptions

class pyarrow.dataset.FileSystemFactoryOptions(partition_base_dir=None, partitioning=None, exclude_invalid_files=None, selector_ignore_prefixes=None)

Bases: _Weakrefable

Influences the discovery of filesystem paths.

Parameters:
partition_base_dir : str, optional

For the purposes of applying the partitioning, paths will be stripped of the partition_base_dir. Files not matching the partition_base_dir prefix will be skipped for partitioning discovery. The ignored files will still be part of the Dataset, but will not have partition information.

partitioning : Partitioning/PartitioningFactory, optional

Apply the Partitioning to every discovered Fragment. See Partitioning or PartitioningFactory documentation.

exclude_invalid_files : bool, optional (default True)

If True, invalid files will be excluded (file format specific check). This incurs IO for each file in a serial, single-threaded fashion. Disabling this feature skips the IO, but unsupported files may then be present in the Dataset (resulting in an error at scan time).

selector_ignore_prefixes : list, optional

When discovering from a Selector (and not from an explicit file list), ignore files and directories matching any of these prefixes. By default this is ['.', '_'].

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

Attributes

exclude_invalid_files

Whether to exclude invalid files.

partition_base_dir

Base directory to strip paths before applying the partitioning.

partitioning

Partitioning to apply to discovered files.

partitioning_factory

PartitioningFactory to apply to discovered files and discover a Partitioning.

selector_ignore_prefixes

List of prefixes; files matching one of them are ignored by the discovery process.

exclude_invalid_files

Whether to exclude invalid files.

partition_base_dir

Base directory to strip paths before applying the partitioning.

partitioning

Partitioning to apply to discovered files.

NOTE: setting this property will overwrite partitioning_factory.

partitioning_factory

PartitioningFactory to apply to discovered files and discover a Partitioning.

NOTE: setting this property will overwrite partitioning.

selector_ignore_prefixes

List of prefixes. Files matching one of those prefixes will be ignored by the discovery process.