pyarrow.fs.SubTreeFileSystem

class pyarrow.fs.SubTreeFileSystem(base_path, FileSystem base_fs)

Bases: pyarrow._fs.FileSystem

Delegates to another implementation after prepending a fixed base path.

This is useful to expose a logical view of a subtree of a filesystem, for example a directory in a LocalFileSystem.

Note that this makes no security guarantee. For example, symlinks may allow a path to “escape” the subtree and access other parts of the underlying filesystem.

Parameters
base_path: str

The root of the subtree.

base_fs: FileSystem

FileSystem object that the operations are delegated to.

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

copy_file(self, src, dest)

Copy a file.

create_dir(self, path, *, bool recursive=True)

Create a directory and subdirectories.

delete_dir(self, path)

Delete a directory and its contents, recursively.

delete_dir_contents(self, path, *, ...)

Delete a directory's contents, recursively.

delete_file(self, path)

Delete a file.

equals(self, FileSystem other)

from_uri(uri)

Create a new FileSystem from URI or Path.

get_file_info(self, paths_or_selector)

Get info for the given files.

move(self, src, dest)

Move / rename a file or directory.

normalize_path(self, path)

Normalize filesystem path.

open_append_stream(self, path[, ...])

Open an output stream for appending.

open_input_file(self, path)

Open an input file for random access reading.

open_input_stream(self, path[, compression, ...])

Open an input stream for sequential reading.

open_output_stream(self, path[, ...])

Open an output stream for sequential writing.

Attributes

base_fs

base_path

type_name

The filesystem's type name.

base_fs

base_path

copy_file(self, src, dest)

Copy a file.

If the destination exists and is a directory, an error is returned. Otherwise, it is replaced.

Parameters
src: str

The path of the file to be copied from.

dest: str

The destination path where the file is copied to.

create_dir(self, path, *, bool recursive=True)

Create a directory and subdirectories.

This function succeeds if the directory already exists.

Parameters
path: str

The path of the new directory.

recursive: bool, default True

Create nested directories as well.

delete_dir(self, path)

Delete a directory and its contents, recursively.

Parameters
path: str

The path of the directory to be deleted.

delete_dir_contents(self, path, *, bool accept_root_dir=False)

Delete a directory’s contents, recursively.

Like delete_dir, but doesn’t delete the directory itself.

Parameters
path: str

The path of the directory whose contents are to be deleted.

accept_root_dir: bool, default False

Allow deleting the root directory’s contents (if path is empty or “/”).

delete_file(self, path)

Delete a file.

Parameters
path: str

The path of the file to be deleted.

equals(self, FileSystem other)

static from_uri(uri)

Create a new FileSystem from URI or Path.

Recognized URI schemes are “file”, “mock”, “s3fs”, “hdfs” and “viewfs”. In addition, the argument can be a pathlib.Path object, or a string describing an absolute local path.

Parameters
uri: str

URI-based path, for example: file:///some/local/path.

Returns
tuple of (FileSystem, str path)

A (filesystem, path) tuple, where path is the abstract path inside the FileSystem instance.

get_file_info(self, paths_or_selector)

Get info for the given files.

Any symlink is automatically dereferenced, recursively. A non-existing or unreachable file yields a FileInfo object whose type is FileType.NotFound. An exception indicates a truly exceptional condition (low-level I/O error, etc.).

Parameters
paths_or_selector: FileSelector, path-like or list of path-likes

Either a selector object, a path-like object or a list of path-like objects. The selector’s base directory will not be part of the results, even if it exists. If the base directory doesn’t exist, set the selector’s allow_not_found option.

Returns
FileInfo or list of FileInfo

A single FileInfo object is returned for a single path; otherwise, a list of FileInfo objects is returned.

move(self, src, dest)

Move / rename a file or directory.

If the destination exists:

- if it is a non-empty directory, an error is returned
- otherwise, if it has the same type as the source, it is replaced
- otherwise, behavior is unspecified (implementation-dependent).

Parameters
src: str

The path of the file or the directory to be moved.

dest: str

The destination path where the file or directory is moved to.

normalize_path(self, path)

Normalize filesystem path.

Parameters
path: str

The path to normalize.

Returns
normalized_path: str

The normalized path.

open_append_stream(self, path, compression='detect', buffer_size=None, metadata=None)

Open an output stream for appending.

If the target doesn’t exist, a new empty file is created.

Note

Some filesystem implementations do not support efficient appending to an existing file, in which case this method will raise NotImplementedError. Consider writing to multiple files (using e.g. the dataset layer) instead.

Parameters
path: str

The source to open for writing.

compression: str, optional, default ‘detect’

The compression algorithm to use for on-the-fly compression. If “detect” and source is a file path, then compression will be chosen based on the file extension. If None, no compression will be applied. Otherwise, a well-known algorithm name must be supplied (e.g. “gzip”).

buffer_size: int, optional, default None

If None or 0, no buffering will happen. Otherwise the size of the temporary write buffer.

metadata: dict, optional, default None

If not None, a mapping of string keys to string values. Some filesystems support storing metadata alongside the file (such as “Content-Type”). Unsupported metadata keys will be ignored.

Returns
stream: NativeFile
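
A sketch of appending, assuming a local base filesystem, which supports it. The file names are hypothetical.

```python
import tempfile

from pyarrow import fs

root = tempfile.mkdtemp()
subtree = fs.SubTreeFileSystem(root, fs.LocalFileSystem())

with subtree.open_output_stream("events.log") as out:
    out.write(b"first\n")

# Appending keeps the existing bytes and adds to the end.
with subtree.open_append_stream("events.log") as out:
    out.write(b"second\n")

# Appending to a missing target creates a new, empty file.
with subtree.open_append_stream("fresh.log"):
    pass

with subtree.open_input_stream("events.log") as stream:
    content = stream.readall()
fresh_info = subtree.get_file_info("fresh.log")
```
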
open_input_file(self, path)

Open an input file for random access reading.

Parameters
path: str

The source to open for reading.

Returns
stream: NativeFile
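
Random access means the returned file supports seeking and reports its size. A minimal sketch, assuming a local base filesystem and a hypothetical file greeting.txt:

```python
import tempfile

from pyarrow import fs

root = tempfile.mkdtemp()
subtree = fs.SubTreeFileSystem(root, fs.LocalFileSystem())

with subtree.open_output_stream("greeting.txt") as out:
    out.write(b"hello world")

with subtree.open_input_file("greeting.txt") as f:
    total = f.size()
    f.seek(6)           # jump past "hello "
    tail = f.read(5)    # read the next five bytes
```
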
open_input_stream(self, path, compression='detect', buffer_size=None)

Open an input stream for sequential reading.

Parameters
path: str

The source to open for reading.

compression: str, optional, default ‘detect’

The compression algorithm to use for on-the-fly decompression. If “detect” and source is a file path, then compression will be chosen based on the file extension. If None, no compression will be applied. Otherwise, a well-known algorithm name must be supplied (e.g. “gzip”).

buffer_size: int, optional, default None

If None or 0, no buffering will happen. Otherwise the size of the temporary read buffer.

Returns
stream: NativeFile
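
The effect of the compression argument can be sketched with a gzip round trip. This assumes a pyarrow build with gzip support and a local base filesystem; the file name notes.txt.gz is hypothetical.

```python
import tempfile

from pyarrow import fs

root = tempfile.mkdtemp()
subtree = fs.SubTreeFileSystem(root, fs.LocalFileSystem())

text = b"hello " * 100

# With the default 'detect', the ".gz" extension selects gzip on write.
with subtree.open_output_stream("notes.txt.gz") as out:
    out.write(text)

# Reading with 'detect' decompresses on the fly.
with subtree.open_input_stream("notes.txt.gz") as stream:
    decoded = stream.readall()

# compression=None returns the stored bytes unchanged.
with subtree.open_input_stream("notes.txt.gz", compression=None) as stream:
    raw = stream.readall()
```
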
open_output_stream(self, path, compression='detect', buffer_size=None, metadata=None)

Open an output stream for sequential writing.

If the target already exists, existing data is truncated.

Parameters
path: str

The source to open for writing.

compression: str, optional, default ‘detect’

The compression algorithm to use for on-the-fly compression. If “detect” and source is a file path, then compression will be chosen based on the file extension. If None, no compression will be applied. Otherwise, a well-known algorithm name must be supplied (e.g. “gzip”).

buffer_size: int, optional, default None

If None or 0, no buffering will happen. Otherwise the size of the temporary write buffer.

metadata: dict, optional, default None

If not None, a mapping of string keys to string values. Some filesystems support storing metadata alongside the file (such as “Content-Type”). Unsupported metadata keys will be ignored.

Returns
stream: NativeFile
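
The truncation behavior can be sketched as follows, assuming a local base filesystem and a hypothetical file report.txt:

```python
import tempfile

from pyarrow import fs

root = tempfile.mkdtemp()
subtree = fs.SubTreeFileSystem(root, fs.LocalFileSystem())

with subtree.open_output_stream("report.txt") as out:
    out.write(b"a fairly long first version")

# Opening the same path again discards the previous contents.
with subtree.open_output_stream("report.txt") as out:
    out.write(b"v2")

with subtree.open_input_stream("report.txt") as stream:
    final = stream.readall()
```
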
type_name

The filesystem’s type name.