pyarrow.fs.copy_files
pyarrow.fs.copy_files(source, destination, source_filesystem=None, destination_filesystem=None, *, chunk_size=1048576, use_threads=True)
Copy files between FileSystems.
This function allows you to recursively copy directories of files from one file system to another, such as from S3 to your local machine.
Parameters
- source : str
  Source file path or URI to a single file or directory. If a directory, files will be copied recursively from this path.
- destination : str
  Destination file path or URI. If source is a file, destination is also interpreted as the destination file (not directory). Directories will be created as necessary.
- source_filesystem : FileSystem, optional
  Source filesystem. Needs to be specified if source is not a URI; otherwise inferred.
- destination_filesystem : FileSystem, optional
  Destination filesystem. Needs to be specified if destination is not a URI; otherwise inferred.
- chunk_size : int, default 1MB
  The maximum size of block to read before flushing to the destination file. A larger chunk_size will use more memory while copying but may help accommodate high-latency FileSystems.
- use_threads : bool, default True
  Whether to use multiple threads to accelerate copying.
Examples
Copy an S3 bucket’s files to a local directory:
>>> copy_files("s3://your-bucket-name", "local-directory")
Using a FileSystem object:
>>> copy_files("your-bucket-name", "local-directory",
...            source_filesystem=S3FileSystem(...))