- pyarrow.fs.copy_files(source, destination, source_filesystem=None, destination_filesystem=None, *, chunk_size=1048576, use_threads=True)
Copy files between FileSystems.
This function recursively copies directories of files from one file system to another, for example from S3 to your local machine.
- source : str
Source file path or URI to a single file or directory. If a directory, files will be copied recursively from this path.
- destination : str
Destination file path or URI. If source is a file, destination is also interpreted as the destination file (not a directory). Directories will be created as necessary.
- source_filesystem : FileSystem, optional
Source filesystem. Must be specified if source is not a URI; otherwise it is inferred.
- destination_filesystem : FileSystem, optional
Destination filesystem. Must be specified if destination is not a URI; otherwise it is inferred.
- chunk_size : int, default 1 MiB (1048576 bytes)
The maximum size of a block to read before flushing to the destination file. A larger chunk_size uses more memory while copying but may help accommodate high-latency FileSystems.
- use_threads : bool, default True
Whether to use multiple threads to accelerate copying.
Copy an S3 bucket’s files to a local directory:
>>> copy_files("s3://your-bucket-name", "local-directory")
Using a FileSystem object:
>>> copy_files("your-bucket-name", "local-directory",
...            source_filesystem=S3FileSystem(...))