pyarrow.fs.copy_files(source, destination, source_filesystem=None, destination_filesystem=None, *, chunk_size=1048576, use_threads=True)

Copy files between FileSystems.

This function allows you to recursively copy directories of files from one file system to another, such as from S3 to your local machine.


source : str

Source file path or URI to a single file or directory. If a directory, files will be copied recursively from this path.


destination : str

Destination file path or URI. If source is a file, destination is also interpreted as the destination file (not directory). Directories will be created as necessary.

source_filesystem : FileSystem, optional

Source filesystem; needs to be specified if source is not a URI, otherwise it is inferred from the URI.

destination_filesystem : FileSystem, optional

Destination filesystem; needs to be specified if destination is not a URI, otherwise it is inferred from the URI.

chunk_size : int, default 1 MB (1048576)

The maximum size of block to read before flushing to the destination file. A larger chunk_size uses more memory while copying but may help accommodate high-latency FileSystems.

use_threads : bool, default True

Whether to use multiple threads to accelerate copying.


Inspect an S3 bucket’s files:

>>> s3, path = fs.FileSystem.from_uri(
...            "s3://")
>>> selector = fs.FileSelector(path)
>>> s3.get_file_info(selector)
[<FileInfo for '':...]

Copy one file from S3 bucket to a local directory:

>>> fs.copy_files("s3://",
...               "file:///{}/index_copy.ndjson".format(local_path))
>>> fs.LocalFileSystem().get_file_info(str(local_path)+
...                                    '/index_copy.ndjson')
<FileInfo for '.../index_copy.ndjson': type=FileType.File, size=...>

Copy file using a FileSystem object:

>>> fs.copy_files("",
...               "file:///{}/index_copy.ndjson".format(local_path),
...               source_filesystem=fs.S3FileSystem())