pyarrow.fs.copy_files
- pyarrow.fs.copy_files(source, destination, source_filesystem=None, destination_filesystem=None, *, chunk_size=1048576, use_threads=True)
- Copy files between FileSystems.
- This function recursively copies files or directories from one filesystem to another, for example from S3 to your local machine.
- Parameters:
- source : str
- Source file path or URI pointing to a single file or a directory. If it is a directory, files are copied recursively from this path.
- destination : str
- Destination file path or URI. If source is a file, destination is also interpreted as a file path, not a directory. Directories are created as necessary.
- source_filesystem : FileSystem, optional
- Source filesystem. Must be specified if source is not a URI; otherwise it is inferred from the URI.
- destination_filesystem : FileSystem, optional
- Destination filesystem. Must be specified if destination is not a URI; otherwise it is inferred from the URI.
- chunk_size : int, default 1 MiB (1048576 bytes)
- The maximum block size, in bytes, to read before flushing to the destination file. A larger chunk_size uses more memory while copying but may help accommodate high-latency FileSystems.
- use_threads : bool, default True
- Whether to use multiple threads to accelerate copying.
 
- Examples
- Inspect an S3 bucket’s files:
  >>> from pyarrow import fs
  >>> s3, path = fs.FileSystem.from_uri(
  ...     "s3://registry.opendata.aws/roda/ndjson/")
  >>> selector = fs.FileSelector(path)
  >>> s3.get_file_info(selector)
  [<FileInfo for 'registry.opendata.aws/roda/ndjson/index.ndjson':...]
- Copy one file from the S3 bucket to a local directory (local_path is assumed to hold the path of an existing local directory):
  >>> fs.copy_files("s3://registry.opendata.aws/roda/ndjson/index.ndjson",
  ...               "file:///{}/index_copy.ndjson".format(local_path))
  >>> fs.LocalFileSystem().get_file_info(str(local_path) +
  ...                                    '/index_copy.ndjson')
  <FileInfo for '.../index_copy.ndjson': type=FileType.File, size=...>
- Copy a file using a FileSystem object:
  >>> fs.copy_files("registry.opendata.aws/roda/ndjson/index.ndjson",
  ...               "file:///{}/index_copy.ndjson".format(local_path),
  ...               source_filesystem=fs.S3FileSystem())
