Streams and File Access

Factory Functions

These factory functions are the recommended way to create an Arrow stream. They accept various kinds of sources, such as in-memory buffers or on-disk files.

input_stream(source[, compression, buffer_size])

Create an Arrow input stream.

output_stream(source[, compression, buffer_size])

Create an Arrow output stream.

memory_map(path[, mode])

Open a memory map of the file at the given path.

create_memory_map(path, size)

Create a file of the given size and memory-map it.

Stream Classes

NativeFile

The base class for all Arrow streams.

OSFile

A stream backed by a regular file descriptor.

PythonFile

A stream backed by a Python file object.

BufferReader

Zero-copy reader from objects convertible to Arrow buffer.

BufferOutputStream

An output stream that writes to a resizable in-memory buffer.

FixedSizeBufferWriter

A stream writing to an Arrow buffer.

MemoryMappedFile

A stream that represents a memory-mapped file.

CompressedInputStream(stream, …)

An input stream wrapper which decompresses data on the fly.

CompressedOutputStream(stream, …)

An output stream wrapper which compresses data on the fly.

File Systems

hdfs.connect([host, port, user, …])

DEPRECATED: Connect to an HDFS cluster.

LocalFileSystem()

class pyarrow.HadoopFileSystem