pyarrow.ipc.RecordBatchStreamReader

class pyarrow.ipc.RecordBatchStreamReader(source, *, options=None, memory_pool=None)

Bases: _RecordBatchStreamReader

Reader for the Arrow streaming binary format.

Parameters:
source : bytes/buffer-like, pyarrow.NativeFile, or file-like Python object

Either an in-memory buffer or a readable file object. To read through a memory map, pass a MemoryMappedFile as the source.

options : pyarrow.ipc.IpcReadOptions

Options for IPC deserialization. If None, default values are used.

memory_pool : MemoryPool, default None

If None, the default memory pool is used.

__init__(source, *, options=None, memory_pool=None)

Methods

__init__(source, *[, options, memory_pool])

close(self)

Release any resources associated with the reader.

from_batches(Schema schema, batches)

Create RecordBatchReader from an iterable of batches.

iter_batches_with_custom_metadata(self)

Iterate over record batches from the stream along with their custom metadata.

read_all(self)

Read all record batches as a pyarrow.Table.

read_next_batch(self)

Read next RecordBatch from the stream.

read_next_batch_with_custom_metadata(self)

Read next RecordBatch from the stream along with its custom metadata.

read_pandas(self, **options)

Read contents of stream to a pandas.DataFrame.

Attributes

schema

Shared schema of the record batches in the stream.

stats

Current IPC read statistics.

close(self)

Release any resources associated with the reader.

static from_batches(Schema schema, batches)

Create RecordBatchReader from an iterable of batches.

Parameters:
schema : Schema

The shared schema of the record batches.

batches : Iterable[RecordBatch]

The batches that this reader will return.

Returns:
reader : RecordBatchReader

iter_batches_with_custom_metadata(self)

Iterate over record batches from the stream along with their custom metadata.

Yields:
RecordBatchWithMetadata

read_all(self)

Read all record batches as a pyarrow.Table.

Returns:
Table

read_next_batch(self)

Read next RecordBatch from the stream.

Returns:
RecordBatch
Raises:
StopIteration:

At end of stream.

read_next_batch_with_custom_metadata(self)

Read next RecordBatch from the stream along with its custom metadata.

Returns:
batch : RecordBatch
custom_metadata : KeyValueMetadata
Raises:
StopIteration:

At end of stream.

read_pandas(self, **options)

Read contents of stream to a pandas.DataFrame.

Reads all record batches into a pyarrow.Table, then converts it to a pandas.DataFrame using Table.to_pandas().

Parameters:
**options

Arguments to forward to Table.to_pandas().

Returns:
df : pandas.DataFrame

schema

Shared schema of the record batches in the stream.

Returns:
Schema

stats

Current IPC read statistics.