pyarrow.ipc.RecordBatchFileReader

class pyarrow.ipc.RecordBatchFileReader(source, footer_offset=None)

Bases: pyarrow.lib._RecordBatchFileReader

Class for reading Arrow record batch data from the Arrow binary file format.

Parameters
source : bytes/buffer-like, pyarrow.NativeFile, or file-like Python object

Either an in-memory buffer or a readable file object.

footer_offset : int, default None

If the file is embedded in a larger file, the byte offset of the very end of the Arrow file data.

__init__(source, footer_offset=None)

Methods

__init__(source[, footer_offset])

get_batch(self, int i)

get_record_batch(self, int i)

read_all(self)

Read all record batches as a pyarrow.Table.

read_pandas(self, **options)

Read contents of stream to a pandas.DataFrame.

Attributes

num_record_batches

schema

stats

Current IPC read statistics.

get_batch(self, int i)
get_record_batch(self, int i)
num_record_batches
read_all(self)

Read all record batches as a pyarrow.Table.

read_pandas(self, **options)

Read contents of stream to a pandas.DataFrame.

Read all record batches as a pyarrow.Table, then convert it to a pandas.DataFrame using Table.to_pandas.

Parameters
**options

Arguments to forward to Table.to_pandas.

Returns
df : pandas.DataFrame
schema
stats

Current IPC read statistics.