pyarrow.RecordBatchFileReader
- class pyarrow.RecordBatchFileReader(source, footer_offset=None)
- Bases: pyarrow.lib._RecordBatchFileReader
- Class for reading Arrow record batch data from the Arrow binary file format.
- Parameters
- source (bytes/buffer-like, pyarrow.NativeFile, or file-like Python object) – Either an in-memory buffer, or a readable file object 
- footer_offset (int, default None) – If the file is embedded in some larger file, this is the byte offset to the very end of the file data 
 
 - __init__(source, footer_offset=None)
- Initialize self. See help(type(self)) for accurate signature. 
 - Methods
 - __init__(source[, footer_offset]) – Initialize self.
 - get_batch(self, int i) – _RecordBatchFileReader.get_batch(self, int i)
 - read_all(self) – Read all record batches as a pyarrow.Table
 - read_pandas(self, **options) – Read contents of stream to a pandas.DataFrame.
 - Attributes
 - num_record_batches
 - schema
 - stats – Current IPC read statistics.
 - get_batch(self, int i)
 - get_record_batch()
- _RecordBatchFileReader.get_batch(self, int i) 
 - num_record_batches
 - read_all(self)
- Read all record batches as a pyarrow.Table 
 - read_pandas(self, **options)
- Read contents of stream to a pandas.DataFrame.
- Read all record batches as a pyarrow.Table, then convert it to a pandas.DataFrame using Table.to_pandas.
- Parameters
- **options – arguments to forward to Table.to_pandas
- Returns
- df (pandas.DataFrame) 
 
 - schema
 - stats
- Current IPC read statistics. 
 
