pyarrow.orc.ORCFile
- class pyarrow.orc.ORCFile(source)
Bases: object
Reader interface for a single ORC file.
- Parameters
source (str or pyarrow.io.NativeFile) – Readable source. For passing Python file objects or byte buffers, see pyarrow.io.PythonFileInterface or pyarrow.io.BufferReader.
Methods
__init__(source): Initialize self.
read([columns]): Read the whole file.
read_stripe(n[, columns]): Read a single stripe from the file.
Attributes
nrows: The number of rows in the file.
nstripes: The number of stripes in the file.
schema: The file schema, as an Arrow schema.
- property nrows
The number of rows in the file.
- property nstripes
The number of stripes in the file.
- read(columns=None)
Read the whole file.
- Parameters
columns (list) – If not None, only these columns will be read from the file. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’.
- Returns
pyarrow.lib.Table – Content of the file as a Table.
- read_stripe(n, columns=None)
Read a single stripe from the file.
- Parameters
n (int) – The stripe index
columns (list) – If not None, only these columns will be read from the stripe. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’.
- Returns
pyarrow.lib.RecordBatch – Content of the stripe as a RecordBatch.
- property schema
The file schema, as an Arrow schema.