pyarrow.orc.ORCFile
class pyarrow.orc.ORCFile(source)
Bases: object
Reader interface for a single ORC file.
- Parameters
source (str or pyarrow.io.NativeFile) – Readable source. For passing Python file objects or byte buffers, see pyarrow.io.PythonFileInterface or pyarrow.io.BufferReader.
Methods
__init__(source) – Initialize self.
read([columns]) – Read the whole file.
read_stripe(n[, columns]) – Read a single stripe from the file.
Attributes
nrows – The number of rows in the file
nstripes – The number of stripes in the file
schema – The file schema, as an arrow schema
property nrows
The number of rows in the file
property nstripes
The number of stripes in the file
read(columns=None)
Read the whole file.
- Parameters
columns (list) – If not None, only these columns will be read from the file. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’
- Returns
pyarrow.lib.Table – Content of the file as a Table.
read_stripe(n, columns=None)
Read a single stripe from the file.
- Parameters
n (int) – The stripe index
columns (list) – If not None, only these columns will be read from the stripe. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’
- Returns
pyarrow.lib.RecordBatch – Content of the stripe as a RecordBatch.
property schema
The file schema, as an arrow schema