pyarrow.orc.ORCFile
class pyarrow.orc.ORCFile(source)
Bases: object
Reader interface for a single ORC file.
- Parameters
source (str or pyarrow.io.NativeFile) – Readable source. For passing Python file objects or byte buffers, see pyarrow.io.PythonFileInterface or pyarrow.io.BufferReader.
Methods
__init__(source) – Initialize self.
read([columns]) – Read the whole file.
read_stripe(n[, columns]) – Read a single stripe from the file.
Attributes
metadata – The file metadata, as an Arrow KeyValueMetadata.
nrows – The number of rows in the file.
nstripes – The number of stripes in the file.
schema – The file schema, as an Arrow schema.
property metadata
The file metadata, as an Arrow KeyValueMetadata.
property nrows
The number of rows in the file.
property nstripes
The number of stripes in the file.
read(columns=None)
Read the whole file.
- Parameters
columns (list) – If not None, only these columns will be read from the file. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’
- Returns
pyarrow.lib.Table – Content of the file as a Table.
read_stripe(n, columns=None)
Read a single stripe from the file.
- Parameters
n (int) – The zero-based stripe index
columns (list) – If not None, only these columns will be read from the stripe. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’
- Returns
pyarrow.lib.RecordBatch – Content of the stripe as a RecordBatch.
property schema
The file schema, as an Arrow schema.