pyarrow.parquet.FileMetaData#
- class pyarrow.parquet.FileMetaData#
Bases: pyarrow.lib._Weakrefable
Parquet metadata for a single file.
- __init__(*args, **kwargs)#
Methods
__init__(*args, **kwargs)
append_row_groups(self, FileMetaData other): Append row groups from other FileMetaData object.
equals(self, FileMetaData other): Return whether the two file metadata objects are equal.
row_group(self, int i): Get metadata for row group at index i.
set_file_path(self, path): Set ColumnChunk file paths to the given value.
to_dict(self): Get dictionary representation of the file metadata.
write_metadata_file(self, where): Write the metadata to a metadata-only Parquet file.
Attributes
created_by: String describing source of the parquet file (str).
format_version: Parquet format version used in file (str, such as '1.0', '2.4').
metadata: Additional metadata as key value pairs (dict[bytes, bytes]).
num_columns: Number of columns in file (int).
num_row_groups: Number of row groups in file (int).
num_rows: Total number of rows in file (int).
schema: Schema of the file (ParquetSchema).
serialized_size: Size of the original thrift encoded metadata footer (int).
- append_row_groups(self, FileMetaData other)#
Append row groups from other FileMetaData object.
- Parameters
- other
FileMetaData
Other metadata to append row groups from.
- created_by#
String describing source of the parquet file (str).
This typically includes library name and version number. For example, Arrow 7.0’s writer returns ‘parquet-cpp-arrow version 7.0.0’.
- equals(self, FileMetaData other)#
Return whether the two file metadata objects are equal.
- Parameters
- other
FileMetaData
Metadata to compare against.
- Returns
- are_equal
bool
- format_version#
Parquet format version used in file (str, such as ‘1.0’, ‘2.4’).
If the version is missing or unparsable, it defaults to ‘1.0’.
- metadata#
Additional metadata as key value pairs (dict[bytes, bytes]).
- num_columns#
Number of columns in file (int).
- num_row_groups#
Number of row groups in file (int).
- num_rows#
Total number of rows in file (int).
- row_group(self, int i)#
Get metadata for row group at index i.
- Parameters
- i
int
Row group index to get.
- Returns
- row_group_metadata
RowGroupMetaData
- schema#
Schema of the file (ParquetSchema).
- serialized_size#
Size of the original thrift encoded metadata footer (int).
- set_file_path(self, path)#
Set ColumnChunk file paths to the given value.
This method modifies the file_path field of each ColumnChunk in the FileMetaData to be a particular value.
- Parameters
- path
str
The file path to set on all ColumnChunks.
- to_dict(self)#
Get dictionary representation of the file metadata.
- Returns
dict
Dictionary with a key for each attribute of this class.
- write_metadata_file(self, where)#
Write the metadata to a metadata-only Parquet file.
- Parameters
- where
path or file-like object
Where to write the metadata. Should be a writable path on the local filesystem, or a writable file-like object.