pyarrow.parquet.FileMetaData
- class pyarrow.parquet.FileMetaData
Bases: _Weakrefable
Parquet metadata for a single file.
- __init__(*args, **kwargs)
Methods
__init__(*args, **kwargs)
append_row_groups(self, FileMetaData other): Append row groups from another FileMetaData object.
equals(self, FileMetaData other): Return whether the two file metadata objects are equal.
row_group(self, int i): Get metadata for row group at index i.
set_file_path(self, path): Set ColumnChunk file paths to the given value.
to_dict(self): Get dictionary representation of the file metadata.
write_metadata_file(self, where): Write the metadata to a metadata-only Parquet file.
Attributes
created_by: String describing source of the parquet file (str).
format_version: Parquet format version used in file (str, such as '1.0', '2.4').
metadata: Additional metadata as key value pairs (dict[bytes, bytes]).
num_columns: Number of columns in file (int).
num_row_groups: Number of row groups in file (int).
num_rows: Total number of rows in file (int).
schema: Schema of the file (ParquetSchema).
serialized_size: Size of the original thrift encoded metadata footer (int).
- append_row_groups(self, FileMetaData other)
Append row groups from another FileMetaData object.
- Parameters:
- other : FileMetaData
Other metadata to append row groups from.
- created_by
String describing source of the parquet file (str).
This typically includes library name and version number. For example, Arrow 7.0’s writer returns ‘parquet-cpp-arrow version 7.0.0’.
- equals(self, FileMetaData other)
Return whether the two file metadata objects are equal.
- Parameters:
- other : FileMetaData
Metadata to compare against.
- Returns:
- are_equal : bool
- format_version
Parquet format version used in file (str, such as ‘1.0’, ‘2.4’).
If the version is missing or cannot be parsed, it defaults to ‘2.6’.
- metadata
Additional metadata as key value pairs (dict[bytes, bytes]).
- num_columns
Number of columns in file (int).
- num_row_groups
Number of row groups in file (int).
- num_rows
Total number of rows in file (int).
- row_group(self, int i)
Get metadata for row group at index i.
- Parameters:
- i : int
Row group index to get.
- Returns:
- row_group_metadata : RowGroupMetaData
- schema
Schema of the file (ParquetSchema).
- serialized_size
Size of the original Thrift-encoded metadata footer (int).
- set_file_path(self, path)
Set ColumnChunk file paths to the given value.
This method sets the file_path field of each ColumnChunk in the FileMetaData to the given value.
- Parameters:
- path : str
The file path to set on all ColumnChunks.
- to_dict(self)
Get dictionary representation of the file metadata.
- Returns:
dict
Dictionary with a key for each attribute of this class.
- write_metadata_file(self, where)
Write the metadata to a metadata-only Parquet file.
- Parameters:
- where : path or file-like object
Where to write the metadata. Should be a writable path on the local filesystem, or a writable file-like object.