pyarrow.parquet.ColumnChunkMetaData

class pyarrow.parquet.ColumnChunkMetaData

Bases: _Weakrefable

Column metadata for a single row group.

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

equals(self, ColumnChunkMetaData other)

Return whether the two column chunk metadata objects are equal.

to_dict(self)

Get dictionary representation of the column chunk metadata.

Attributes

compression

Type of compression used for column (str).

data_page_offset

Offset of data page relative to column chunk offset (int).

dictionary_page_offset

Offset of dictionary page relative to column chunk offset (int).

encodings

Encodings used for column (tuple of str).

file_offset

Offset into file where column chunk is located (int).

file_path

Optional file path if set (str or None).

has_dictionary_page

Whether there is dictionary data present in the column chunk (bool).

has_index_page

Not yet supported.

index_page_offset

Not yet supported.

is_stats_set

Whether or not statistics are present in metadata (bool).

num_values

Total number of values (int).

path_in_schema

Nested path to field, separated by periods (str).

physical_type

Physical type of column (str).

statistics

Statistics for column chunk (Statistics).

total_compressed_size

Compressed size in bytes (int).

total_uncompressed_size

Uncompressed size in bytes (int).

compression

Type of compression used for column (str).

One of ‘UNCOMPRESSED’, ‘SNAPPY’, ‘GZIP’, ‘LZO’, ‘BROTLI’, ‘LZ4’, ‘ZSTD’, or ‘UNKNOWN’.

data_page_offset

Offset of data page relative to column chunk offset (int).

dictionary_page_offset

Offset of dictionary page relative to column chunk offset (int).

encodings

Encodings used for column (tuple of str).

Each entry is an encoding name such as ‘PLAIN’, ‘BIT_PACKED’, ‘RLE’, ‘BYTE_STREAM_SPLIT’, ‘DELTA_BINARY_PACKED’, or ‘DELTA_BYTE_ARRAY’.

equals(self, ColumnChunkMetaData other)

Return whether the two column chunk metadata objects are equal.

Parameters:
other : ColumnChunkMetaData

Metadata to compare against.

Returns:
are_equal : bool

file_offset

Offset into file where column chunk is located (int).

file_path

Optional file path if set (str or None).

has_dictionary_page

Whether there is dictionary data present in the column chunk (bool).

has_index_page

Not yet supported.

index_page_offset

Not yet supported.

is_stats_set

Whether or not statistics are present in metadata (bool).

num_values

Total number of values (int).

path_in_schema

Nested path to field, separated by periods (str).

physical_type

Physical type of column (str).

statistics

Statistics for column chunk (Statistics).

to_dict(self)

Get dictionary representation of the column chunk metadata.

Returns:
dict

Dictionary with a key for each attribute of this class.

total_compressed_size

Compressed size in bytes (int).

total_uncompressed_size

Uncompressed size in bytes (int).