pyarrow.parquet.ColumnChunkMetaData#

class pyarrow.parquet.ColumnChunkMetaData#

Bases: _Weakrefable

Metadata for a single column chunk, that is, one column within one row group.

__init__(*args, **kwargs)#

Methods

__init__(*args, **kwargs)

equals(self, ColumnChunkMetaData other)

Return whether the two column chunk metadata objects are equal.

to_dict(self)

Get dictionary representation of the column chunk metadata.

Attributes

compression

Type of compression used for column (str).

data_page_offset

Offset of data page relative to column chunk offset (int).

dictionary_page_offset

Offset of dictionary page relative to column chunk offset (int).

encodings

Encodings used for column (tuple of str).

file_offset

Offset into file where column chunk is located (int).

file_path

Optional file path if set (str or None).

has_column_index

Whether the column chunk has a column index (bool).

has_dictionary_page

Whether there is dictionary data present in the column chunk (bool).

has_index_page

Not yet supported.

has_offset_index

Whether the column chunk has an offset index (bool).

index_page_offset

Not yet supported.

is_stats_set

Whether or not statistics are present in metadata (bool).

metadata

Additional metadata as key value pairs (dict[bytes, bytes]).

num_values

Total number of values (int).

path_in_schema

Nested path to field, separated by periods (str).

physical_type

Physical type of column (str).

statistics

Statistics for column chunk (Statistics).

total_compressed_size

Compressed size in bytes (int).

total_uncompressed_size

Uncompressed size in bytes (int).

compression#

Type of compression used for column (str).

One of ‘UNCOMPRESSED’, ‘SNAPPY’, ‘GZIP’, ‘LZO’, ‘BROTLI’, ‘LZ4’, ‘ZSTD’, or ‘UNKNOWN’.

data_page_offset#

Offset of data page relative to column chunk offset (int).

dictionary_page_offset#

Offset of dictionary page relative to column chunk offset (int).

encodings#

Encodings used for column (tuple of str).

Each entry is one of ‘PLAIN’, ‘BIT_PACKED’, ‘RLE’, ‘BYTE_STREAM_SPLIT’, ‘DELTA_BINARY_PACKED’, ‘DELTA_LENGTH_BYTE_ARRAY’, ‘DELTA_BYTE_ARRAY’. Dictionary-encoded columns may also report ‘PLAIN_DICTIONARY’ or ‘RLE_DICTIONARY’.

equals(self, ColumnChunkMetaData other)#

Return whether the two column chunk metadata objects are equal.

Parameters:
other : ColumnChunkMetaData

Metadata to compare against.

Returns:
are_equal : bool

file_offset#

Offset into file where column chunk is located (int).

file_path#

Optional file path if set (str or None).

has_column_index#

Whether the column chunk has a column index (bool).

has_dictionary_page#

Whether there is dictionary data present in the column chunk (bool).

has_index_page#

Not yet supported.

has_offset_index#

Whether the column chunk has an offset index (bool).

index_page_offset#

Not yet supported.

is_stats_set#

Whether or not statistics are present in metadata (bool).

metadata#

Additional metadata as key value pairs (dict[bytes, bytes]).

num_values#

Total number of values (int).

path_in_schema#

Nested path to field, separated by periods (str).

physical_type#

Physical type of column (str).

statistics#

Statistics for column chunk (Statistics).

to_dict(self)#

Get dictionary representation of the column chunk metadata.

Returns:
dict

Dictionary with a key for each attribute of this class.

total_compressed_size#

Compressed size in bytes (int).

total_uncompressed_size#

Uncompressed size in bytes (int).