pyarrow.parquet.Statistics

class pyarrow.parquet.Statistics

Bases: _Weakrefable

Statistics for a single column in a single row group.

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

equals(self, Statistics other)

Return whether the two column statistics objects are equal.

to_dict(self)

Get dictionary representation of statistics.

Attributes

converted_type

Legacy converted type (str or None).

distinct_count

Number of distinct values in chunk (int).

has_distinct_count

Whether distinct count is present (bool).

has_min_max

Whether min and max are present (bool).

has_null_count

Whether null count is present (bool).

logical_type

Logical type of column (ParquetLogicalType).

max

Max value as logical type.

max_raw

Max value as physical type (bool, int, float, or bytes).

min

Min value as logical type.

min_raw

Min value as physical type (bool, int, float, or bytes).

null_count

Number of null values in chunk (int).

num_values

Number of non-null values (int).

physical_type

Physical type of column (str).

converted_type

Legacy converted type (str or None).

distinct_count

Number of distinct values in chunk (int).

equals(self, Statistics other)

Return whether the two column statistics objects are equal.

Parameters:
other : Statistics

Statistics to compare against.

Returns:
are_equal : bool

has_distinct_count

Whether distinct count is present (bool).

has_min_max

Whether min and max are present (bool).

has_null_count

Whether null count is present (bool).

logical_type

Logical type of column (ParquetLogicalType).

max

Max value as logical type.

Returned as the Python equivalent of logical type, such as datetime.date for dates and decimal.Decimal for decimals.

max_raw

Max value as physical type (bool, int, float, or bytes).

min

Min value as logical type.

Returned as the Python equivalent of logical type, such as datetime.date for dates and decimal.Decimal for decimals.

min_raw

Min value as physical type (bool, int, float, or bytes).

null_count

Number of null values in chunk (int).

num_values

Number of non-null values (int).

physical_type

Physical type of column (str).

to_dict(self)

Get dictionary representation of statistics.

Returns:
dict

Dictionary with a key for each attribute of this class.