pub enum Statistics {
Boolean(ValueStatistics<bool>),
Int32(ValueStatistics<i32>),
Int64(ValueStatistics<i64>),
Int96(ValueStatistics<Int96>),
Float(ValueStatistics<f32>),
Double(ValueStatistics<f64>),
ByteArray(ValueStatistics<ByteArray>),
FixedLenByteArray(ValueStatistics<FixedLenByteArray>),
}
Strongly typed statistics for a column chunk within a row group.
This structure is a natively typed, in-memory representation of the Statistics structure in a parquet file footer. The statistics stored in this structure can be used by query engines to skip decoding pages while reading parquet data.
Page level statistics are stored separately, in NativeIndex.
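To make the skipping idea concrete, here is a minimal sketch of min/max pruning. It deliberately avoids the parquet crate: `ChunkStats`, `may_contain`, and `chunks_to_read` are hypothetical names invented for this illustration, and the sketch assumes an Int64 column with an equality predicate.

```rust
/// Hypothetical, simplified stand-in for the statistics of one Int64
/// column chunk. This is NOT the parquet crate's `Statistics` type.
#[derive(Debug, Clone, PartialEq)]
pub struct ChunkStats {
    pub min: Option<i64>,
    pub max: Option<i64>,
    pub null_count: Option<u64>,
}

/// Returns false only when the statistics prove that no value in the
/// chunk can equal `target`. Missing bounds never justify skipping.
pub fn may_contain(stats: &ChunkStats, target: i64) -> bool {
    match (stats.min, stats.max) {
        (Some(min), Some(max)) => min <= target && target <= max,
        // Only one bound is known: use it on its own.
        (Some(min), None) => min <= target,
        (None, Some(max)) => target <= max,
        // No bounds at all: the chunk has to be read.
        (None, None) => true,
    }
}

/// Returns the indices of the chunks that still need to be decoded.
pub fn chunks_to_read(chunks: &[ChunkStats], target: i64) -> Vec<usize> {
    chunks
        .iter()
        .enumerate()
        .filter(|(_, stats)| may_contain(stats, target))
        .map(|(index, _)| index)
        .collect()
}
```

The check is conservative by design: a chunk is skipped only when its bounds exclude the target, so absent statistics cost extra reads but never drop matching rows.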
Variants
Boolean(ValueStatistics<bool>)
Statistics for Boolean column
Int32(ValueStatistics<i32>)
Statistics for Int32 column
Int64(ValueStatistics<i64>)
Statistics for Int64 column
Int96(ValueStatistics<Int96>)
Statistics for Int96 column
Float(ValueStatistics<f32>)
Statistics for Float column
Double(ValueStatistics<f64>)
Statistics for Double column
ByteArray(ValueStatistics<ByteArray>)
Statistics for ByteArray column
FixedLenByteArray(ValueStatistics<FixedLenByteArray>)
Statistics for FixedLenByteArray column
Implementations
impl Statistics
pub fn new<T: ParquetValueType>( min: Option<T>, max: Option<T>, distinct_count: Option<u64>, null_count: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for a column type
pub fn boolean( min: Option<bool>, max: Option<bool>, distinct: Option<u64>, nulls: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for Boolean column type.
pub fn int32( min: Option<i32>, max: Option<i32>, distinct: Option<u64>, nulls: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for Int32 column type.
pub fn int64( min: Option<i64>, max: Option<i64>, distinct: Option<u64>, nulls: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for Int64 column type.
pub fn int96( min: Option<Int96>, max: Option<Int96>, distinct: Option<u64>, nulls: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for Int96 column type.
pub fn float( min: Option<f32>, max: Option<f32>, distinct: Option<u64>, nulls: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for Float column type.
pub fn double( min: Option<f64>, max: Option<f64>, distinct: Option<u64>, nulls: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for Double column type.
pub fn byte_array( min: Option<ByteArray>, max: Option<ByteArray>, distinct: Option<u64>, nulls: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for ByteArray column type.
pub fn fixed_len_byte_array( min: Option<FixedLenByteArray>, max: Option<FixedLenByteArray>, distinct: Option<u64>, nulls: Option<u64>, is_deprecated: bool, ) -> Self
Creates new statistics for FixedLenByteArray column type.
pub fn is_min_max_deprecated(&self) -> bool
Returns true if statistics have old min and max fields set.
This means that the column order is likely to be undefined, which, for old files, could mean a signed sort order of values.
Refer to ColumnOrder and SortOrder for more information.
pub fn is_min_max_backwards_compatible(&self) -> bool
Old versions of parquet stored statistics in min and max fields, ordered using signed comparison. This resulted in an undefined ordering for unsigned quantities, such as booleans and unsigned integers.
These fields were therefore deprecated in favour of min_value and max_value, which have a type-defined sort order.
However, not all readers have been updated. For backwards compatibility, this method returns true if the statistics have a signed sort order, that is, one compatible with being stored in the deprecated min and max fields.
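The ordering problem described above can be shown with plain bytes. The following sketch uses only the standard library; `min_max_signed` and `min_max_unsigned` are hypothetical helpers written for this illustration and are not part of the parquet crate.

```rust
/// Returns (min, max) when each byte is compared as a signed i8,
/// mimicking a signed comparison applied to unsigned data.
pub fn min_max_signed(values: &[u8]) -> Option<(u8, u8)> {
    let min = values.iter().copied().min_by_key(|b| *b as i8)?;
    let max = values.iter().copied().max_by_key(|b| *b as i8)?;
    Some((min, max))
}

/// Returns (min, max) when each byte is compared as an unsigned u8,
/// which is the order an unsigned column actually expects.
pub fn min_max_unsigned(values: &[u8]) -> Option<(u8, u8)> {
    let min = values.iter().copied().min()?;
    let max = values.iter().copied().max()?;
    Some((min, max))
}
```

For the input `[1, 200, 100]` the signed comparison treats 200 as -56, so it reports 200 as the minimum and 100 as the maximum, while the unsigned comparison reports 1 and 200. When every value is below 128 the two orderings agree, which is the situation in which statistics stay compatible with the deprecated fields.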
pub fn distinct_count(&self) -> Option<u64>
👎Deprecated since 53.0.0: Use distinct_count_opt method instead
Returns optional value of number of distinct values occurring. When it is None, the value should be ignored.
pub fn distinct_count_opt(&self) -> Option<u64>
Returns optional value of number of distinct values occurring. When it is None, the value should be ignored.
pub fn null_count(&self) -> u64
👎Deprecated since 53.0.0: Use null_count_opt method instead
Returns number of null values for the column. Note that this includes all nulls when the column is part of a complex type.
Note this API returns 0 if the null count is not available.
pub fn has_nulls(&self) -> bool
👎Deprecated since 53.0.0: Use null_count_opt method instead
Returns true if statistics collected any null values, false otherwise.
pub fn null_count_opt(&self) -> Option<u64>
Returns number of null values for the column, if known. Note that this includes all nulls when the column is part of a complex type.
Note this API returns Some(0) even if the null count was not present in the statistics. See https://github.com/apache/arrow-rs/pull/6216/files
pub fn has_min_max_set(&self) -> bool
👎Deprecated since 53.0.0: Use min_bytes_opt and max_bytes_opt methods instead
Whether or not min and max values are set. Normally both min/max values will be set to Some(value) or None.
pub fn min_is_exact(&self) -> bool
Returns true if the min value is set, and is an exact min value.
pub fn max_is_exact(&self) -> bool
Returns true if the max value is set, and is an exact max value.
pub fn min_bytes_opt(&self) -> Option<&[u8]>
Returns slice of bytes that represent min value, if min value is known.
pub fn min_bytes(&self) -> &[u8]
👎Deprecated since 53.0.0: Use min_bytes_opt instead
Returns slice of bytes that represent min value. Panics if min value is not set.
pub fn max_bytes_opt(&self) -> Option<&[u8]>
Returns slice of bytes that represent max value, if max value is known.
pub fn max_bytes(&self) -> &[u8]
👎Deprecated since 53.0.0: Use max_bytes_opt instead
Returns slice of bytes that represent max value. Panics if max value is not set.
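The byte-slice accessors return raw bytes, so a caller that wants a typed value has to decode them. Below is a minimal sketch for an Int32 column, assuming the slice holds exactly four little-endian bytes (Parquet's plain encoding for INT32). Whether the slice returned by this crate has that layout on every platform is an assumption here, and `decode_i32_le` is a hypothetical helper, not a crate function.

```rust
use std::convert::TryFrom;

/// Decodes an optional 4-byte little-endian slice into an i32.
/// Returns None when the value is absent or the length is not 4,
/// instead of panicking like the deprecated accessors.
pub fn decode_i32_le(bytes: Option<&[u8]>) -> Option<i32> {
    let array = <[u8; 4]>::try_from(bytes?).ok()?;
    Some(i32::from_le_bytes(array))
}
```

Taking an `Option<&[u8]>` lets the result of an `_opt` accessor be passed straight through, so a missing statistic stays `None` all the way to the caller.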
pub fn physical_type(&self) -> Type
Returns physical type associated with statistics.
Trait Implementations
impl Clone for Statistics
fn clone(&self) -> Statistics
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for Statistics
impl Display for Statistics
impl<T: ParquetValueType> From<ValueStatistics<T>> for Statistics
fn from(t: ValueStatistics<T>) -> Self
impl HeapSize for Statistics
impl PartialEq for Statistics
impl StructuralPartialEq for Statistics
Auto Trait Implementations
impl !Freeze for Statistics
impl RefUnwindSafe for Statistics
impl Send for Statistics
impl Sync for Statistics
impl Unpin for Statistics
impl UnwindSafe for Statistics
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.