parquet

Module format

Automatically generated code from the Parquet thrift definition.

The code in this module is generated from parquet.thrift. See crate::file for easier-to-use APIs for reading Parquet-encoded data.

Structs

  • The Bloom filter header is stored at the beginning of the Bloom filter data of each column and is followed by its bitset.
  • Enum to annotate whether lists of min/max elements inside ColumnIndex are ordered and if so, in which direction.
  • Embedded BSON logical type annotation
  • Optional statistics for each data page in a ColumnChunk.
  • Description for column metadata
  • Supported compression algorithms.
  • DEPRECATED: Common types used by frameworks (e.g. Hive, Pig) using Parquet. ConvertedType is superseded by LogicalType. This enum should not be extended.
  • Data page header
  • New page format that allows reading levels without decompressing the data. Repetition and definition levels are uncompressed. The remaining section containing the data is compressed if is_compressed is true.
  • Decimal logical type annotation
  • The dictionary page must be placed at the first position of the column chunk if it is partly or completely dictionary encoded. At most one dictionary page can be placed in a column chunk.
  • Encodings supported by Parquet. Not all encodings are valid for all types. These enums are also used to specify the encoding of definition and repetition levels. See the accompanying doc for the details of the more complicated encodings.
  • Representation of Schemas
  • Crypto metadata for files with encrypted footer
  • Description for file metadata
  • Integer logical type annotation
  • Embedded JSON logical type annotation
  • Wrapper struct to store key values
  • Time units for logical types
  • Logical type to annotate a column that is always null.
  • Optional offsets for each data page in a ColumnChunk.
  • Statistics of a given page type and encoding
  • Represents an element inside a schema definition.
  • A structure for capturing metadata for estimating the unencoded, uncompressed size of data written. This is useful for readers to estimate how much memory is needed to reconstruct data in their memory model and for fine-grained filter pushdown on nested structures (the histograms contained in this structure can help determine the number of nulls at a particular nesting level and the maximum length of lists).
  • Sort order within a RowGroup of a leaf column
  • Block-based algorithm type annotation.
  • Statistics per row group and per page. All fields are optional.
  • Empty structs to use as logical type annotations
  • Time logical type annotation
  • Timestamp logical type annotation
  • Types supported by Parquet. These types are intended to be used in combination with the encodings to control the on-disk storage format. For example, INT16 is not included as a type since a good encoding of INT32 would handle this.
  • Empty struct to signal the order defined by the physical or logical type
  • The compression used in the Bloom filter.
  • Hash strategy type annotation. xxHash is an extremely fast non-cryptographic hash algorithm. The 64-bit version of xxHash is used.
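
The boundary-order annotation described in the list above (whether the per-page min/max lists inside ColumnIndex are ordered, and in which direction) can be illustrated with a small standalone sketch. The `Order` enum and the `boundary_order` function below are hypothetical names used only for illustration; they are not the generated types of this module, and the sketch assumes `i64` statistics with pages given in file order.

```rust
/// Hypothetical stand-in for a boundary-order annotation.
/// This is NOT the generated type from this module.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Order {
    Unordered,
    Ascending,
    Descending,
}

/// Returns true if every element is <= its successor.
fn non_decreasing(values: &[i64]) -> bool {
    values.windows(2).all(|w| w[0] <= w[1])
}

/// Returns true if every element is >= its successor.
fn non_increasing(values: &[i64]) -> bool {
    values.windows(2).all(|w| w[0] >= w[1])
}

/// Classify per-page statistics: `mins[i]` and `maxs[i]` are assumed to be
/// the min and max of page `i`. Both lists must be ordered in the same
/// direction for the result to be anything other than `Unordered`.
/// Lists with fewer than two pages are reported as `Ascending`, since the
/// ascending check is tried first.
pub fn boundary_order(mins: &[i64], maxs: &[i64]) -> Order {
    if non_decreasing(mins) && non_decreasing(maxs) {
        Order::Ascending
    } else if non_increasing(mins) && non_increasing(maxs) {
        Order::Descending
    } else {
        Order::Unordered
    }
}
```

A reader that knows the lists are ordered can binary-search the page statistics instead of scanning them, which is the practical reason such an annotation is worth recording.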

Enums