parquet_to_arrow_field_levels_with_virtual

Function parquet_to_arrow_field_levels_with_virtual 

pub fn parquet_to_arrow_field_levels_with_virtual(
    schema: &SchemaDescriptor,
    mask: ProjectionMask,
    hint: Option<&Fields>,
    virtual_columns: &[FieldRef],
) -> Result<FieldLevels>

Convert a Parquet SchemaDescriptor to FieldLevels, with support for virtual columns.

Columns not included within ProjectionMask will be ignored.

The optional hint parameter is the desired Arrow schema. See the arrow module documentation for more information.

§Arguments

  • schema - The Parquet schema descriptor
  • mask - Projection mask to select which columns to include
  • hint - Optional hint for Arrow field types to use instead of defaults
  • virtual_columns - Virtual columns to append to the schema (e.g., row numbers)
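How mask and virtual_columns combine can be pictured with a small model. The sketch below does not use the parquet crate: `Column` and `project_with_virtual` are hypothetical names, the mask is simplified to one boolean per column, and the only behavior taken from the description above is that unselected columns are ignored and virtual columns are appended.

```rust
/// Toy stand-in for a column in the resulting schema (hypothetical type,
/// not part of the parquet crate).
#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: String,
    is_virtual: bool,
}

/// Keep only the file columns selected by `mask`, then append the virtual
/// columns in the order given. `mask` is simplified to one flag per column.
fn project_with_virtual(
    file_columns: &[&str],
    mask: &[bool],
    virtual_columns: &[&str],
) -> Vec<Column> {
    // Columns whose mask entry is `false` are ignored.
    let mut out: Vec<Column> = file_columns
        .iter()
        .zip(mask.iter())
        .filter(|(_, selected)| **selected)
        .map(|(name, _)| Column {
            name: name.to_string(),
            is_virtual: false,
        })
        .collect();

    // Virtual columns do not come from the file; they go after the
    // projected file columns.
    out.extend(virtual_columns.iter().map(|name| Column {
        name: name.to_string(),
        is_virtual: true,
    }));
    out
}
```

In this model, projecting `["id", "name", "price"]` with the mask `[true, false, true]` and one virtual column gives three columns: `id`, `price`, and the virtual column last.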

§Notes

Where a field type in hint is compatible with the corresponding Parquet type in schema, it is used; otherwise, the default Arrow type for the given Parquet column type is used.
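The fallback rule can be sketched as a small decision function. This is a simplified model, not the crate's implementation: the `ArrowType` and `ParquetType` enums, the defaults, and the compatibility table in `is_compatible` are all assumptions chosen for illustration.

```rust
/// Toy subset of Arrow data types (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ArrowType {
    Int64,
    Utf8,
    LargeUtf8,
}

/// Toy subset of Parquet column types (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ParquetType {
    Int64,
    String,
}

/// Default Arrow type assumed for each Parquet type in this sketch.
fn default_arrow_type(p: ParquetType) -> ArrowType {
    match p {
        ParquetType::Int64 => ArrowType::Int64,
        ParquetType::String => ArrowType::Utf8,
    }
}

/// Assumed compatibility table; the real rules live in the parquet crate
/// and cover many more types.
fn is_compatible(p: ParquetType, hint: ArrowType) -> bool {
    match (p, hint) {
        (ParquetType::Int64, ArrowType::Int64) => true,
        (ParquetType::String, ArrowType::Utf8 | ArrowType::LargeUtf8) => true,
        _ => false,
    }
}

/// Use the hinted type when it is compatible, otherwise fall back to the
/// default Arrow type for the Parquet column.
fn resolve_type(p: ParquetType, hint: Option<ArrowType>) -> ArrowType {
    match hint {
        Some(h) if is_compatible(p, h) => h,
        _ => default_arrow_type(p),
    }
}
```

Under these assumed rules, a string column hinted as `LargeUtf8` resolves to `LargeUtf8`, while an integer column hinted as `Utf8` ignores the hint and resolves to `Int64`.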

Virtual columns are columns that don’t exist in the Parquet file but are generated during reading. They must have extension type names starting with “arrow.virtual.”.
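The naming requirement can be checked with a simple prefix test. The sketch below models field metadata as a plain `HashMap` rather than using the arrow crate. Arrow stores a field's extension type name under the metadata key `ARROW:extension:name`; the helper names `is_virtual_column` and `metadata_with_extension` and the extension name `arrow.virtual.example` are hypothetical.

```rust
use std::collections::HashMap;

/// Metadata key under which Arrow stores a field's extension type name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// Prefix required of virtual column extension type names.
const VIRTUAL_PREFIX: &str = "arrow.virtual.";

/// Returns true if the field metadata carries an extension type name that
/// starts with the virtual prefix. Fields without an extension name are
/// treated as regular columns.
fn is_virtual_column(metadata: &HashMap<String, String>) -> bool {
    metadata
        .get(EXTENSION_NAME_KEY)
        .map_or(false, |name| name.starts_with(VIRTUAL_PREFIX))
}

/// Hypothetical helper: build field metadata that carries only an
/// extension type name.
fn metadata_with_extension(name: &str) -> HashMap<String, String> {
    let mut metadata = HashMap::new();
    metadata.insert(EXTENSION_NAME_KEY.to_string(), name.to_string());
    metadata
}
```

A field tagged `arrow.virtual.example` passes this check, whereas a field tagged with another extension name such as `arrow.uuid`, or with no extension name at all, does not.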

The hint exists to accommodate Arrow types that cannot be round-tripped through Parquet natively. Depending on the Parquet writer, this can lead to a mismatch between a file’s Parquet schema and its embedded Arrow schema. In that event, the Parquet schema must be treated as authoritative. See #1663 for more information.

Note: this is a low-level API. Most users will want the higher-level parquet_to_arrow_schema for decoding metadata from a Parquet file.