pub fn parquet_to_arrow_field_levels_with_virtual(
schema: &SchemaDescriptor,
mask: ProjectionMask,
hint: Option<&Fields>,
virtual_columns: &[FieldRef],
) -> Result<FieldLevels>Expand description
Convert a parquet SchemaDescriptor to FieldLevels with support for virtual columns
Columns not included within ProjectionMask will be ignored.
The optional hint parameter is the desired Arrow schema. See the
arrow module documentation for more information.
§Arguments
schema- The Parquet schema descriptormask- Projection mask to select which columns to includehint- Optional hint for Arrow field types to use instead of defaultsvirtual_columns- Virtual columns to append to the schema (e.g., row numbers)
§Notes:
Where a field type in hint is compatible with the corresponding parquet type in schema, it
will be used, otherwise the default arrow type for the given parquet column type will be used.
Virtual columns are columns that don’t exist in the Parquet file but are generated during reading. They must have extension type names starting with “arrow.virtual.”.
This is to accommodate arrow types that cannot be round-tripped through parquet natively.
Depending on the parquet writer, this can lead to a mismatch between a file’s parquet schema
and its embedded arrow schema. The parquet schema must be treated as authoritative in such
an event. See #1663 for more information
Note: this is a low-level API, most users will want to make use of the higher-level
parquet_to_arrow_schema for decoding metadata from a parquet file.