Module arrow_writer

Contains the writer that encodes Arrow data as Parquet data.

Modules

byte_array 🔒
levels 🔒
Parquet definition and repetition levels
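
To illustrate what definition and repetition levels encode, here is a minimal sketch for a single column of type `optional list<optional i32>` (maximum definition level 3, maximum repetition level 1). The `Levels` struct and `list_levels` function are hypothetical names made up for this example; they are not part of the `levels` module, whose internals are private.

```rust
/// Levels for a column shaped like `optional list<optional i32>`.
/// Hypothetical type for illustration only.
#[derive(Debug, Default, PartialEq)]
struct Levels {
    def_levels: Vec<i16>,
    rep_levels: Vec<i16>,
    /// Only the non-null leaf values are stored.
    values: Vec<i32>,
}

/// Computes levels row by row:
/// - def 0: the list itself is null
/// - def 1: the list is present but empty
/// - def 2: the element slot exists but the element is null
/// - def 3: the element is present
/// - rep 0: first entry of a new row; rep 1: continuation of the same list
fn list_levels(rows: &[Option<Vec<Option<i32>>>]) -> Levels {
    let mut out = Levels::default();
    for row in rows {
        match row {
            None => {
                out.def_levels.push(0);
                out.rep_levels.push(0);
            }
            Some(items) if items.is_empty() => {
                out.def_levels.push(1);
                out.rep_levels.push(0);
            }
            Some(items) => {
                for (i, item) in items.iter().enumerate() {
                    out.rep_levels.push(if i == 0 { 0 } else { 1 });
                    match item {
                        Some(v) => {
                            out.def_levels.push(3);
                            out.values.push(*v);
                        }
                        None => out.def_levels.push(2),
                    }
                }
            }
        }
    }
    out
}
```

Every row contributes at least one level pair, even when it contributes no value, which is how a reader can tell a null list from an empty one.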

Structs

ArrowColumnChunk
The data for a single column chunk, see ArrowColumnWriter
ArrowColumnChunkData 🔒
A single column chunk produced by ArrowColumnWriter.
ArrowColumnWriter
Encodes ArrowLeafColumn to ArrowColumnChunk
ArrowColumnWriterFactory 🔒
Creates ArrowColumnWriter instances
ArrowLeafColumn
A leaf column that can be encoded by ArrowColumnWriter
ArrowPageWriter 🔒
ArrowRowGroupWriter 🔒
Encodes [RecordBatch] to a parquet row group
ArrowRowGroupWriterFactory
Factory that creates new column writers for each row group in the Parquet file.
ArrowWriter
Encodes [RecordBatch] to parquet
ArrowWriterOptions
Arrow-specific configuration settings for writing parquet files.
InMemoryPageStore
The default PageStore, holding blobs on the heap in a Vec<Bytes>.
InMemoryPageStoreFactory
Factory for InMemoryPageStore, the default used by ArrowWriter.
PageKey
An opaque, store-allocated handle to a blob held by a PageStore.
PageStoreArgs
Context for a single PageStoreFactory::create call.
StreamingColumnChunkReader 🔒
A streaming Read over one column chunk's buffered pages, in final file order: the dictionary page (if any) first, then the data pages.
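
The ordering described for StreamingColumnChunkReader can be sketched with a small `std::io::Read` implementation. This is only an illustration under assumptions: `ChunkReader` is a hypothetical stand-in that keeps pages as plain `Vec<u8>` buffers, and it does not reflect how the private type actually stores or fetches its pages.

```rust
use std::io::{self, Read};

/// Hypothetical reader that yields a column chunk's bytes in file order:
/// the dictionary page (if any) first, then each data page in turn.
struct ChunkReader {
    pages: Vec<Vec<u8>>, // already arranged in final file order
    page_idx: usize,     // page currently being read
    offset: usize,       // read position inside that page
}

impl ChunkReader {
    fn new(dictionary_page: Option<Vec<u8>>, data_pages: Vec<Vec<u8>>) -> Self {
        let mut pages = Vec::with_capacity(data_pages.len() + 1);
        pages.extend(dictionary_page); // zero or one page
        pages.extend(data_pages);
        Self { pages, page_idx: 0, offset: 0 }
    }
}

impl Read for ChunkReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        while self.page_idx < self.pages.len() {
            let remaining = &self.pages[self.page_idx][self.offset..];
            if remaining.is_empty() {
                // Current page is exhausted (or empty): move to the next one.
                self.page_idx += 1;
                self.offset = 0;
                continue;
            }
            let n = remaining.len().min(buf.len());
            buf[..n].copy_from_slice(&remaining[..n]);
            self.offset += n;
            return Ok(n);
        }
        Ok(0) // all pages consumed
    }
}
```

Because the reader hands out at most one page's worth of bytes per call, a consumer never needs the whole column chunk in a single contiguous buffer.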

Enums

ArrowColumnWriterImpl 🔒

Traits

PageStore
A pluggable store for completed, serialized page blobs.
PageStoreFactory
Creates a fresh PageStore for each column chunk.
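
The relationship between a store, its factory, and the opaque keys it hands out might look roughly like the following sketch. All names here (`DemoPageKey`, `DemoPageStore`, `DemoInMemoryStore`, `DemoPageStoreFactory`, `DemoInMemoryFactory`) and all method signatures are hypothetical and chosen for illustration; the real PageStore and PageStoreFactory traits may differ, so consult their own pages for the actual methods.

```rust
/// Hypothetical opaque handle allocated by a store.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DemoPageKey(usize);

/// Hypothetical store for completed, serialized page blobs.
trait DemoPageStore {
    /// Takes ownership of a blob and returns a key for later retrieval.
    fn put(&mut self, blob: Vec<u8>) -> DemoPageKey;
    /// Looks up a blob by key; `None` if this store never issued the key.
    fn get(&self, key: DemoPageKey) -> Option<&[u8]>;
}

/// Heap-backed store: blobs live in a `Vec`, and the key is the index.
#[derive(Default)]
struct DemoInMemoryStore {
    blobs: Vec<Vec<u8>>,
}

impl DemoPageStore for DemoInMemoryStore {
    fn put(&mut self, blob: Vec<u8>) -> DemoPageKey {
        self.blobs.push(blob);
        DemoPageKey(self.blobs.len() - 1)
    }

    fn get(&self, key: DemoPageKey) -> Option<&[u8]> {
        self.blobs.get(key.0).map(|b| b.as_slice())
    }
}

/// Hypothetical factory: one fresh store per column chunk.
trait DemoPageStoreFactory {
    fn create(&self) -> Box<dyn DemoPageStore>;
}

struct DemoInMemoryFactory;

impl DemoPageStoreFactory for DemoInMemoryFactory {
    fn create(&self) -> Box<dyn DemoPageStore> {
        Box::new(DemoInMemoryStore::default())
    }
}
```

Creating a separate store per column chunk keeps keys local to that chunk, so a different backend (for example, one that spills to disk) could be swapped in without the writer knowing how blobs are held.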

Functions

compute_leaves
Computes the ArrowLeafColumn for a potentially nested [ArrayRef]
get_bool_array_slice 🔒
get_column_writers (Deprecated)
Returns ArrowColumnWriters for each column in a given schema
get_decimal_32_array_slice 🔒
get_decimal_64_array_slice 🔒
get_decimal_128_array_slice 🔒
get_decimal_256_array_slice 🔒
get_float_16_array_slice 🔒
get_fsb_array_slice 🔒
get_interval_dt_array_slice 🔒
Returns 12-byte values, each holding three 4-byte fields: months, days, and milliseconds. An Arrow DayTime interval stores only days and milliseconds, so the first 4 bytes are not populated.
get_interval_ym_array_slice 🔒
Returns 12-byte values, each holding three 4-byte fields: months, days, and milliseconds. An Arrow YearMonth interval stores only months, so only the first 4 bytes are populated.
write_leaf 🔒
write_primitive 🔒
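
The 12-byte interval layout described for get_interval_dt_array_slice and get_interval_ym_array_slice can be sketched as follows. The function names `day_time_to_parquet` and `year_month_to_parquet` are hypothetical, and the sketch assumes each 4-byte field is little-endian, as the Parquet INTERVAL type specifies; it is not the crate's private implementation.

```rust
/// Lays out an Arrow DayTime interval as 12 bytes: [months | days | millis].
/// DayTime carries no months, so bytes 0..4 stay zero.
fn day_time_to_parquet(days: i32, millis: i32) -> [u8; 12] {
    let mut out = [0u8; 12];
    out[4..8].copy_from_slice(&days.to_le_bytes());
    out[8..12].copy_from_slice(&millis.to_le_bytes());
    out
}

/// Lays out an Arrow YearMonth interval as 12 bytes: [months | days | millis].
/// YearMonth carries only months, so bytes 4..12 stay zero.
fn year_month_to_parquet(months: i32) -> [u8; 12] {
    let mut out = [0u8; 12];
    out[0..4].copy_from_slice(&months.to_le_bytes());
    out
}
```

For example, an interval of 1 day and 1000 milliseconds becomes four zero bytes, then `01 00 00 00`, then `E8 03 00 00`.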

Type Aliases

SharedColumnChunk 🔒
A shared ArrowColumnChunkData