Module writer


Core functionality for writing Arrow arrays as Avro data

This module is the Avro writer implementation for the arrow-avro crate. It implements the primary writer interface and the record encoding logic.

§Overview

Use this module to serialize Arrow RecordBatch values into Avro. Two output formats are supported:

  • AvroWriter: writes an Object Container File (OCF), a self‑describing file with a header (schema JSON + metadata), optional compression, data blocks, and sync markers. See Avro 1.11.1 “Object Container Files.” https://avro.apache.org/docs/1.11.1/specification/#object-container-files
  • AvroStreamWriter: writes a raw Avro binary stream (“datum” bytes) without any container framing. This is useful when the schema is known out‑of‑band (e.g., via a registry) and you want minimal overhead.
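
To make the difference between raw datum bytes and container framing concrete, here is a minimal standard-library sketch based on the Avro 1.11.1 binary encoding rules. It is not the arrow-avro implementation. The helper names (`encode_long`, `encode_datum`, `frame_block`) and the two-field record layout (`id: long`, `name: string`) are hypothetical and chosen only for illustration.

```rust
/// Avro `long` encoding: zigzag, then 7 bits per byte, low bits first.
fn encode_long(value: i64, out: &mut Vec<u8>) {
    // Zigzag maps small negative numbers to small unsigned ones.
    let mut n = ((value << 1) ^ (value >> 63)) as u64;
    // The high bit of each byte signals that more bytes follow.
    while n >= 0x80 {
        out.push((n as u8 & 0x7F) | 0x80);
        n >>= 7;
    }
    out.push(n as u8);
}

/// Raw datum for a hypothetical record { id: long, name: string }.
/// Fields are written back to back in schema order, with no framing.
fn encode_datum(id: i64, name: &str) -> Vec<u8> {
    let mut out = Vec::new();
    encode_long(id, &mut out);
    // A string is its byte length (as a long) followed by UTF-8 bytes.
    encode_long(name.len() as i64, &mut out);
    out.extend_from_slice(name.as_bytes());
    out
}

/// OCF data block framing around already-encoded datums (no compression):
/// object count, payload size in bytes, payload, 16-byte sync marker.
fn frame_block(datums: &[Vec<u8>], sync: &[u8; 16]) -> Vec<u8> {
    let payload: Vec<u8> = datums.concat();
    let mut out = Vec::new();
    encode_long(datums.len() as i64, &mut out);
    encode_long(payload.len() as i64, &mut out);
    out.extend_from_slice(&payload);
    out.extend_from_slice(sync);
    out
}
```

In this sketch, the output of `encode_datum` corresponds to what a raw stream carries per record, while `frame_block` adds the per-block overhead that a container file carries on top of its header.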

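§Which format should I use?

  • Use OCF (AvroWriter) when you need a self‑describing file: the schema travels in the header alongside the data.
  • Use the raw stream (AvroStreamWriter) when the schema is known out‑of‑band and you want minimal overhead.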

§Choosing the Avro schema

By default, the writer converts your Arrow schema to Avro (including a top‑level record name) and stores the resulting JSON under the avro::schema metadata key. If you already have an Avro schema JSON that you want to use verbatim, put it into the Arrow schema metadata under the same key before constructing the writer. The builder will pick it up.
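
The selection rule described above can be sketched with a plain `HashMap` standing in for the Arrow schema metadata. This is an illustrative model only, not the builder's actual code. `resolve_schema_json` and the `derive` callback are hypothetical names, and the key is spelled exactly as in the paragraph above.

```rust
use std::collections::HashMap;

/// Metadata key, spelled as in the text above.
const SCHEMA_KEY: &str = "avro::schema";

/// Returns the Avro schema JSON to write.
/// A schema supplied in the metadata is used verbatim; otherwise the
/// `derive` callback (a stand-in for Arrow-to-Avro conversion) is called.
fn resolve_schema_json(
    metadata: &HashMap<String, String>,
    derive: impl FnOnce() -> String,
) -> String {
    match metadata.get(SCHEMA_KEY) {
        Some(json) => json.clone(),
        None => derive(),
    }
}
```

Under this model, inserting a JSON string under the key before building the writer is enough to bypass the derived schema, which matches the behavior the text describes.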

§Compression

For OCF, you may enable a compression codec via WriterBuilder::with_compression. The chosen codec is written into the file header and used for subsequent blocks. Raw stream writing doesn’t apply container‑level compression.
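
As a rough illustration of where the codec choice lives in a container file, the following standard-library sketch builds an OCF header as laid out in the Avro 1.11.1 specification: the magic bytes, a metadata map holding `avro.schema` and `avro.codec`, and a 16-byte sync marker. It is not the arrow-avro implementation, and `ocf_header` and its helpers are hypothetical names.

```rust
/// Avro `long` encoding: zigzag, then 7 bits per byte, low bits first.
fn encode_long(value: i64, out: &mut Vec<u8>) {
    let mut n = ((value << 1) ^ (value >> 63)) as u64;
    while n >= 0x80 {
        out.push((n as u8 & 0x7F) | 0x80);
        n >>= 7;
    }
    out.push(n as u8);
}

/// Avro `bytes`/`string` encoding: length as a long, then the raw bytes.
fn encode_bytes(bytes: &[u8], out: &mut Vec<u8>) {
    encode_long(bytes.len() as i64, out);
    out.extend_from_slice(bytes);
}

/// Builds an OCF header for the given schema JSON and codec name.
/// The spec requires readers to support the "null" and "deflate" codecs.
fn ocf_header(schema_json: &str, codec: &str, sync: &[u8; 16]) -> Vec<u8> {
    let mut out = Vec::new();
    // Magic: the ASCII bytes 'O', 'b', 'j', followed by the byte 1.
    out.extend_from_slice(b"Obj\x01");
    // File metadata is an Avro map: one block of two entries...
    encode_long(2, &mut out);
    for (key, value) in [("avro.schema", schema_json), ("avro.codec", codec)] {
        encode_bytes(key.as_bytes(), &mut out);
        encode_bytes(value.as_bytes(), &mut out);
    }
    // ...terminated by a block with a zero count.
    encode_long(0, &mut out);
    // The sync marker is repeated after every data block in the file.
    out.extend_from_slice(sync);
    out
}
```

Because the codec name is recorded once in the header, a reader can decide how to decompress every following block before it reads any data.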


Modules§

encoder
Avro encoder for Arrow types. Encodes RecordBatch values into the Avro binary format.
format
Logic for different Avro container file formats.

Structs§

Writer
Generic Avro writer.
WriterBuilder
Builder to configure and create a Writer.

Type Aliases§

AvroStreamWriter
Alias for a raw Avro binary stream writer.
AvroWriter
Alias for an Avro Object Container File writer.