Module writer


Core functionality for writing Arrow arrays as Avro data.

This module is the Avro writer implementation for the arrow-avro crate. It provides the primary writer interface and the record encoding logic.

§Overview

Use this module to serialize Arrow RecordBatch values into Avro. Two output formats are supported:

  • AvroWriter — writes an Object Container File (OCF): a self‑describing file with header (schema JSON + metadata), optional compression, data blocks, and sync markers. See Avro 1.11.1 “Object Container Files.” https://avro.apache.org/docs/1.11.1/specification/#object-container-files
  • AvroStreamWriter — writes a Single Object Encoding (SOE) stream (“datum” bytes) without any container framing. This is useful when the schema is known out‑of‑band (e.g., via a schema registry) and you want minimal overhead.
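
The difference between the two formats shows up in the first bytes of the output. The sketch below is not part of this crate; it only illustrates the framing defined by the Avro specification: an OCF begins with the 4‑byte magic `Obj\x01`, while a single‑object‑encoded datum begins with the 2‑byte marker `C3 01` followed by an 8‑byte little‑endian CRC‑64‑AVRO schema fingerprint. The names `AvroFraming` and `detect_framing` are hypothetical, and whether AvroStreamWriter emits the SOE prefix in a given configuration is not stated on this page.

```rust
/// Framing inferred from the leading bytes of an Avro payload.
/// (Hypothetical helper type, not part of arrow-avro.)
#[derive(Debug, PartialEq)]
pub enum AvroFraming {
    /// Object Container File: starts with the magic bytes `Obj\x01`.
    ObjectContainerFile,
    /// Single Object Encoding: marker `C3 01`, then the schema fingerprint.
    SingleObject { fingerprint: u64 },
    /// Neither prefix matched (for example, a bare datum with no header).
    Unknown,
}

/// Magic bytes that open every Object Container File.
const OCF_MAGIC: [u8; 4] = [b'O', b'b', b'j', 1];
/// Two-byte marker that opens a single-object-encoded datum.
const SOE_MARKER: [u8; 2] = [0xC3, 0x01];

/// Classify a byte slice by its Avro framing prefix.
pub fn detect_framing(bytes: &[u8]) -> AvroFraming {
    if bytes.starts_with(&OCF_MAGIC) {
        return AvroFraming::ObjectContainerFile;
    }
    // The SOE header is 10 bytes: 2 marker bytes + 8 fingerprint bytes.
    if bytes.starts_with(&SOE_MARKER) && bytes.len() >= 10 {
        let mut fp = [0u8; 8];
        fp.copy_from_slice(&bytes[2..10]);
        return AvroFraming::SingleObject {
            fingerprint: u64::from_le_bytes(fp),
        };
    }
    AvroFraming::Unknown
}
```

A reader that receives bytes of unknown origin could call `detect_framing` first and then either parse the OCF header or look up the schema by fingerprint.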

§Which format should you use?

§Choosing the Avro schema

By default, the writer converts your Arrow schema to Avro (including a top‑level record name). If you already have an Avro schema JSON you want to use verbatim, put it into the Arrow schema metadata under the avro.schema key before constructing the writer. The builder will use that schema instead of generating a new one (unless strip_metadata is set to true in the options).
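
The selection rule described above can be modeled in a few lines. This is a standalone sketch of the rule as stated in the paragraph, not the crate's implementation: schema metadata is represented as a plain `HashMap`, and `SchemaSource` and `choose_schema` are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;

/// Metadata key under which a verbatim Avro schema JSON is stored.
pub const AVRO_SCHEMA_KEY: &str = "avro.schema";

/// Where the writer's Avro schema comes from. (Hypothetical type.)
#[derive(Debug, PartialEq)]
pub enum SchemaSource {
    /// Use the JSON found in the Arrow schema metadata as-is.
    Provided(String),
    /// Derive an Avro schema from the Arrow schema.
    Generated,
}

/// Model of the rule: prefer the embedded schema unless metadata is stripped.
pub fn choose_schema(
    metadata: &HashMap<String, String>,
    strip_metadata: bool,
) -> SchemaSource {
    if strip_metadata {
        // Assumption: stripping metadata means the embedded schema is ignored.
        return SchemaSource::Generated;
    }
    match metadata.get(AVRO_SCHEMA_KEY) {
        Some(json) => SchemaSource::Provided(json.clone()),
        None => SchemaSource::Generated,
    }
}
```

In practice the metadata map would be the one attached to the Arrow schema passed to the builder. The model only makes the precedence explicit.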

§Compression

For OCF, you may enable a compression codec via WriterBuilder::with_compression. The chosen codec is written into the file header and used for subsequent blocks. SOE stream writing doesn’t apply container‑level compression.
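
To see why compression is a container‑level concern, it helps to look at how an OCF data block is framed. Per the Avro specification, each block is a zigzag varint row count, a zigzag varint byte length of the (possibly compressed) payload, the payload itself, and the file's 16‑byte sync marker. The sketch below is a standalone illustration of that layout, not this crate's encoder; `encode_long` and `frame_block` are hypothetical names, and the payload is assumed to have been compressed already with the codec named in the header.

```rust
/// Append an Avro `long` to `out` using zigzag + variable-length encoding.
pub fn encode_long(value: i64, out: &mut Vec<u8>) {
    // Zigzag maps small negative and positive values to small unsigned ones.
    let mut n = ((value << 1) ^ (value >> 63)) as u64;
    // Emit 7 bits per byte, setting the high bit while more bytes follow.
    while n >= 0x80 {
        out.push((n as u8 & 0x7F) | 0x80);
        n >>= 7;
    }
    out.push(n as u8);
}

/// Frame one OCF data block around an already-encoded payload.
///
/// `payload` holds the serialized rows after the header's codec has been
/// applied (or the raw bytes when no compression is in use).
pub fn frame_block(row_count: i64, payload: &[u8], sync: &[u8; 16]) -> Vec<u8> {
    let mut block = Vec::with_capacity(payload.len() + 36);
    encode_long(row_count, &mut block); // number of rows in this block
    encode_long(payload.len() as i64, &mut block); // payload size in bytes
    block.extend_from_slice(payload); // (compressed) row data
    block.extend_from_slice(sync); // sync marker from the file header
    block
}
```

Because the codec applies to the payload of each block, a stream of bare datums has no equivalent place to apply it, which matches the note above that SOE output is not compressed at the container level.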


Modules§

encoder 🔒
Avro encoder for Arrow types. Encodes RecordBatch values into the Avro binary format.
format
Avro writer formats for Arrow. Contains the logic for the different Avro output formats.

Structs§

Writer
Generic Avro writer.
WriterBuilder
Builder to configure and create a Writer.

Type Aliases§

AvroStreamWriter
Alias for an Avro Single Object Encoding stream writer.
AvroWriter
Alias for an Avro Object Container File writer.