Struct Encoder 

pub struct Encoder {
    schema: SchemaRef,
    encoder: RecordEncoder,
    row_capacity: Option<usize>,
    buffer: BytesMut,
    offsets: Vec<usize>,
}

A row-by-row encoder for Avro stream/message formats: single-object encoding (SOE), schema-registry wire formats, or raw binary.

Unlike Writer, which emits a single continuous byte stream to a std::io::Write sink, Encoder tracks row boundaries during encoding and returns an EncodedRows containing:

  • one backing buffer (Bytes)
  • row boundary offsets

This enables zero-copy per-row payloads (for instance, one Kafka message per Arrow row) without re-encoding or decoding the byte stream to recover record boundaries.
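The "one buffer plus offsets" layout can be illustrated with a small std-only sketch. This is not the crate's implementation: `RowsSketch` is a hypothetical stand-in for EncodedRows, it uses `Arc<[u8]>` instead of Bytes, and it assumes the offsets vector holds `rows + 1` boundaries, so that row `i` spans `offsets[i]..offsets[i + 1]`.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `EncodedRows`: one shared buffer plus row boundaries.
/// Assumption: `offsets` has `rows + 1` entries and is non-decreasing.
pub struct RowsSketch {
    data: Arc<[u8]>,
    offsets: Vec<usize>,
}

impl RowsSketch {
    pub fn new(data: Vec<u8>, offsets: Vec<usize>) -> Self {
        Self { data: data.into(), offsets }
    }

    /// Number of rows described by the boundaries.
    pub fn len(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    /// Borrows row `i` from the shared buffer without copying.
    /// Returns `None` if `i` is out of range or the boundaries are invalid.
    pub fn row(&self, i: usize) -> Option<&[u8]> {
        let start = *self.offsets.get(i)?;
        let end = *self.offsets.get(i + 1)?;
        self.data.get(start..end)
    }

    /// Iterates over all rows as borrowed slices.
    pub fn iter(&self) -> impl Iterator<Item = &[u8]> + '_ {
        (0..self.len()).filter_map(move |i| self.row(i))
    }
}
```

Each row is only a view into the shared allocation, so producing one message per row costs a bounds lookup rather than a re-encode or a scan of the byte stream.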

§Example

use std::sync::Arc;
use arrow_array::{ArrayRef, Int32Array, RecordBatch};
use arrow_schema::{DataType, Field, Schema};
use arrow_avro::writer::{WriterBuilder, format::AvroSoeFormat};
use arrow_avro::schema::FingerprintStrategy;

let schema = Schema::new(vec![Field::new("value", DataType::Int32, false)]);
let batch = RecordBatch::try_new(
    Arc::new(schema.clone()),
    vec![Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef],
)?;

// Configure the encoder (here: Confluent Wire Format with schema ID 100)
let mut encoder = WriterBuilder::new(schema)
    .with_fingerprint_strategy(FingerprintStrategy::Id(100))
    .build_encoder::<AvroSoeFormat>()?;

// Encode the batch
encoder.encode(&batch)?;

// Get the encoded rows
let rows = encoder.flush();

// Convert to owned Vec<u8> payloads (e.g., for a Kafka producer)
let payloads: Vec<Vec<u8>> = rows.iter().map(|row| row.to_vec()).collect();

assert_eq!(payloads.len(), 3);
assert_eq!(payloads[0][0], 0x00); // Magic byte
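The magic-byte assertion above reflects the Confluent wire format framing: a `0x00` byte, a 4-byte big-endian schema ID, then the Avro binary body. As a minimal sketch of how a consumer might split such a payload, the helper below uses only the standard library; `split_confluent_frame` is a hypothetical name and not part of arrow-avro.

```rust
/// Splits a Confluent-framed payload into `(schema_id, avro_body)`.
/// Hypothetical helper for illustration; not an arrow-avro API.
/// Returns `None` if the payload is shorter than the 5-byte header
/// or does not start with the 0x00 magic byte.
pub fn split_confluent_frame(payload: &[u8]) -> Option<(u32, &[u8])> {
    if payload.len() < 5 || payload[0] != 0x00 {
        return None;
    }
    // Bytes 1..5 carry the schema ID in big-endian order.
    let id = u32::from_be_bytes([payload[1], payload[2], payload[3], payload[4]]);
    Some((id, &payload[5..]))
}
```

For a hand-built payload `[0x00, 0x00, 0x00, 0x00, 0x64, 0x02]`, this yields schema ID 100 and a one-byte body `0x02`, which is the Avro zigzag varint encoding of the integer 1.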

Fields§

schema: SchemaRef
encoder: RecordEncoder
row_capacity: Option<usize>
buffer: BytesMut
offsets: Vec<usize>

Implementations§

impl Encoder

pub fn encode(&mut self, batch: &RecordBatch) -> Result<(), AvroError>

Serialize one RecordBatch into the internal buffer.

pub fn encode_batches(&mut self, batches: &[RecordBatch]) -> Result<(), AvroError>

A convenience method to write a slice of RecordBatch values.

pub fn flush(&mut self) -> EncodedRows

Drain and return all currently buffered encoded rows.

The returned EncodedRows provides per-row payloads as Bytes slices.
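The drain semantics can be sketched with plain standard-library types. This is an illustration under stated assumptions, not the crate's code: `BufferSketch` and `push_row` are hypothetical, `Vec<u8>` stands in for BytesMut, and the offsets vector is assumed to start with a leading `0` boundary.

```rust
/// Hypothetical, simplified encoder state; field names mirror the struct
/// above, but the types and logic are assumptions made for illustration.
pub struct BufferSketch {
    buffer: Vec<u8>,
    offsets: Vec<usize>,
}

impl BufferSketch {
    pub fn new() -> Self {
        // Start with a single boundary at 0, meaning "no rows yet".
        Self { buffer: Vec::new(), offsets: vec![0] }
    }

    /// Appends one already-encoded row and records where it ends.
    pub fn push_row(&mut self, row: &[u8]) {
        self.buffer.extend_from_slice(row);
        self.offsets.push(self.buffer.len());
    }

    /// Number of rows currently buffered.
    pub fn buffered_len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Hands the buffered bytes and boundaries to the caller and
    /// leaves the state empty so it can be reused.
    pub fn flush(&mut self) -> (Vec<u8>, Vec<usize>) {
        let data = std::mem::take(&mut self.buffer);
        let offsets = std::mem::replace(&mut self.offsets, vec![0]);
        (data, offsets)
    }
}
```

In this sketch, a second `flush` with no rows encoded in between returns an empty buffer, and `buffered_len` reads 0 right after a flush. A caller could use that counter to decide when to flush, for example once a batch-size threshold is reached.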

pub fn schema(&self) -> SchemaRef

Returns the Arrow schema used by this encoder.

The returned schema includes metadata with the Avro schema JSON under the avro.schema key.

pub fn buffered_len(&self) -> usize

Returns the number of encoded rows currently buffered.

Trait Implementations§

impl Debug for Encoder

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Same for T

type Output = T

Should always be Self.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,