pyarrow.ipc.RecordBatchFileWriter

class pyarrow.ipc.RecordBatchFileWriter(sink, schema, *, use_legacy_format=None, options=None)

Bases: _RecordBatchFileWriter

Writer that creates files in the Arrow binary file format.

Parameters:
sink : str, pyarrow.NativeFile, or file-like Python object

Either a file path, or a writable file object.

schema : pyarrow.Schema

The Arrow schema for data to be written to the file.

use_legacy_format : bool, default None

Deprecated in favor of setting options. Cannot be provided together with options.

If None, False will be used unless this default is overridden by setting the environment variable ARROW_PRE_0_15_IPC_FORMAT=1.

options : pyarrow.ipc.IpcWriteOptions

Options for IPC serialization.

If None, default values will be used: the legacy format will not be used unless overridden by setting the environment variable ARROW_PRE_0_15_IPC_FORMAT=1, and the V5 metadata version will be used unless overridden by setting the environment variable ARROW_PRE_1_0_METADATA_VERSION=1.

__init__(sink, schema, *, use_legacy_format=None, options=None)

Methods

__init__(sink, schema, *[, ...])

close(self)

Close stream and write end-of-stream 0 marker.

write(self, table_or_batch)

Write RecordBatch or Table to stream.

write_batch(self, RecordBatch batch[, ...])

Write RecordBatch to stream.

write_table(self, Table table[, max_chunksize])

Write Table to stream in (contiguous) RecordBatch objects.

Attributes

stats

Current IPC write statistics.

close(self)

Close stream and write end-of-stream 0 marker.

stats

Current IPC write statistics.

write(self, table_or_batch)

Write RecordBatch or Table to stream.

Parameters:
table_or_batch : {RecordBatch, Table}

write_batch(self, RecordBatch batch, custom_metadata=None)

Write RecordBatch to stream.

Parameters:
batch : RecordBatch

custom_metadata : mapping or KeyValueMetadata

Keys and values must be string-like / coercible to bytes.

write_table(self, Table table, max_chunksize=None)

Write Table to stream in (contiguous) RecordBatch objects.

Parameters:
table : Table

max_chunksize : int, default None

Maximum size for RecordBatch chunks. Individual chunks may be smaller depending on the chunk layout of individual columns.