pyarrow.concat_batches
- pyarrow.concat_batches(recordbatches, MemoryPool memory_pool=None)
Concatenate pyarrow.RecordBatch objects.
All record batches must share the same Schema. The operation copies the data in order to merge the arrays of the different RecordBatches.
- Parameters:
- recordbatches : iterable of pyarrow.RecordBatch objects
  Pyarrow record batches to concatenate into a single RecordBatch.
- memory_pool : MemoryPool, default None
  Memory pool to use for allocations, if required; otherwise the default pool is used.
Examples
>>> import pyarrow as pa
>>> t1 = pa.record_batch([
...     pa.array([2, 4, 5, 100]),
...     pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
...     ], names=['n_legs', 'animals'])
>>> t2 = pa.record_batch([
...     pa.array([2, 4]),
...     pa.array(["Parrot", "Dog"])
...     ], names=['n_legs', 'animals'])
>>> pa.concat_batches([t1,t2])
pyarrow.RecordBatch
n_legs: int64
animals: string
----
n_legs: [2,4,5,100,2,4]
animals: ["Flamingo","Horse","Brittle stars","Centipede","Parrot","Dog"]