pyarrow.csv.CSVStreamingReader

class pyarrow.csv.CSVStreamingReader

Bases: RecordBatchReader

An object that reads record batches incrementally from a CSV file.

Should not be instantiated directly by user code.

Methods

`__init__`(*args, **kwargs)
`cast`(self, target_schema)	Wrap this reader with one that casts each batch lazily as it is pulled.
`close`(self)	Release any resources associated with the reader.
`from_batches`(Schema schema, batches)	Create RecordBatchReader from an iterable of batches.
`from_stream`(data[, schema])	Create RecordBatchReader from an Arrow-compatible stream object.
`iter_batches_with_custom_metadata`(self)	Iterate over record batches from the stream along with their custom metadata.
`read_all`(self)	Read all record batches as a pyarrow.Table.
`read_next_batch`(self)	Read next RecordBatch from the stream.
`read_next_batch_with_custom_metadata`(self)	Read next RecordBatch from the stream along with its custom metadata.
`read_pandas`(self, **options)	Read contents of stream to a pandas.DataFrame.

Attributes

cast(self, target_schema)

Wrap this reader with one that casts each batch lazily as it is pulled. Currently only a safe cast to target_schema is implemented.

Parameters:

target_schema (Schema): Schema to cast to; the names and order of fields must match.

Returns:

RecordBatchReader

static from_batches(Schema schema, batches)

Create RecordBatchReader from an iterable of batches.

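Parameters:

schema (Schema): The shared schema of the record batches.
batches (iterable of RecordBatch): The batches that this reader will return.

Returns:

RecordBatchReader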

static from_stream(data, schema=None)

Create RecordBatchReader from an Arrow-compatible stream object.

This accepts objects implementing the Arrow PyCapsule Protocol for streams, i.e. objects that have a __arrow_c_stream__ method.

Parameters:

data (Arrow-compatible stream object): Any object that implements the Arrow PyCapsule Protocol for streams.
schema (Schema, default None): The schema to which the stream should be cast, if supported by the stream object.

Returns:

RecordBatchReader

iter_batches_with_custom_metadata(self)

Iterate over record batches from the stream along with their custom metadata.

Yields:

read_all(self)

Read all record batches as a pyarrow.Table.

Returns:

Table

read_next_batch(self)