pyarrow.csv.read_csv

pyarrow.csv.read_csv(input_file, read_options=None, parse_options=None, convert_options=None, MemoryPool memory_pool=None)

Read a Table from a stream of CSV data.

Parameters:
input_file : str, path or file-like object

The location of CSV data. If a string or path, and if it ends with a recognized compressed file extension (e.g. “.gz” or “.bz2”), the data is automatically decompressed when reading.

read_options : pyarrow.csv.ReadOptions, optional

Options for the CSV reader (see pyarrow.csv.ReadOptions constructor for defaults)

parse_options : pyarrow.csv.ParseOptions, optional

Options for the CSV parser (see pyarrow.csv.ParseOptions constructor for defaults)

convert_options : pyarrow.csv.ConvertOptions, optional

Options for converting CSV data (see pyarrow.csv.ConvertOptions constructor for defaults)

memory_pool : MemoryPool, optional

Pool to allocate Table memory from

Returns:
pyarrow.Table

Contents of the CSV file as an in-memory table.

Examples

Defining an example file from a bytes object:

>>> import io
>>> s = (
...     "animals,n_legs,entry\n"
...     "Flamingo,2,2022-03-01\n"
...     "Horse,4,2022-03-02\n"
...     "Brittle stars,5,2022-03-03\n"
...     "Centipede,100,2022-03-04"
... )
>>> print(s)
animals,n_legs,entry
Flamingo,2,2022-03-01
Horse,4,2022-03-02
Brittle stars,5,2022-03-03
Centipede,100,2022-03-04
>>> source = io.BytesIO(s.encode())

Reading from the file:

>>> from pyarrow import csv
>>> csv.read_csv(source)
pyarrow.Table
animals: string
n_legs: int64
entry: date32[day]
----
animals: [["Flamingo","Horse","Brittle stars","Centipede"]]
n_legs: [[2,4,5,100]]
entry: [[2022-03-01,2022-03-02,2022-03-03,2022-03-04]]