pyarrow.csv.read_csv
- pyarrow.csv.read_csv(input_file, read_options=None, parse_options=None, convert_options=None, MemoryPool memory_pool=None)
Read a Table from a stream of CSV data.
- Parameters:
- input_file : str, path or file-like object
  The location of CSV data. If a string or path, and if it ends with a recognized compressed file extension (e.g. “.gz” or “.bz2”), the data is automatically decompressed when reading.
- read_options : pyarrow.csv.ReadOptions, optional
  Options for the CSV reader (see pyarrow.csv.ReadOptions constructor for defaults)
- parse_options : pyarrow.csv.ParseOptions, optional
  Options for the CSV parser (see pyarrow.csv.ParseOptions constructor for defaults)
- convert_options : pyarrow.csv.ConvertOptions, optional
  Options for converting CSV data (see pyarrow.csv.ConvertOptions constructor for defaults)
- memory_pool : MemoryPool, optional
  Pool to allocate Table memory from
- Returns:
pyarrow.Table
Contents of the CSV file as an in-memory table.
Examples
Defining an example file from a bytes object:
>>> import io
>>> s = (
...     "animals,n_legs,entry\n"
...     "Flamingo,2,2022-03-01\n"
...     "Horse,4,2022-03-02\n"
...     "Brittle stars,5,2022-03-03\n"
...     "Centipede,100,2022-03-04"
... )
>>> print(s)
animals,n_legs,entry
Flamingo,2,2022-03-01
Horse,4,2022-03-02
Brittle stars,5,2022-03-03
Centipede,100,2022-03-04
>>> source = io.BytesIO(s.encode())
Reading from the file:
>>> from pyarrow import csv
>>> csv.read_csv(source)
pyarrow.Table
animals: string
n_legs: int64
entry: date32[day]
----
animals: [["Flamingo","Horse","Brittle stars","Centipede"]]
n_legs: [[2,4,5,100]]
entry: [[2022-03-01,2022-03-02,2022-03-03,2022-03-04]]