Tabular File Formats

CSV Files

`ConvertOptions`([check_utf8, column_types, ...])	Options for converting CSV data.
`CSVStreamingReader`()	An object that reads record batches incrementally from a CSV file.
`CSVWriter`(sink, Schema schema, ...)	Writer to create a CSV file.
`ISO8601`	A special object indicating ISO-8601 parsing.
`ParseOptions`([delimiter, quote_char, ...])	Options for parsing CSV files.
`ReadOptions`([use_threads, block_size, ...])	Options for reading CSV files.
`WriteOptions`([include_header, batch_size, ...])	Options for writing CSV files.
`open_csv`(input_file[, read_options, ...])	Open a streaming reader of CSV data.
`read_csv`(input_file[, read_options, ...])	Read a Table from a stream of CSV data.
`write_csv`(data, output_file[, write_options])	Write record batch or table to a CSV file.
`InvalidRow`(expected_columns, actual_columns, ...)	Description of an invalid row in a CSV file.

Feather Files

`read_feather`(source[, columns, use_threads, ...])	Read a pandas.DataFrame from Feather format.
`read_table`(source[, columns, memory_map, ...])	Read a pyarrow.Table from Feather format.
`write_feather`(df, dest[, compression, ...])	Write a pandas.DataFrame to Feather format.

JSON Files

`ReadOptions`([use_threads, block_size])	Options for reading JSON files.
`ParseOptions`([explicit_schema, ...])	Options for parsing JSON files.
`open_json`(input_file[, read_options, ...])	Open a streaming reader of JSON data.
`read_json`(input_file[, read_options, ...])	Read a Table from a stream of JSON data.

Parquet Files

`ParquetDataset`(path_or_paths[, filesystem, ...])	Encapsulates details of reading a complete Parquet dataset possibly consisting of multiple files and partitions in subdirectories.
`ParquetFile`(source, *[, metadata, ...])	Reader interface for a single Parquet file.
`ParquetWriter`(where, schema[, filesystem, ...])	Class for incrementally building a Parquet file for Arrow tables.
`read_table`(source, *[, columns, ...])	Read a Table from Parquet format.
`read_metadata`(where[, memory_map, ...])	Read FileMetaData from footer of a single Parquet file.
`read_pandas`(source[, columns])	Read a Table from Parquet format, also reading DataFrame index values if known in the file metadata.
`read_schema`(where[, memory_map, ...])	Read effective Arrow schema from Parquet file metadata.
`write_metadata`(schema, where[, ...])	Write metadata-only Parquet file from schema.
`write_table`(table, where[, row_group_size, ...])	Write a Table to Parquet format.
`write_to_dataset`(table, root_path[, ...])	Wrapper around dataset.write_dataset for writing a Table to Parquet format by partitions.

Parquet Metadata

`FileMetaData`()	Parquet metadata for a single file.
`RowGroupMetaData`()	Metadata for a single row group.
`SortingColumn`(int column_index, ...)	Sorting specification for a single column.
`ColumnChunkMetaData`()	Column metadata for a single row group.
`Statistics`()	Statistics for a single column in a single row group.
`ParquetSchema`	A Parquet schema.
`ColumnSchema`	Schema for a single column.
`ParquetLogicalType`()	Logical type of a Parquet type.

Parquet Encryption

`CryptoFactory`(kms_client_factory)	A factory that produces the low-level FileEncryptionProperties and FileDecryptionProperties objects, from the high-level parameters.
`KmsClient`()	The abstract base class for KmsClient implementations.
`KmsConnectionConfig`([kms_instance_id, ...])	Configuration of the connection to the Key Management Service (KMS).
`EncryptionConfiguration`(footer_key[, ...])	Configuration of the encryption, such as which columns to encrypt.
`DecryptionConfiguration`([cache_lifetime])	Configuration of the decryption, such as cache timeout.

ORC Files

`ORCFile`(source)	Reader interface for a single ORC file.
`ORCWriter`(where, *[, file_version, ...])	Writer interface for a single ORC file.
`read_table`(source[, columns, filesystem])	Read a Table from an ORC file.
`write_table`(table, where, *[, file_version, ...])	Write a table into an ORC file.