pyarrow.parquet.ParquetWriter

class pyarrow.parquet.ParquetWriter(where, schema, flavor=None, version='1.0', use_dictionary=True, compression='snappy', use_deprecated_int96_timestamps=None, **options)

Bases: object

Class for incrementally building a Parquet file from Arrow tables.

Parameters:
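  • where (path or file-like object) – Destination file path, or a writable file-like object, for the Parquet output.
  • schema (arrow Schema) – Schema of the tables that will be written.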
  • version ({"1.0", "2.0"}, default "1.0") – The Parquet format version to write.
  • use_dictionary (bool or list) – Whether to use dictionary encoding. Pass a bool to apply the setting to all columns, or a list of column names to dictionary-encode only those columns.
  • use_deprecated_int96_timestamps (boolean, default None) – Write nanosecond-resolution timestamps in the INT96 Parquet format. Defaults to False unless enabled by the flavor argument.
  • coerce_timestamps (string, default None) – Cast timestamps to a particular resolution. Valid values: {None, 'ms', 'us'}.
  • compression (str or dict) – The compression codec, given either as a single codec name for all columns or as a dict mapping column names to codecs.
  • flavor ({'spark'}, default None) – Sanitize the schema or set other options for compatibility with the named system.
__init__(where, schema, flavor=None, version='1.0', use_dictionary=True, compression='snappy', use_deprecated_int96_timestamps=None, **options)

Methods

__init__(where, schema[, flavor, version, …])
close()
write_table(table[, row_group_size])
close()
write_table(table, row_group_size=None)