pyarrow.parquet.write_metadata
- pyarrow.parquet.write_metadata(schema, where, metadata_collector=None, filesystem=None, **kwargs)
Write a metadata-only Parquet file from a schema. This can be used with write_to_dataset to generate the _common_metadata and _metadata sidecar files.
- Parameters:
- schema
pyarrow.Schema
- where
str or pyarrow.NativeFile
- metadata_collector
list
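List of collected file metadata, for example the list populated by write_to_dataset. When given, the row group information it holds is included in the written file.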
- filesystem
FileSystem, default None
If not given, the filesystem is inferred from where when it is path-like. If where is already a file-like object, no filesystem is needed.
- **kwargs
dict
Additional keyword arguments for the ParquetWriter class. See the ParquetWriter docstring for more information.
Examples
Generate example data:
>>> import pyarrow as pa
>>> table = pa.table({'n_legs': [2, 2, 4, 4, 5, 100],
...                   'animal': ["Flamingo", "Parrot", "Dog", "Horse",
...                              "Brittle stars", "Centipede"]})
Write a dataset and collect metadata information.
>>> metadata_collector = []
>>> import pyarrow.parquet as pq
>>> pq.write_to_dataset(
...     table, 'dataset_metadata',
...     metadata_collector=metadata_collector)
Write the _common_metadata Parquet file without row group statistics.
>>> pq.write_metadata(
...     table.schema, 'dataset_metadata/_common_metadata')
Write the _metadata Parquet file with row group statistics.
>>> pq.write_metadata(
...     table.schema, 'dataset_metadata/_metadata',
...     metadata_collector=metadata_collector)