pyarrow.Codec
-
class pyarrow.Codec(unicode compression, compression_level=None)
Bases: pyarrow.lib._Weakrefable
Compression codec.
- Parameters
compression (str) – Type of compression codec to initialize, valid values are: ‘gzip’, ‘bz2’, ‘brotli’, ‘lz4’ (or ‘lz4_frame’), ‘lz4_raw’, ‘zstd’ and ‘snappy’.
compression_level (int, None) – Optional parameter specifying how aggressively to compress. The possible ranges and effect of this parameter depend on the specific codec chosen. Higher values compress more but typically use more resources (CPU/RAM). Some codecs support negative values.
- gzip
The compression_level maps to the memlevel parameter of deflateInit2. Higher levels use more RAM but are faster and should have higher compression ratios.
- bz2
The compression level maps to the blockSize100k parameter of the BZ2_bzCompressInit function. Higher levels use more RAM but are faster and should have higher compression ratios.
- brotli
The compression level maps to the BROTLI_PARAM_QUALITY parameter. Higher values are slower and should have higher compression ratios.
- lz4/lz4_frame/lz4_raw
The compression level parameter is not supported and must be None.
- zstd
The compression level maps to the compressionLevel parameter of ZSTD_initCStream. Negative values are supported. Higher values are slower and should have higher compression ratios.
- snappy
The compression level parameter is not supported and must be None.
- Raises
ValueError – If an invalid compression value is passed.
-
__init__(*args, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(*args, **kwargs) – Initialize self.
compress(self, buf[, asbytes, memory_pool]) – Compress data from buffer-like object.
decompress(self, buf[, decompressed_size, …]) – Decompress data from buffer-like object.
default_compression_level(unicode compression) – Returns the compression level that Arrow will use for the codec if None is specified.
detect(path) – Detect and instantiate compression codec based on file extension.
is_available(unicode compression) – Returns whether the compression support has been built and enabled.
maximum_compression_level(unicode compression) – Returns the largest valid value for the compression level.
minimum_compression_level(unicode compression) – Returns the smallest valid value for the compression level.
supports_compression_level(unicode compression) – Returns true if the compression level parameter is supported for the given codec.
Attributes
compression_level – Returns the compression level parameter of the codec.
name – Returns the name of the codec.
-
compress(self, buf, asbytes=False, memory_pool=None)
Compress data from buffer-like object.
- Parameters
buf (pyarrow.Buffer, bytes, or other object supporting buffer protocol) –
asbytes (bool, default False) – Return result as Python bytes object, otherwise Buffer
memory_pool (MemoryPool, default None) – Memory pool to use for buffer allocations, if any
- Returns
compressed (pyarrow.Buffer or bytes (if asbytes=True))
-
compression_level
Returns the compression level parameter of the codec.
-
decompress(self, buf, decompressed_size=None, asbytes=False, memory_pool=None)
Decompress data from buffer-like object.
- Parameters
buf (pyarrow.Buffer, bytes, or memoryview-compatible object) –
decompressed_size (int64_t, default None) – If not specified, will be computed if the codec is able to determine the uncompressed buffer size.
asbytes (bool, default False) – Return result as Python bytes object, otherwise Buffer
memory_pool (MemoryPool, default None) – Memory pool to use for buffer allocations, if any.
- Returns
uncompressed (pyarrow.Buffer or bytes (if asbytes=True))
-
static default_compression_level(unicode compression)
Returns the compression level that Arrow will use for the codec if None is specified.
- Parameters
compression (str) – Type of compression codec, refer to Codec docstring for a list of supported ones.
-
static detect(path)
Detect and instantiate compression codec based on file extension.
- Parameters
path (str, path-like) – File-path to detect compression from.
- Raises
TypeError – If the passed value is not path-like.
ValueError – If the compression can’t be detected from the path.
- Returns
Codec
-
static is_available(unicode compression)
Returns whether the compression support has been built and enabled.
- Parameters
compression (str) – Type of compression codec, refer to Codec docstring for a list of supported ones.
- Returns
bool
-
static maximum_compression_level(unicode compression)
Returns the largest valid value for the compression level.
- Parameters
compression (str) – Type of compression codec, refer to Codec docstring for a list of supported ones.
-
static minimum_compression_level(unicode compression)
Returns the smallest valid value for the compression level.
- Parameters
compression (str) – Type of compression codec, refer to Codec docstring for a list of supported ones.
-
name
Returns the name of the codec.
-
static supports_compression_level(unicode compression)
Returns true if the compression level parameter is supported for the given codec.
- Parameters
compression (str) – Type of compression codec, refer to Codec docstring for a list of supported ones.