pyarrow.Codec
- class pyarrow.Codec(unicode compression, compression_level=None)
Bases: _Weakrefable
Compression codec.
- Parameters:
- compression : str
Type of compression codec to initialize. Valid values are: ‘gzip’, ‘bz2’, ‘brotli’, ‘lz4’ (or ‘lz4_frame’), ‘lz4_raw’, ‘zstd’ and ‘snappy’.
- compression_level : int or None
Optional parameter specifying how aggressively to compress. The possible ranges and effect of this parameter depend on the specific codec chosen. Higher values compress more but typically use more resources (CPU/RAM). Some codecs support negative values.
- gzip
The compression_level maps to the memlevel parameter of deflateInit2. Higher levels use more RAM but are faster and should have higher compression ratios.
- bz2
The compression level maps to the blockSize100k parameter of the BZ2_bzCompressInit function. Higher levels use more RAM but are faster and should have higher compression ratios.
- brotli
The compression level maps to the BROTLI_PARAM_QUALITY parameter. Higher values are slower and should have higher compression ratios.
- lz4/lz4_frame/lz4_raw
The compression level parameter is not supported and must be None.
- zstd
The compression level maps to the compressionLevel parameter of ZSTD_initCStream. Negative values are supported. Higher values are slower and should have higher compression ratios.
- snappy
The compression level parameter is not supported and must be None.
- Raises:
ValueError
If an invalid compression value is passed.
Examples
>>> import pyarrow as pa
>>> pa.Codec.is_available('gzip')
True
>>> codec = pa.Codec('gzip')
>>> codec.name
'gzip'
>>> codec.compression_level
9
- __init__(*args, **kwargs)
Methods
- __init__(*args, **kwargs)
- compress(self, buf[, asbytes, memory_pool]): Compress data from buffer-like object.
- decompress(self, buf[, decompressed_size, ...]): Decompress data from buffer-like object.
- default_compression_level(unicode compression): Returns the compression level that Arrow will use for the codec if None is specified.
- detect(path): Detect and instantiate compression codec based on file extension.
- is_available(unicode compression): Returns whether the compression support has been built and enabled.
- maximum_compression_level(unicode compression): Returns the largest valid value for the compression level.
- minimum_compression_level(unicode compression): Returns the smallest valid value for the compression level.
- supports_compression_level(unicode compression): Returns true if the compression level parameter is supported for the given codec.
Attributes
- compression_level: Returns the compression level parameter of the codec.
- name: Returns the name of the codec.
- compress(self, buf, asbytes=False, memory_pool=None)
Compress data from buffer-like object.
- Parameters:
- buf : pyarrow.Buffer, bytes, or other object supporting buffer protocol
- asbytes : bool, default False
Return result as Python bytes object, otherwise Buffer.
- memory_pool : MemoryPool, default None
Memory pool to use for buffer allocations, if any.
- Returns:
- compressed : pyarrow.Buffer or bytes (if asbytes=True)
- compression_level
Returns the compression level parameter of the codec.
- decompress(self, buf, decompressed_size=None, asbytes=False, memory_pool=None)
Decompress data from buffer-like object.
- Parameters:
- buf : pyarrow.Buffer, bytes, or memoryview-compatible object
- decompressed_size : int, default None
If not specified, will be computed if the codec is able to determine the uncompressed buffer size.
- asbytes : bool, default False
Return result as Python bytes object, otherwise Buffer.
- memory_pool : MemoryPool, default None
Memory pool to use for buffer allocations, if any.
- Returns:
- uncompressed : pyarrow.Buffer or bytes (if asbytes=True)
- static default_compression_level(unicode compression)
Returns the compression level that Arrow will use for the codec if None is specified.
- Parameters:
- compression : str
Type of compression codec, refer to Codec docstring for a list of supported ones.
- static detect(path)
Detect and instantiate compression codec based on file extension.
- Parameters:
- path : str, path-like
File path to detect compression from.
- Returns:
pyarrow.Codec
- Raises:
TypeError
If the passed value is not path-like.
ValueError
If the compression can’t be detected from the path.
- static is_available(unicode compression)
Returns whether the compression support has been built and enabled.
- static maximum_compression_level(unicode compression)
Returns the largest valid value for the compression level.
- Parameters:
- compression : str
Type of compression codec, refer to Codec docstring for a list of supported ones.
- static minimum_compression_level(unicode compression)
Returns the smallest valid value for the compression level.
- Parameters:
- compression : str
Type of compression codec, refer to Codec docstring for a list of supported ones.
- name
Returns the name of the codec.