pyarrow.CompressedInputStream#

class pyarrow.CompressedInputStream(stream, unicode compression)#

Bases: NativeFile

An input stream wrapper which decompresses data on the fly.

Parameters:
stream : str, path, pyarrow.NativeFile, or file-like object

Input stream object to wrap with decompression.

compression : str

The compression type (“bz2”, “brotli”, “gzip”, “lz4” or “zstd”).

Examples

Create an output stream which compresses the data:

>>> import pyarrow as pa
>>> data = b"Compressed stream"
>>> raw = pa.BufferOutputStream()
>>> with pa.CompressedOutputStream(raw, "gzip") as compressed:
...     compressed.write(data)
...
17

Create an input stream that decompresses the buffer holding the compressed data:

>>> cdata = raw.getvalue()
>>> with pa.input_stream(cdata, compression="gzip") as compressed:
...     compressed.read()
...
b'Compressed stream'

This is equivalent to using BufferReader and CompressedInputStream directly:

>>> raw = pa.BufferReader(cdata)
>>> with pa.CompressedInputStream(raw, "gzip") as compressed:
...     compressed.read()
...
b'Compressed stream'
__init__(*args, **kwargs)#

Methods

__init__(*args, **kwargs)

close(self)

download(self, stream_or_path[, buffer_size])

Read this file completely to a local path or destination stream.

fileno(self)

NOT IMPLEMENTED

flush(self)

Flush the stream, if applicable.

get_stream(self, file_offset, nbytes)

Return an input stream that reads a file segment independent of the state of the file.

isatty(self)

metadata(self)

Return file metadata

read(self[, nbytes])

Read and return up to nbytes bytes.

read1(self[, nbytes])

Read and return up to nbytes bytes.

read_at(self, nbytes, offset)

Read indicated number of bytes at offset from the file

read_buffer(self[, nbytes])

Read from buffer.

readable(self)

readall(self)

readinto(self, b)

Read into the supplied buffer

readline(self[, size])

NOT IMPLEMENTED.

readlines(self[, hint])

NOT IMPLEMENTED.

seek(self, int64_t position, int whence=0)

Change current file stream position

seekable(self)

size(self)

Return file size

tell(self)

Return current stream position

truncate(self)

NOT IMPLEMENTED

upload(self, stream[, buffer_size])

Write from a source stream to this file.

writable(self)

write(self, data)

Write data to the file.

writelines(self, lines)

Write lines to the file.

Attributes

closed

mode

The file mode.

close(self)#
closed#
download(self, stream_or_path, buffer_size=None)#

Read this file completely to a local path or destination stream.

This method first seeks to the beginning of the file.

Parameters:
stream_or_path : str or file-like object

If a string, a local file path to write to; otherwise, should be a writable stream.

buffer_size : int, optional

The buffer size to use for data transfers.

fileno(self)#

NOT IMPLEMENTED

flush(self)#

Flush the stream, if applicable.

An error is raised if stream is not writable.

get_stream(self, file_offset, nbytes)#

Return an input stream that reads a file segment independent of the state of the file.

Allows reading portions of a random access file as an input stream without interfering with each other.

Parameters:
file_offset : int
nbytes : int
Returns:
stream : NativeFile
isatty(self)#
metadata(self)#

Return file metadata

mode#

The file mode. Currently instances of NativeFile may support:

  • rb: binary read

  • wb: binary write

  • rb+: binary read and write

  • ab: binary append

read(self, nbytes=None)#

Read and return up to nbytes bytes.

If nbytes is None, then the entire remaining file contents are read.

Parameters:
nbytes : int, default None
Returns:
data : bytes
read1(self, nbytes=None)#

Read and return up to nbytes bytes.

Unlike read(), if nbytes is None then a chunk is read, not the entire file.

Parameters:
nbytes : int, default None

The maximum number of bytes to read.

Returns:
data : bytes
read_at(self, nbytes, offset)#

Read indicated number of bytes at offset from the file

Parameters:
nbytes : int
offset : int
Returns:
data : bytes
read_buffer(self, nbytes=None)#

Read from buffer.

Parameters:
nbytes : int, optional

Maximum number of bytes to read.

readable(self)#
readall(self)#
readinto(self, b)#

Read into the supplied buffer

Parameters:
b : buffer-like object

A writable buffer object (such as a bytearray).

Returns:
written : int

Number of bytes written.

readline(self, size=None)#

NOT IMPLEMENTED. Read and return a line of bytes from the file.

If size is specified, read at most size bytes.

Line terminator is always b"\n".

Parameters:
size : int

Maximum number of bytes to read.

readlines(self, hint=None)#

NOT IMPLEMENTED. Read lines of the file.

Parameters:
hint : int

Maximum number of bytes to read before stopping.

seek(self, int64_t position, int whence=0)#

Change current file stream position

Parameters:
position : int

Byte offset, interpreted relative to value of whence argument

whence : int, default 0

Point of reference for seek offset

Returns:
int

The new absolute stream position.

Notes

Values of whence:

  • 0: start of stream (the default); offset should be zero or positive

  • 1: current stream position; offset may be negative

  • 2: end of stream; offset is usually negative

seekable(self)#
size(self)#

Return file size

tell(self)#

Return current stream position

truncate(self)#

NOT IMPLEMENTED

upload(self, stream, buffer_size=None)#

Write from a source stream to this file.

Parameters:
stream : file-like object

Source stream to pipe to this file.

buffer_size : int, optional

The buffer size to use for data transfers.

writable(self)#
write(self, data)#

Write data to the file.

Parameters:
data : bytes-like object or exporter of buffer protocol
Returns:
int

nbytes: number of bytes written

writelines(self, lines)#

Write lines to the file.

Parameters:
lines : iterable

Iterable of bytes-like objects or exporters of buffer protocol