pyarrow.PythonFile

class pyarrow.PythonFile

Bases: pyarrow.lib.NativeFile

A stream backed by a Python file object.

This class allows Python file objects to be used with arbitrary Arrow functions, including functions written in languages other than Python.

As a downside, there is a non-zero redirection cost in translating Arrow stream calls to Python method calls. Furthermore, Python’s Global Interpreter Lock may limit parallelism in some situations.

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

close(self)

download(self, stream_or_path[, buffer_size])

Read file completely to local path (rather than reading completely into memory).

fileno(self)

NOT IMPLEMENTED

flush(self)

Flush the stream, if applicable.

isatty(self)

metadata(self)

Return file metadata

read(self[, nbytes])

Read indicated number of bytes from file, or read all remaining bytes if no argument passed

read1(self[, nbytes])

Read and return up to nbytes bytes.

read_at(self, nbytes, offset)

Read indicated number of bytes at offset from the file

read_buffer(self[, nbytes])

readable(self)

readall(self)

readinto(self, b)

Read into the supplied buffer

readline(self[, size])

readlines(self[, hint])

seek(self, int64_t position, int whence=0)

Change current file stream position

seekable(self)

size(self)

Return file size

tell(self)

Return current stream position

truncate(self[, pos])

upload(self, stream[, buffer_size])

Pipe file-like object to file

writable(self)

write(self, data)

Write bytes from any object implementing the buffer protocol (bytes, bytearray, ndarray, pyarrow.Buffer)

writelines(self, lines)

Attributes

closed

mode

The file mode.

close(self)
closed
download(self, stream_or_path, buffer_size=None)

Read file completely to local path (rather than reading completely into memory). First seeks to the beginning of the file.

fileno(self)

NOT IMPLEMENTED

flush(self)

Flush the stream, if applicable.

An error is raised if stream is not writable.

isatty(self)
metadata(self)

Return file metadata

mode

The file mode. Currently instances of NativeFile may support:

  • rb: binary read

  • wb: binary write

  • rb+: binary read and write

read(self, nbytes=None)

Read indicated number of bytes from file, or read all remaining bytes if no argument passed

Parameters
nbytes : int, default None
Returns
data : bytes
read1(self, nbytes=None)

Read and return up to nbytes bytes.

Alias for read, needed to match the IOBase interface.

read_at(self, nbytes, offset)

Read indicated number of bytes at offset from the file

Parameters
nbytes : int
offset : int
Returns
data : bytes
read_buffer(self, nbytes=None)
readable(self)
readall(self)
readinto(self, b)

Read into the supplied buffer

Parameters
b : any Python object supporting the buffer interface
Returns
int

Number of bytes written into the buffer.

readline(self, size=None)
readlines(self, hint=None)
seek(self, int64_t position, int whence=0)

Change current file stream position

Parameters
position : int

Byte offset, interpreted relative to value of whence argument

whence : int, default 0

Point of reference for seek offset

Returns
int

The new absolute stream position.

Notes

Values of whence:

  • 0: start of stream (the default); offset should be zero or positive

  • 1: current stream position; offset may be negative

  • 2: end of stream; offset is usually negative

seekable(self)
size(self)

Return file size

tell(self)

Return current stream position

truncate(self, pos=None)
upload(self, stream, buffer_size=None)

Pipe file-like object to file

writable(self)
write(self, data)

Write bytes from any object implementing the buffer protocol (bytes, bytearray, ndarray, pyarrow.Buffer)

Parameters
data : bytes-like object or exporter of buffer protocol
Returns
int

nbytes: number of bytes written

writelines(self, lines)