pyarrow.PythonFile

class pyarrow.PythonFile

Bases: pyarrow.lib.NativeFile

A stream backed by a Python file object.

This class allows Python file objects to be used with arbitrary Arrow functions, including functions written in a language other than Python.

As a downside, there is a non-zero redirection cost in translating Arrow stream calls to Python method calls. Furthermore, Python’s Global Interpreter Lock may limit parallelism in some situations.

__init__()

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__()

Initialize self.

close(self)

download(self, stream_or_path[, buffer_size])

Read file completely to local path (rather than reading completely into memory).

fileno(self)

NOT IMPLEMENTED

flush(self)

Flush the stream, if applicable.

isatty(self)

metadata(self)

Return file metadata

read(self[, nbytes])

Read indicated number of bytes from file, or read all remaining bytes if no argument passed

read1(self[, nbytes])

Read and return up to n bytes.

read_at(self, nbytes, offset)

Read indicated number of bytes at offset from the file

read_buffer(self[, nbytes])

readable(self)

readall(self)

readinto(self, b)

Read into the supplied buffer

readline(self[, size])

readlines(self[, hint])

seek(self, int64_t position, int whence=0)

Change current file stream position

seekable(self)

size(self)

Return file size

tell(self)

Return current stream position

truncate(self[, pos])

upload(self, stream[, buffer_size])

Pipe file-like object to file

writable(self)

write(self, data)

Write bytes from any object implementing the buffer protocol (bytes, bytearray, ndarray, pyarrow.Buffer)

writelines(self, lines)

Attributes

closed

mode

The file mode.

close(self)
closed
download(self, stream_or_path, buffer_size=None)

Read file completely to local path (rather than reading completely into memory). First seeks to the beginning of the file.

fileno(self)

NOT IMPLEMENTED

flush(self)

Flush the stream, if applicable.

An error is raised if stream is not writable.

isatty(self)
metadata(self)

Return file metadata

mode

The file mode. Currently instances of NativeFile may support:

  • rb: binary read

  • wb: binary write

  • rb+: binary read and write

read(self, nbytes=None)

Read indicated number of bytes from file, or read all remaining bytes if no argument passed

Parameters

nbytes (int, default None) –

Returns

data (bytes)

read1(self, nbytes=None)

Read and return up to nbytes bytes.

Alias for read, needed to match the IOBase interface.

read_at(self, nbytes, offset)

Read indicated number of bytes at offset from the file

Parameters
  • nbytes (int) –

  • offset (int) –

Returns

data (bytes)

read_buffer(self, nbytes=None)
readable(self)
readall(self)
readinto(self, b)

Read into the supplied buffer

Parameters

b (any python object supporting buffer interface) –

Returns

number of bytes written into b

readline(self, size=None)
readlines(self, hint=None)
seek(self, int64_t position, int whence=0)

Change current file stream position

Parameters
  • position (int) – Byte offset, interpreted relative to value of whence argument

  • whence (int, default 0) – Point of reference for seek offset

Notes

Values of whence:

  • 0 – start of stream (the default); offset should be zero or positive

  • 1 – current stream position; offset may be negative

  • 2 – end of stream; offset is usually negative

Returns

new_position (the new absolute stream position)

seekable(self)
size(self)

Return file size

tell(self)

Return current stream position

truncate(self, pos=None)
upload(self, stream, buffer_size=None)

Pipe file-like object to file

writable(self)
write(self, data)

Write bytes from any object implementing the buffer protocol (bytes, bytearray, ndarray, pyarrow.Buffer)

Parameters

data (bytes-like object or exporter of buffer protocol) –

Returns

nbytes (number of bytes written)

writelines(self, lines)