pyarrow.PythonFile
- class pyarrow.PythonFile
Bases: NativeFile
A stream backed by a Python file object.
This class allows using Python file objects with arbitrary Arrow functions, including functions written in a language other than Python.
As a downside, there is a non-zero redirection cost in translating Arrow stream calls to Python method calls. Furthermore, Python’s Global Interpreter Lock may limit parallelism in some situations.
Examples
>>> import io
>>> import pyarrow as pa
>>> pa.PythonFile(io.BytesIO())
<pyarrow.PythonFile closed=False own_file=False is_seekable=False is_writable=True is_readable=False>
Create a stream for writing:
>>> buf = io.BytesIO()
>>> f = pa.PythonFile(buf, mode='w')
>>> f.writable()
True
>>> f.write(b'PythonFile')
10
>>> buf.getvalue()
b'PythonFile'
>>> f.close()
>>> f
<pyarrow.PythonFile closed=True own_file=False is_seekable=False is_writable=True is_readable=False>
Create a stream for reading:
>>> buf = io.BytesIO(b'PythonFile')
>>> f = pa.PythonFile(buf, mode='r')
>>> f.mode
'rb'
>>> f.read()
b'PythonFile'
>>> f
<pyarrow.PythonFile closed=False own_file=False is_seekable=True is_writable=False is_readable=True>
>>> f.close()
>>> f
<pyarrow.PythonFile closed=True own_file=False is_seekable=True is_writable=False is_readable=True>
- __init__(*args, **kwargs)
Methods
- __init__(*args, **kwargs)
- close(self)
- download(self, stream_or_path[, buffer_size]): Read this file completely to a local path or destination stream.
- fileno(self): NOT IMPLEMENTED
- flush(self): Flush the stream, if applicable.
- get_stream(self, file_offset, nbytes): Return an input stream that reads a file segment independent of the state of the file.
- isatty(self)
- metadata(self): Return file metadata.
- read(self[, nbytes]): Read and return up to nbytes bytes.
- read1(self[, nbytes]): Read and return up to nbytes bytes.
- read_at(self, nbytes, offset): Read the indicated number of bytes at the given offset from the file.
- read_buffer(self[, nbytes]): Read from buffer.
- readable(self)
- readall(self)
- readinto(self, b): Read into the supplied buffer.
- readline(self[, size]): Read and return a line of bytes from the file.
- readlines(self[, hint]): Read lines of the file.
- seek(self, int64_t position, int whence=0): Change current file stream position.
- seekable(self)
- size(self): Return file size.
- tell(self): Return current stream position.
- truncate(self[, pos])
- upload(self, stream[, buffer_size]): Write from a source stream to this file.
- writable(self)
- write(self, data): Write data to the file.
- writelines(self, lines): Write lines to the file.
Attributes
- closed
- mode: The file mode.
- close(self)
- closed
- download(self, stream_or_path, buffer_size=None)
Read this file completely to a local path or destination stream.
This method first seeks to the beginning of the file.
- fileno(self)
NOT IMPLEMENTED
- flush(self)
Flush the stream, if applicable.
An error is raised if the stream is not writable.
- get_stream(self, file_offset, nbytes)
Return an input stream that reads a file segment independent of the state of the file.
This allows portions of a random-access file to be read as separate input streams that do not interfere with one another.
- Parameters:
- Returns:
- stream : NativeFile
- isatty(self)
- metadata(self)
Return file metadata.
- mode
The file mode. Currently, instances of NativeFile may support:
rb: binary read
wb: binary write
rb+: binary read and write
ab: binary append
- read(self, nbytes=None)
Read and return up to nbytes bytes.
If nbytes is None, then the entire remaining file contents are read.
- read1(self, nbytes=None)
Read and return up to nbytes bytes.
Unlike read(), if nbytes is None then a chunk is read, not the entire file.
- read_at(self, nbytes, offset)
Read the indicated number of bytes at the given offset from the file.
- read_buffer(self, nbytes=None)
Read from buffer.
- Parameters:
- nbytes : int, optional
Maximum number of bytes read.
- readable(self)
- readall(self)
- readinto(self, b)
Read into the supplied buffer.
- Parameters:
- b : buffer-like object
A writable buffer object (such as a bytearray).
- Returns:
- written : int
Number of bytes written.
- readline(self, size=None)
Read and return a line of bytes from the file.
If size is specified, read at most size bytes.
- Parameters:
- size : int
Maximum number of bytes read.
- readlines(self, hint=None)
Read lines of the file.
- Parameters:
- hint : int
Maximum number of bytes read until we stop.
- seek(self, int64_t position, int whence=0)
Change current file stream position.
- Parameters:
- Returns:
- int
The new absolute stream position.
Notes
Values of whence:
* 0 – start of stream (the default); offset should be zero or positive
* 1 – current stream position; offset may be negative
* 2 – end of stream; offset is usually negative
- seekable(self)
- size(self)
Return file size.
- tell(self)
Return current stream position.
- upload(self, stream, buffer_size=None)
Write from a source stream to this file.
- Parameters:
- stream : file-like object
Source stream to pipe to this file.
- buffer_size : int, optional
The buffer size to use for data transfers.
- writable(self)