pyarrow.PythonFile#
- class pyarrow.PythonFile#
- Bases: NativeFile
- A stream backed by a Python file object.
- This class allows using Python file objects with arbitrary Arrow functions, including functions written in languages other than Python.
- As a downside, there is a non-zero redirection cost in translating Arrow stream calls to Python method calls. Furthermore, Python’s Global Interpreter Lock may limit parallelism in some situations.
- Examples
>>> import io
>>> import pyarrow as pa
>>> pa.PythonFile(io.BytesIO())
<pyarrow.PythonFile closed=False own_file=False is_seekable=False is_writable=True is_readable=False>
- Create a stream for writing:
>>> buf = io.BytesIO()
>>> f = pa.PythonFile(buf, mode='w')
>>> f.writable()
True
>>> f.write(b'PythonFile')
10
>>> buf.getvalue()
b'PythonFile'
>>> f.close()
>>> f
<pyarrow.PythonFile closed=True own_file=False is_seekable=False is_writable=True is_readable=False>
- Create a stream for reading:
>>> buf = io.BytesIO(b'PythonFile')
>>> f = pa.PythonFile(buf, mode='r')
>>> f.mode
'rb'
>>> f.read()
b'PythonFile'
>>> f
<pyarrow.PythonFile closed=False own_file=False is_seekable=True is_writable=False is_readable=True>
>>> f.close()
>>> f
<pyarrow.PythonFile closed=True own_file=False is_seekable=True is_writable=False is_readable=True>
 - __init__(*args, **kwargs)#
 - Methods
- __init__(*args, **kwargs)
- close(self)
- download(self, stream_or_path[, buffer_size]): Read this file completely to a local path or destination stream.
- fileno(self): NOT IMPLEMENTED
- flush(self): Flush the stream, if applicable.
- get_stream(self, file_offset, nbytes): Return an input stream that reads a file segment independent of the state of the file.
- isatty(self)
- metadata(self): Return file metadata
- read(self[, nbytes]): Read and return up to n bytes.
- read1(self[, nbytes]): Read and return up to n bytes.
- read_at(self, nbytes, offset): Read indicated number of bytes at offset from the file
- read_buffer(self[, nbytes]): Read from buffer.
- readable(self)
- readall(self)
- readinto(self, b): Read into the supplied buffer
- readline(self[, size]): Read and return a line of bytes from the file.
- readlines(self[, hint]): Read lines of the file.
- seek(self, int64_t position, int whence=0): Change current file stream position
- seekable(self)
- size(self): Return file size
- tell(self): Return current stream position
- truncate(self[, pos])
- upload(self, stream[, buffer_size]): Write from a source stream to this file.
- writable(self)
- write(self, data): Write data to the file.
- writelines(self, lines): Write lines to the file.
 - Attributes
- closed
- mode
 - close(self)#
 - closed#
 - download(self, stream_or_path, buffer_size=None)#
- Read this file completely to a local path or destination stream. - This method first seeks to the beginning of the file. 
 - fileno(self)#
- NOT IMPLEMENTED 
 - flush(self)#
- Flush the stream, if applicable. - An error is raised if stream is not writable. 
 - get_stream(self, file_offset, nbytes)#
- Return an input stream that reads a file segment independent of the state of the file.
- Allows reading portions of a random access file as input streams without the streams interfering with each other.
- Returns:
- stream (NativeFile)
 
 - isatty(self)#
 - metadata(self)#
- Return file metadata 
 - mode#
- The file mode. Currently, instances of NativeFile may support:
- rb: binary read
- wb: binary write
- rb+: binary read and write
- ab: binary append
 
 - read(self, nbytes=None)#
- Read and return up to nbytes bytes. - If nbytes is None, the entire remaining file contents are read. 
 - read1(self, nbytes=None)#
- Read and return up to nbytes bytes. - Unlike read(), if nbytes is None then a single chunk is read, not the entire file. 
 - read_at(self, nbytes, offset)#
- Read the indicated number of bytes from the file, starting at offset. 
 - read_buffer(self, nbytes=None)#
- Read from buffer.
- Parameters:
- nbytes (int, optional): Maximum number of bytes to read.
 
 - readable(self)#
 - readall(self)#
 - readinto(self, b)#
- Read into the supplied buffer
- Parameters:
- b (buffer-like object): A writable buffer object (such as a bytearray).
- Returns:
- written (int): Number of bytes written.
 
 - readline(self, size=None)#
- Read and return a line of bytes from the file.
- If size is specified, read at most size bytes.
- Parameters:
- size (int): Maximum number of bytes read.
 
 - readlines(self, hint=None)#
- Read lines of the file.
- Parameters:
- hint (int): Maximum number of bytes read until we stop.
 
 - seek(self, int64_t position, int whence=0)#
- Change current file stream position
- Returns:
- int: The new absolute stream position.
 - Notes
- Values of whence (position is interpreted as an offset from the chosen origin):
- 0: start of stream (the default); position should be zero or positive
- 1: current stream position; position may be negative
- 2: end of stream; position is usually negative
 - seekable(self)#
 - size(self)#
- Return file size 
 - tell(self)#
- Return current stream position 
 - upload(self, stream, buffer_size=None)#
- Write from a source stream to this file.
- Parameters:
- stream (file-like object): Source stream to pipe to this file.
- buffer_size (int, optional): The buffer size to use for data transfers.
 
 
 - writable(self)#
 
 
    