Expand description
Pass Arrow objects from and to PyArrow, using Arrow’s C Data Interface and pyo3.
For underlying implementation, see the [ffi] module.
One can use these to write Python functions that take and return PyArrow objects, with automatic conversion to corresponding arrow-rs types.
#[pyfunction]
fn double_array(array: PyArrowType<ArrayData>) -> PyResult<PyArrowType<ArrayData>> {
let array = array.0; // Extract from PyArrowType wrapper
let array: Arc<dyn Array> = make_array(array); // Convert ArrayData to ArrayRef
let array: &Int32Array = array.as_any().downcast_ref()
.ok_or_else(|| PyValueError::new_err("expected int32 array"))?;
let array: Int32Array = array.iter().map(|x| x.map(|x| x * 2)).collect();
Ok(PyArrowType(array.into_data()))
}
pyarrow type | arrow-rs type |
---|---|
pyarrow.DataType | [DataType] |
pyarrow.Field | [Field] |
pyarrow.Schema | [Schema] |
pyarrow.Array | [ArrayData] |
pyarrow.RecordBatch | [RecordBatch] |
pyarrow.RecordBatchReader | [ArrowArrayStreamReader] / Box<dyn RecordBatchReader + Send> (1) |
(1) pyarrow.RecordBatchReader
can be imported as [ArrowArrayStreamReader]. Either
[ArrowArrayStreamReader] or Box<dyn RecordBatchReader + Send>
can be exported
as pyarrow.RecordBatchReader
. (Box<dyn RecordBatchReader + Send>
is typically
easier to create.)
PyArrow has the notion of chunked arrays and tables, but arrow-rs doesn’t
have these same concepts. A chunked table is instead represented with
Vec<RecordBatch>
. A pyarrow.Table
can be imported to Rust by calling
pyarrow.Table.to_reader()
and then importing the reader as a [ArrowArrayStreamReader].
Structs§
- Arrow
Exception - A Rust type representing an exception defined in Python code.
- PyArrow
Type - A newtype wrapper for types implementing
FromPyArrow
orIntoPyArrow
.
Traits§
- From
PyArrow - Trait for converting Python objects to arrow-rs types.
- Into
PyArrow - Convert an arrow-rs type into a PyArrow object.
- ToPy
Arrow - Create a new PyArrow object from a arrow-rs type.
Functions§
Type Aliases§
- PyArrow
Exception - Represents an exception raised by PyArrow.