Pass Arrow objects from and to PyArrow, using Arrow’s C Data Interface and pyo3.
For the underlying implementation, see the [ffi] module.
One can use these to write Python functions that take and return PyArrow objects, with automatic conversion to corresponding arrow-rs types.
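```rust
use std::sync::Arc;

use arrow::array::{make_array, Array, ArrayData, Int32Array};
use arrow::pyarrow::PyArrowType;
use pyo3::exceptions::PyValueError;
use pyo3::prelude::*;

#[pyfunction]
fn double_array(array: PyArrowType<ArrayData>) -> PyResult<PyArrowType<ArrayData>> {
    let array = array.0; // Extract from PyArrowType wrapper
    let array: Arc<dyn Array> = make_array(array); // Convert ArrayData to ArrayRef
    // Fail with a Python ValueError if the input is not an int32 array
    let array: &Int32Array = array.as_any().downcast_ref()
        .ok_or_else(|| PyValueError::new_err("expected int32 array"))?;
    // Double every non-null value; nulls stay null
    let array: Int32Array = array.iter().map(|x| x.map(|x| x * 2)).collect();
    Ok(PyArrowType(array.into_data()))
}
```

| pyarrow type | arrow-rs type |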
|---|---|
| pyarrow.DataType | [DataType] |
| pyarrow.Field | [Field] |
| pyarrow.Schema | [Schema] |
| pyarrow.Array | [ArrayData] |
| pyarrow.RecordBatch | [RecordBatch] |
| pyarrow.RecordBatchReader | [ArrowArrayStreamReader] / Box<dyn RecordBatchReader + Send> (1) |
| pyarrow.Table | Table (2) |

(1) pyarrow.RecordBatchReader can be imported as [ArrowArrayStreamReader]. Either
[ArrowArrayStreamReader] or Box<dyn RecordBatchReader + Send> can be exported
as pyarrow.RecordBatchReader. (Box<dyn RecordBatchReader + Send> is typically
easier to create.)
(2) Although arrow-rs offers Table, a convenience wrapper for pyarrow.Table
that internally holds Vec<RecordBatch>, it is meant primarily for use cases where you already
have Vec<RecordBatch> on the Rust side and want to export that in bulk as a pyarrow.Table.
In general, it is recommended to use streaming approaches instead of dealing with data in bulk.
For example, a pyarrow.Table (or any other object that implements the ArrayStream PyCapsule
interface) can be imported to Rust through PyArrowType<ArrowArrayStreamReader> instead of
forcing eager reading into Vec<RecordBatch>.
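
The difference between streaming and bulk handling in note (2) can be illustrated with a minimal, standard-library-only sketch. It does not use arrow-rs or pyo3: `Batch` and `BatchSource` are hypothetical stand-ins for a record batch and a batch reader, and `rows_until` and `rows_bulk` are illustrative helper names. The point is only that a streaming consumer can stop early, while a bulk consumer must produce every batch first.

```rust
// Hypothetical stand-in for a record batch; not the arrow-rs RecordBatch type.
struct Batch {
    rows: Vec<i32>,
}

// Hypothetical batch reader that records how many batches it has produced.
struct BatchSource {
    next: usize,
    total: usize,
    produced: usize,
}

impl Iterator for BatchSource {
    type Item = Batch;

    fn next(&mut self) -> Option<Batch> {
        if self.next >= self.total {
            return None;
        }
        let start = (self.next * 3) as i32;
        self.next += 1;
        self.produced += 1;
        // Every batch holds three rows.
        Some(Batch { rows: vec![start, start + 1, start + 2] })
    }
}

// Streaming: pull one batch at a time and stop once `limit` rows were seen.
fn rows_until(source: &mut impl Iterator<Item = Batch>, limit: usize) -> usize {
    let mut seen = 0;
    while let Some(batch) = source.next() {
        seen += batch.rows.len();
        if seen >= limit {
            break;
        }
    }
    seen
}

// Bulk: materialize every batch into a Vec before counting rows.
fn rows_bulk(source: &mut impl Iterator<Item = Batch>) -> usize {
    let all: Vec<Batch> = source.collect();
    all.iter().map(|b| b.rows.len()).sum()
}

fn main() {
    let mut streaming = BatchSource { next: 0, total: 100, produced: 0 };
    let seen = rows_until(&mut streaming, 5);
    println!("streaming: {} rows from {} batches", seen, streaming.produced);

    let mut bulk = BatchSource { next: 0, total: 100, produced: 0 };
    let total = rows_bulk(&mut bulk);
    println!("bulk: {} rows from {} batches", total, bulk.produced);
}
```

In this sketch the streaming path touches only two of the hundred batches, whereas the bulk path holds all of them in memory at once.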
Structs§
- ArrowException: A Rust type representing an exception defined in Python code.
- PyArrowType: A newtype wrapper for types implementing FromPyArrow or IntoPyArrow.
- Table: This is a convenience wrapper around Vec<RecordBatch> that tries to simplify conversion from and to pyarrow.Table.
Traits§
- FromPyArrow: Trait for converting Python objects to arrow-rs types.
- IntoPyArrow: Convert an arrow-rs type into a PyArrow object.
- ToPyArrow: Create a new PyArrow object from an arrow-rs type.
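
The relationship between a newtype wrapper and a pair of conversion traits can be sketched without arrow-rs or pyo3. The following is a minimal illustration of the general pattern only, not the actual definitions in this module: `Foreign`, `FromForeign`, `IntoForeign`, `Wrapper`, `extract`, and `export` are all hypothetical names chosen for the example.

```rust
// Hypothetical stand-in for a value owned by a foreign runtime.
#[derive(Debug, Clone, PartialEq)]
enum Foreign {
    Int(i64),
    Text(String),
}

// Hypothetical conversion traits; they only play a role similar to
// "convert from a foreign object" and "convert into a foreign object".
trait FromForeign: Sized {
    fn from_foreign(value: &Foreign) -> Result<Self, String>;
}

trait IntoForeign {
    fn into_foreign(self) -> Foreign;
}

impl FromForeign for i64 {
    fn from_foreign(value: &Foreign) -> Result<Self, String> {
        match value {
            Foreign::Int(v) => Ok(*v),
            other => Err(format!("expected Int, got {:?}", other)),
        }
    }
}

impl IntoForeign for i64 {
    fn into_foreign(self) -> Foreign {
        Foreign::Int(self)
    }
}

// Newtype wrapper: one generic type carries any T implementing the traits.
struct Wrapper<T>(pub T);

impl<T: FromForeign> Wrapper<T> {
    // Build the wrapper from a foreign value, failing on a type mismatch.
    fn extract(value: &Foreign) -> Result<Self, String> {
        T::from_foreign(value).map(Wrapper)
    }
}

impl<T: IntoForeign> Wrapper<T> {
    // Hand the wrapped value back to the foreign side.
    fn export(self) -> Foreign {
        self.0.into_foreign()
    }
}

// A function written purely in terms of the wrapper, using `.0` to unwrap.
fn double(input: Wrapper<i64>) -> Wrapper<i64> {
    Wrapper(input.0 * 2)
}

fn main() {
    let input = Wrapper::<i64>::extract(&Foreign::Int(21)).unwrap();
    println!("{:?}", double(input).export());

    let mismatch = Wrapper::<i64>::extract(&Foreign::Text("abc".to_string()));
    println!("mismatch is error: {}", mismatch.is_err());
}
```

As in the `double_array` example earlier, the wrapped value is reached through the `.0` field, and the function signature mentions only the wrapper type.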
Functions§
Type Aliases§
- PyArrowException: Represents an exception raised by PyArrow.