Crate arrow_pyarrow


Pass Arrow objects from and to PyArrow, using Arrow’s C Data Interface and pyo3.

For the underlying implementation, see the ffi module.

The types in this crate let you write Python functions that take and return PyArrow objects, with automatic conversion to the corresponding arrow-rs types.

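// Import paths assume the split arrow-rs crates; adjust them if you use the `arrow` umbrella crate.
use std::sync::Arc;
use arrow_array::{make_array, Array, Int32Array};
use arrow_data::ArrayData;
use arrow_pyarrow::PyArrowType;
use pyo3::{exceptions::PyValueError, prelude::*};

#[pyfunction]
fn double_array(array: PyArrowType<ArrayData>) -> PyResult<PyArrowType<ArrayData>> {
    let array = array.0; // Extract from PyArrowType wrapper
    let array: Arc<dyn Array> = make_array(array); // Convert ArrayData to ArrayRef
    // Reject anything that is not an Int32Array with a Python ValueError
    let array: &Int32Array = array.as_any().downcast_ref()
        .ok_or_else(|| PyValueError::new_err("expected int32 array"))?;
    // Double every non-null value; nulls stay null
    let array: Int32Array = array.iter().map(|x| x.map(|x| x * 2)).collect();
    Ok(PyArrowType(array.into_data()))
}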

| pyarrow type | arrow-rs type |
|---|---|
| pyarrow.DataType | DataType |
| pyarrow.Field | Field |
| pyarrow.Schema | Schema |
| pyarrow.Array | ArrayData |
| pyarrow.RecordBatch | RecordBatch |
| pyarrow.RecordBatchReader | ArrowArrayStreamReader / Box<dyn RecordBatchReader + Send> (1) |
| pyarrow.Table | Table (2) |

(1) pyarrow.RecordBatchReader can be imported as ArrowArrayStreamReader. Either ArrowArrayStreamReader or Box<dyn RecordBatchReader + Send> can be exported as pyarrow.RecordBatchReader. (Box<dyn RecordBatchReader + Send> is typically easier to create.)

(2) Although arrow-rs offers Table, a convenience wrapper for pyarrow.Table that internally holds Vec<RecordBatch>, it is meant primarily for use cases where you already have Vec<RecordBatch> on the Rust side and want to export that in bulk as a pyarrow.Table. In general, it is recommended to use streaming approaches instead of dealing with data in bulk. For example, a pyarrow.Table (or any other object that implements the ArrayStream PyCapsule interface) can be imported to Rust through PyArrowType<ArrowArrayStreamReader> instead of forcing eager reading into Vec<RecordBatch>.
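
The difference between the streaming and bulk approaches in note (2) can be sketched without any Arrow dependency. The block below is a std-only model, not arrow-rs code: `Batch`, `sum_streaming`, and `sum_bulk` are hypothetical names introduced for illustration. In real code the iterator would be the reader inside a PyArrowType<ArrowArrayStreamReader>, which yields Result<RecordBatch, ArrowError> items.

```rust
/// Hypothetical stand-in for a record batch: a single column of i32 values.
struct Batch {
    values: Vec<i32>,
}

/// Streaming: consumes batches one at a time, so only the current batch
/// is held in memory, and the first error stops the iteration.
fn sum_streaming<I, E>(batches: I) -> Result<i64, E>
where
    I: IntoIterator<Item = Result<Batch, E>>,
{
    let mut total = 0i64;
    for batch in batches {
        let batch = batch?; // propagate the first error to the caller
        total += batch.values.iter().map(|&v| i64::from(v)).sum::<i64>();
    }
    Ok(total)
}

/// Bulk: materializes every batch into a Vec before doing any work,
/// which is what eager reading into Vec<RecordBatch> amounts to.
fn sum_bulk<I, E>(batches: I) -> Result<i64, E>
where
    I: IntoIterator<Item = Result<Batch, E>>,
{
    let all: Vec<Batch> = batches.into_iter().collect::<Result<_, _>>()?;
    Ok(all
        .iter()
        .flat_map(|b| b.values.iter())
        .map(|&v| i64::from(v))
        .sum())
}
```

Both functions return the same result; the streaming version simply avoids holding all batches at once, which matters when the source is larger than memory or produced lazily.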

Structs

ArrowException
A Rust type representing an exception defined in Python code.
PyArrowType
A newtype wrapper for types implementing FromPyArrow or IntoPyArrow.
Table
This is a convenience wrapper around Vec<RecordBatch> that tries to simplify conversion from and to pyarrow.Table.

Traits

FromPyArrow
Trait for converting Python objects to arrow-rs types.
IntoPyArrow
Convert an arrow-rs type into a PyArrow object.
ToPyArrow
Create a new PyArrow object from an arrow-rs type.

Functions

to_py_err 🔒
validate_class 🔒
validate_pycapsule 🔒

Type Aliases

PyArrowException
Represents an exception raised by PyArrow.