PyArrow supports conversion in both directions between NumPy arrays and Arrow Arrays.
NumPy to Arrow
To convert a NumPy array to Arrow, call the pyarrow.array() factory function:
>>> import numpy as np
>>> import pyarrow as pa
>>> data = np.arange(10, dtype='int16')
>>> arr = pa.array(data)
>>> arr
<pyarrow.lib.Int16Array object at 0x7fb1d1e6ae58>
[
  0,
  1,
  2,
  3,
  4,
  5,
  6,
  7,
  8,
  9
]
Converting from NumPy supports a wide range of input dtypes, including structured dtypes or strings.
Arrow to NumPy
In the reverse direction, it is possible to produce a view of an Arrow Array
for use with NumPy using the to_numpy() method.
This is limited to primitive types for which NumPy has the same physical
representation as Arrow, and assuming the Arrow data has no nulls.
>>> import numpy as np
>>> import pyarrow as pa
>>> arr = pa.array([4, 5, 6], type=pa.int32())
>>> view = arr.to_numpy()
>>> view
array([4, 5, 6], dtype=int32)
For more complex data types, you have to use the to_pandas()
method (which will construct a NumPy array with pandas semantics for, e.g.,
representation of null values).