Python#

PyArrow - Apache Arrow Python bindings#

This is the documentation of the Python API of Apache Arrow.

Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to store, process and move data fast.

See the parent documentation for additional details on the Arrow Project itself, on the Arrow format and the other language bindings.

The Arrow Python bindings (also named “PyArrow”) have first-class integration with NumPy, pandas, and built-in Python objects. They are based on the C++ implementation of Arrow.

Here we will detail the usage of the Python API for Arrow and the leaf libraries that add additional functionality such as reading Apache Parquet files into Arrow structures.