Installing PyArrow

System Compatibility

PyArrow is regularly built and tested on Windows, macOS and various Linux distributions. We strongly recommend using a 64-bit system.
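As a quick way to check the 64-bit recommendation, the sketch below reports the pointer width of the running Python interpreter. The helper name `pointer_bits` is illustrative only, and the result describes the interpreter build, which can be 32-bit even on a 64-bit operating system.

```python
import struct
import sys


def pointer_bits():
    """Return the pointer width, in bits, of the running interpreter."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8


print(f"{pointer_bits()}-bit Python on {sys.platform}")
```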

Python Compatibility

PyArrow is currently compatible with Python 3.8, 3.9, 3.10 and 3.11.
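To confirm that the current interpreter is one of the versions listed above, a minimal sketch might look like the following. The `SUPPORTED` set simply copies the versions from this page, and `is_supported` is a hypothetical helper, not part of PyArrow.

```python
import sys

# Versions listed on this page; update the set if the list changes.
SUPPORTED = {(3, 8), (3, 9), (3, 10), (3, 11)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair is in the supported set."""
    return (version_info[0], version_info[1]) in SUPPORTED


print(f"Python {sys.version_info[0]}.{sys.version_info[1]} supported: {is_supported()}")
```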

Using Conda

Install the latest version of PyArrow from conda-forge using Conda:

conda install -c conda-forge pyarrow

Using Pip

Install the latest version from PyPI (Windows, Linux, and macOS):

pip install pyarrow

If you encounter import errors with the pip wheels on Windows, you may need to install the Visual C++ Redistributable for Visual Studio 2015.

Warning

On Linux, you will need pip >= 19.0 to detect the prebuilt binary packages.
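One way to check the pip requirement from Python is sketched below, using the standard-library `importlib.metadata` module. The helper `pip_is_recent_enough` is hypothetical; because the threshold is 19.0, comparing the major version alone is sufficient.

```python
from importlib import metadata

MIN_PIP_MAJOR = 19  # from the warning above: pip >= 19.0


def pip_is_recent_enough(pip_version):
    """Compare only the major component of a pip version string."""
    return int(pip_version.split(".")[0]) >= MIN_PIP_MAJOR


try:
    current = metadata.version("pip")
    print(f"pip {current} recent enough: {pip_is_recent_enough(current)}")
except metadata.PackageNotFoundError:
    print("pip is not installed in this environment")
```

If the check fails, upgrading with `python -m pip install --upgrade pip` before installing PyArrow is the usual remedy.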

Installing nightly packages or from source

See Python Development.

Dependencies

Required dependency

  • NumPy 1.16.6 or higher.

Optional dependencies

  • pandas 1.0 or higher,

  • cffi.

PyArrow is also compatible with fsspec and, for timezone support, with the pytz, dateutil, or tzdata packages.
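To see which of the packages mentioned in this section are present in the current environment, a small report can be built with `importlib.metadata`. This is only a sketch: the `PACKAGES` list and `installed_versions` helper are illustrative, and the names are PyPI distribution names (dateutil is published as `python-dateutil`). It reports versions but does not compare them against the minimums above.

```python
from importlib import metadata

# PyPI distribution names; "python-dateutil" provides the dateutil module.
PACKAGES = [
    "pyarrow", "numpy", "pandas", "cffi",
    "fsspec", "pytz", "python-dateutil", "tzdata",
]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```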

tzdata on Windows

While Arrow uses the OS-provided timezone database on Linux and macOS, it requires a user-provided database on Windows. To download and extract the text version of the IANA timezone database, follow the instructions in C++ Runtime Dependencies, or use the PyArrow utility function pyarrow.util.download_tzdata_on_windows(), which does the same.

By default, the timezone database is looked up at %USERPROFILE%\Downloads\tzdata. If the database has been downloaded to a different location, you will need to set a custom path to the database from Python:

>>> import pyarrow as pa
>>> pa.set_timezone_db_path("custom_path")
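The two steps above could be combined in a small platform-aware helper, sketched here under some assumptions: `default_tzdata_path` and `ensure_tzdata` are hypothetical names, and the helper assumes that the download function places the database in the default location described above. On Linux and macOS it does nothing and does not import PyArrow.

```python
import os
import sys


def default_tzdata_path():
    """Default lookup location described above (only meaningful on Windows)."""
    return os.path.expandvars(os.path.join("%USERPROFILE%", "Downloads", "tzdata"))


def ensure_tzdata(custom_path=None):
    """On Windows, make a timezone database available; elsewhere do nothing."""
    if sys.platform != "win32":
        return None  # Linux and macOS use the OS-provided database

    # Imported lazily so the helper can be defined without PyArrow installed.
    import pyarrow as pa
    from pyarrow.util import download_tzdata_on_windows

    if custom_path is not None:
        # Database already extracted somewhere else: point Arrow at it.
        pa.set_timezone_db_path(custom_path)
        return custom_path

    path = default_tzdata_path()
    if not os.path.isdir(path):
        # Assumption: the download lands in the default location.
        download_tzdata_on_windows()
    return path


print(ensure_tzdata())
```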