Installing PyArrow#
System Compatibility#
PyArrow is regularly built and tested on Windows, macOS and various Linux distributions. We strongly recommend using a 64-bit system.
Python Compatibility#
PyArrow is currently compatible with Python 3.8, 3.9, 3.10 and 3.11.
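Before installing, it can help to confirm that the running interpreter is one of the versions listed above. The following is a minimal sketch: `SUPPORTED_MINORS` and `is_supported_python` are illustrative names, not part of PyArrow, and the version list is copied from the sentence above rather than queried from the package.

```python
import sys

# Minor versions of Python 3 listed as compatible on this page
# (assumption: copied from the text, not read from PyArrow itself).
SUPPORTED_MINORS = (8, 9, 10, 11)


def is_supported_python(version_info=None):
    """Return True if the (major, minor, ...) version is in the list above."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    # Reports whether the current interpreter is in the listed range.
    print(is_supported_python())
```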
Using Conda#
Install the latest version of PyArrow from conda-forge using Conda:
conda install -c conda-forge pyarrow
Using Pip#
Install the latest version from PyPI (Windows, Linux, and macOS):
pip install pyarrow
If you encounter import errors with the pip wheels on Windows, you may need to install the Visual C++ Redistributable for Visual Studio 2015.
Warning
On Linux, you will need pip >= 19.0 to detect the prebuilt binary packages.
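The pip requirement above can be checked from Python. This is a rough sketch using only the standard library: the helper names are hypothetical, and the version parsing is deliberately simple (it reads only the leading digits of the major and minor parts).

```python
from importlib import metadata


def _leading_int(text):
    """Return the integer formed by the leading digits of text, or 0."""
    digits = ""
    for ch in text:
        if not ch.isdigit():
            break
        digits += ch
    return int(digits) if digits else 0


def pip_version_tuple(version_string):
    """Turn a version such as '23.1.2' or '20.0b1' into (major, minor)."""
    parts = version_string.split(".")
    major = _leading_int(parts[0])
    minor = _leading_int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def pip_can_find_linux_wheels(version_string):
    """True if the pip version meets the >= 19.0 requirement stated above."""
    return pip_version_tuple(version_string) >= (19, 0)


def installed_pip_version():
    """Return the installed pip version string, or None if pip is absent."""
    try:
        return metadata.version("pip")
    except metadata.PackageNotFoundError:
        return None
```

If `installed_pip_version()` returns an older version, upgrading pip before running `pip install pyarrow` should let it pick up the prebuilt packages.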
Installing nightly packages or from source#
See Python Development.
Dependencies#
Required dependency
NumPy 1.16.6 or higher.
Optional dependencies
pandas 1.0 or higher,
cffi.
PyArrow is also compatible with fsspec and, for timezone support, with the pytz, dateutil or tzdata packages.
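To see which of the optional and additional packages listed above are importable in the current environment, a small standard-library check along these lines may be useful. The `available_packages` helper is a hypothetical name, and the module names are assumed to match the import names of the packages mentioned in the text.

```python
from importlib.util import find_spec

# Import names of the optional and additional packages mentioned above
# (assumption: each package is imported under this top-level name).
OPTIONAL_MODULES = ["pandas", "cffi", "fsspec", "pytz", "dateutil", "tzdata"]


def available_packages(names=None):
    """Map each top-level module name to True if it can be found, else False."""
    if names is None:
        names = OPTIONAL_MODULES
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in available_packages().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

This only reports whether a module can be located; it does not check version requirements such as the pandas minimum.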
tzdata on Windows#
While Arrow uses the OS-provided timezone database on Linux and macOS, it requires a user-provided database on Windows. To download and extract the text version of the IANA timezone database, follow the instructions in the C++ Runtime Dependencies, or use the PyArrow utility function pyarrow.util.download_tzdata_on_windows(), which does the same.
By default, the timezone database will be detected at %USERPROFILE%\Downloads\tzdata.
If the database has been downloaded to a different location, you will need to set a custom path to the database from Python:
>>> import pyarrow as pa
>>> pa.set_timezone_db_path("custom_path")
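The Windows steps described in this section could be combined roughly as follows. This is a sketch under stated assumptions: `default_tzdata_dir` and `ensure_tzdata` are hypothetical helpers, the "download only if the folder is missing" logic is an illustrative choice, and the only PyArrow calls used are the two named in the text.

```python
import ntpath
import os
import sys


def default_tzdata_dir(environ=None):
    """Build the default database location named in the text above."""
    if environ is None:
        environ = os.environ
    # ntpath produces Windows-style paths even when run on another OS.
    return ntpath.join(environ["USERPROFILE"], "Downloads", "tzdata")


def ensure_tzdata(custom_path=None):
    """Hypothetical helper: make a timezone database available on Windows."""
    if sys.platform != "win32":
        # Linux and macOS use the OS-provided database; nothing to do.
        return None
    import pyarrow as pa
    import pyarrow.util

    if custom_path is not None:
        # The database lives somewhere else: tell Arrow where to look.
        pa.set_timezone_db_path(custom_path)
        return custom_path
    target = default_tzdata_dir()
    if not os.path.isdir(target):
        # Assumption: a missing folder means the database was never downloaded.
        pa.util.download_tzdata_on_windows()
    return target
```

Calling `ensure_tzdata()` once at startup, before any timezone-aware operation, is one possible way to use it.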