Apache Arrow 0.9.0 Release
22 Mar 2018
By Wes McKinney (wesm)
The Apache Arrow team is pleased to announce the 0.9.0 release. It is the product of over 3 months of development and includes 260 resolved JIRAs.
While we made some backwards-incompatible columnar binary format changes in last December’s 0.8.0 release, the 0.9.0 release is backwards-compatible with 0.8.0. We will be working toward a 1.0.0 release this year, which will mark longer-term binary stability for the Arrow columnar format and metadata.
We discuss some highlights from the release and other project news in this post. Overall, this release focused more on bug fixes, compatibility, and stability than previous releases, which pushed more on new and expanded features.
New Arrow committers and PMC members
Since the last release, we have added 2 new Arrow committers: Brian Hulette and Robert Nishihara. Additionally, Phillip Cloud and Philipp Moritz have been promoted from committer to PMC member. Congratulations and thank you for your contributions!
Plasma Object Store Improvements
The Plasma Object Store now supports managing interprocess shared memory on CUDA-enabled GPUs. We are excited to see more GPU-related functionality develop in Apache Arrow, as this has become a key computing environment for scalable machine learning.
Antoine Pitrou has joined the Python development efforts and helped significantly this release with interoperability with built-in CPython data structures and NumPy structured data types.
- New experimental support for reading Apache ORC files
- pyarrow.array now accepts lists of tuples or Python dicts for creating Arrow struct type arrays.
- NumPy structured dtypes (which are row/record-oriented) can be directly converted to Arrow struct (column-oriented) arrays
- Python 3.6 pathlib objects for file paths are now accepted in many file APIs, including for Parquet files
- Arrow integer arrays with nulls can now be converted to NumPy object arrays
- pyarrow.foreign_buffer API for interacting with memory blocks located at particular memory addresses
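To make the row-oriented versus column-oriented distinction in the struct items above concrete, here is a minimal sketch in plain Python. It does not call pyarrow and is not how Arrow implements the conversion. The helper name rows_to_columns and the field names x and y are hypothetical, chosen only for illustration. It shows the general idea: a list of records (dicts, with None for a missing record) is transposed into one list per field plus a validity list.

```python
from typing import Any, Dict, List, Optional, Tuple


def rows_to_columns(
    rows: List[Optional[Dict[str, Any]]],
    fields: List[str],
) -> Tuple[List[bool], Dict[str, List[Any]]]:
    """Transpose record-oriented rows into per-field columns.

    Hypothetical helper for illustration only; not part of pyarrow.
    Returns a validity list (False where the whole record is missing)
    and a dict mapping each field name to its column of values.
    """
    validity = [row is not None for row in rows]
    columns: Dict[str, List[Any]] = {name: [] for name in fields}
    for row in rows:
        for name in fields:
            # A missing record contributes a None to every child column.
            columns[name].append(None if row is None else row.get(name))
    return validity, columns


# Row-oriented input: one dict per record, None for a null record.
rows = [{"x": 1, "y": "a"}, None, {"x": 3, "y": "c"}]
validity, columns = rows_to_columns(rows, ["x", "y"])
print(validity)
print(columns)
```

With pyarrow installed, a list of dicts like rows is the kind of input that, per the list above, pyarrow.array accepts for creating a struct type array.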
Java now fully supports the FixedSizeBinary data type.
making separate Apache releases (most recently
which are being published to NPM.
In the coming months, we will be working to move Apache Arrow closer to a 1.0.0 release. We will also be discussing plans to develop native Arrow-based computational libraries within the project.