Apache Arrow 0.12.0 Release

Published 21 Jan 2019
By Wes McKinney

The Apache Arrow team is pleased to announce the 0.12.0 release. This is the largest release yet in the project, covering 3 months of development work and including 614 resolved issues from 77 distinct contributors.

See the Install Page to learn how to get the libraries for your platform. The complete changelog is also available.

It’s a huge release, but we’ll give some brief highlights and news from the project to help guide you to the parts that may be of interest.

New committers and PMC member

The Arrow team is growing! Since the 0.11.0 release we have added 3 new committers:

We are also pleased to announce that Krisztián Szűcs has been promoted from committer to PMC (Project Management Committee) member.

Thank you for all your contributions!

Code donations

Since the last release, we have received 3 code donations into the Apache project.

We are excited to continue to grow the Apache Arrow development community.

Combined project-level documentation

Since the last release, we have merged the Python and C++ documentation to create a combined project-wide documentation site: https://arrow.apache.org/docs. There is now some prose documentation about many parts of the C++ library. We intend to keep adding documentation for other parts of Apache Arrow to this site.

Packages

We have started providing official APT and Yum repositories for the C++ and GLib (C) libraries. See the install document for details.

C++ notes

Much of the C++ development work over the last 3 months concerned internal code refactoring and performance improvements. Some user-visible highlights of note:

Since the LLVM-based Gandiva expression compiler was donated to Apache Arrow during the last release cycle, development there has been moving along. We expect to have Windows support for Gandiva and to ship this in downstream packages (like Python) in the 0.13 release time frame.

Go notes

The Arrow Go development team has been expanding. The Go library has gained support for many missing features from the columnar format as well as semantic constructs like chunked arrays and tables that are used heavily in the C++ project.
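For readers unfamiliar with the "chunked array" and "table" constructs mentioned above, the idea can be sketched in a few lines of plain Python. This is a toy model only: the class names `ChunkedArray` and `Table`, the `row` helper, and the sample data are hypothetical illustrations, not the API of the Go, C++, or Python Arrow libraries. A chunked array presents several contiguous chunks as one logical column, and a table groups equal-length chunked columns under names.

```python
from bisect import bisect_right
from itertools import accumulate


class ChunkedArray:
    """Toy model (not the Arrow API): one logical column stored as chunks."""

    def __init__(self, chunks):
        # Drop empty chunks so every stored chunk holds at least one value.
        self.chunks = [list(c) for c in chunks if len(c) > 0]
        # offsets[k] is the logical index one past the end of chunk k.
        self.offsets = list(accumulate(len(c) for c in self.chunks))

    def __len__(self):
        return self.offsets[-1] if self.offsets else 0

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Locate the first chunk whose end offset is greater than i.
        k = bisect_right(self.offsets, i)
        start = self.offsets[k - 1] if k > 0 else 0
        return self.chunks[k][i - start]


class Table:
    """Toy model (not the Arrow API): named, equal-length chunked columns."""

    def __init__(self, columns):
        lengths = {len(col) for col in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self.columns = dict(columns)
        self.num_rows = lengths.pop() if lengths else 0

    def row(self, i):
        # Columns may be chunked differently; logical indexing hides that.
        return {name: col[i] for name, col in self.columns.items()}


# Hypothetical sample data: two columns with different chunk layouts.
ids = ChunkedArray([[1, 2, 3], [4, 5]])
names = ChunkedArray([["a"], ["b", "c", "d", "e"]])
table = Table({"id": ids, "name": names})
print(table.row(3))
```

The point of the design is that data arriving in batches can be appended as new chunks without copying existing ones, while consumers still see a single column of a known length.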

GLib and Ruby notes

Development of the GLib-based C bindings and corresponding Ruby interfaces has advanced in lock-step with the C++, Python, and R libraries. In this release, there are many new features in C and Ruby:

Python notes

We fixed a ton of bugs and made many improvements throughout the Python project. Some highlights from the Python side include:

R notes

The R library made huge progress in 0.12, with work led by new committer Romain Francois. The R project’s features are not far behind the Python library, and we are hoping to be able to make the R library available to CRAN users for use with Apache Spark or for reading and writing Parquet files over the next quarter.

Users of the feather R library will see significant speed increases in many cases when reading Feather files with the new Arrow R library.

Rust notes

Rust development had an active last 3 months; see the changelog for details.

A native Rust Parquet implementation was just donated to the project, and the community intends to provide a similar level of functionality for reading and writing Parquet files using the Arrow in-memory columnar format as an intermediary.

Upcoming Roadmap, Outlook for 2019

Apache Arrow has become a large, diverse open source project. It is now being used in dozens of downstream open source and commercial projects. Work will be proceeding in many areas in 2019:

It promises to be an exciting 2019. We look forward to having you involved in the development community.