Apache Arrow 6.0.1 Release


Published 22 Nov 2021
By The Apache Arrow PMC (pmc)

The Apache Arrow team is pleased to announce the 6.0.1 release. This is mostly a bugfix release that includes 30 resolved issues from 16 distinct contributors. See the Install Page to learn how to get the libraries for your platform.

The release notes below are not exhaustive and cover only selected highlights of the release. Many other bug fixes and improvements have been made; see the complete changelog for the full list.

Community

Since the 6.0.0 release, Joris Van den Bossche has joined the Project Management Committee (PMC). Thanks for your contributions and participation in the project!

Documentation

The Python and C++ documentation now includes a version switcher for moving between documentation versions.

C++ notes

  • Joining with a list column now reports an unsupported operation error instead of crashing.
  • Dictionaries are now supported as input to hash joins.
  • Fixed potential data loss in S3 multipart uploads.

Python notes

  • The Dataset API now supports the existing_data_behavior option when writing datasets.
  • Installing pyarrow from the source distribution now works with setuptools 58.5.

R notes

  • Fixed a crash when summarizing after a filter that returns no rows.
  • Added bindings for str_count.

For more details, see the complete R changelog.

Go notes

  • The Arrow and Parquet modules are now properly combined under a top-level github.com/apache/arrow/go module, which can be installed as github.com/apache/arrow/go/v6/arrow@v6.0.1.