Apache Arrow 24.0.0 Release


Published 21 Apr 2026
By The Apache Arrow PMC (pmc)

The Apache Arrow team is pleased to announce the 24.0.0 release. This release covers over 3 months of development work and includes 259 resolved issues on 325 distinct commits from 57 distinct contributors. See the Install Page to learn how to get the libraries for your platform.

The release notes below are not exhaustive and only expose selected highlights of the release. Many other bugfixes and improvements have been made: we refer you to the complete changelog.

Community

We recently published our Community Highlights for 2025; check them out.

Thanks everyone for your contributions and participation in the project!

Format Notes

We have written a project-wide Security Model outlining what users should expect when dealing with Arrow data, especially data coming from untrusted sources (GH-48868).

Arrow Flight RPC Notes

The ODBC driver is still a work in progress. The driver now builds on Linux, but no builds are currently distributed for any platform (GH-49463).

In C++, we have refactored serialization/deserialization to make low-level functionality accessible for advanced usage (GH-49548).

C++ Notes

In addition to the aforementioned project-wide Security Model, we have written a specific Security Model for Arrow C++ covering more concrete topics such as API usage and parameter validity GH-49274.

Compute

Extension Types

The canonical type VariableShapeTensor was finally implemented GH-38007.

Parquet

Breaking change: The Arrow extension type name for Parquet Variant columns used to be parquet.variant but has been changed to arrow.parquet.variant GH-49081.
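Code that identifies Variant columns by extension name may need to accept both spellings while older files and libraries are still in circulation. The following is a minimal sketch, assuming the name is read from the standard `ARROW:extension:name` field metadata key; the helper `is_variant_field` and the sample metadata dictionaries are hypothetical and not part of any Arrow API.

```python
# Extension type names for Parquet Variant columns, before and after GH-49081.
OLD_VARIANT_NAME = "parquet.variant"
NEW_VARIANT_NAME = "arrow.parquet.variant"

# Standard metadata key under which Arrow stores an extension type's name.
EXTENSION_NAME_KEY = "ARROW:extension:name"


def _as_text(value):
    """Return str for either str or bytes input (metadata is often bytes)."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def is_variant_field(metadata):
    """Hypothetical helper: True if the field metadata names the Variant type."""
    normalized = {_as_text(k): _as_text(v) for k, v in (metadata or {}).items()}
    name = normalized.get(EXTENSION_NAME_KEY)
    return name in (OLD_VARIANT_NAME, NEW_VARIANT_NAME)


# Illustrative metadata dictionaries (made up for this example).
print(is_variant_field({b"ARROW:extension:name": b"parquet.variant"}))
print(is_variant_field({"ARROW:extension:name": "arrow.parquet.variant"}))
```

Accepting both names is only a transition aid; new data written by 24.0.0 is expected to carry the new name.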

Parquet C++ could previously read only unencrypted bloom filters; it now supports reading encrypted bloom filters as well GH-48334. In addition, it can now write bloom filters, though only unencrypted ones GH-34785.
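As background on why this matters: a Bloom filter lets a reader rule out a row group for a given value without decoding the column, because a negative answer is always exact while a positive answer may be a false positive. The sketch below illustrates only that contract with a generic Bloom filter built on `hashlib`; it is not Parquet's split-block layout or hash function, and `TinyBloomFilter` is a hypothetical name.

```python
import hashlib


class TinyBloomFilter:
    """Generic Bloom filter for illustration (not the Parquet on-disk format)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bitset stored in a single Python integer

    def _positions(self, value):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                value, digest_size=8, salt=i.to_bytes(2, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


bloom = TinyBloomFilter()
bloom.add(b"alice")
bloom.add(b"bob")
print(bloom.might_contain(b"alice"))  # inserted values are never reported absent
```

A reader can skip a row group whenever `might_contain` returns False, and must fall back to reading the data when it returns True.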

An ambitious rewrite of the bit-unpacking utilities and optimizations has led to significant performance improvements on reading some Parquet columns, up to 50% faster in some cases GH-48277. This rewrite is described in more detail in an accompanying blog post.
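For readers unfamiliar with the term, bit-unpacking expands integers that were stored with a small fixed bit width back into ordinary integers. The sketch below shows the idea in plain Python, assuming the least-significant-bit-first layout used by Parquet's RLE/bit-packing hybrid encoding. The function name `unpack_bits` is hypothetical, and this is not the Arrow C++ code, which relies on optimized kernels rather than a loop like this.

```python
def unpack_bits(data, bit_width, count):
    """Unpack `count` unsigned integers of `bit_width` bits each from `data`.

    Values are assumed to be packed back to back, starting at the least
    significant bit of the first byte.
    """
    if bit_width <= 0:
        raise ValueError("bit_width must be positive")
    if count * bit_width > len(data) * 8:
        raise ValueError("not enough input bytes")
    mask = (1 << bit_width) - 1
    # Treat the whole buffer as one little-endian integer, then slice it.
    bits = int.from_bytes(data, "little")
    return [(bits >> (i * bit_width)) & mask for i in range(count)]


# Eight 3-bit values packed into three bytes.
packed = bytes([0x88, 0xC6, 0xFA])
print(unpack_bits(packed, bit_width=3, count=8))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

A production decoder processes many values per step and specializes on the bit width, which is where a rewrite of this kind can find its speedups.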

The performance of reading DELTA_BINARY_PACKED-encoded integers has been improved in some favorable cases GH-49266.
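As a reminder of what this encoding does, the sketch below shows its core idea only: store the first value, then the differences between neighbours, shifted by the smallest difference so that the remainders are small non-negative integers that bit-pack tightly. It is a simplified illustration that omits the real format's header, blocks, miniblocks and variable-length integers; the function names are hypothetical.

```python
def delta_encode(values):
    """Return (first_value, min_delta, adjusted_deltas) for a list of ints."""
    if not values:
        raise ValueError("values must not be empty")
    deltas = [b - a for a, b in zip(values, values[1:])]
    min_delta = min(deltas, default=0)
    # Subtracting the minimum makes every stored delta non-negative.
    adjusted = [d - min_delta for d in deltas]
    return values[0], min_delta, adjusted


def delta_decode(first_value, min_delta, adjusted):
    """Inverse of delta_encode: rebuild the original sequence."""
    out = [first_value]
    for d in adjusted:
        out.append(out[-1] + d + min_delta)
    return out


def required_bit_width(adjusted):
    """Bits needed to store the largest adjusted delta."""
    return max(adjusted, default=0).bit_length()


values = [7, 5, 3, 1, 2, 3, 4, 5]
first, min_delta, adjusted = delta_encode(values)
print(first, min_delta, adjusted)    # 7 -2 [0, 0, 0, 3, 3, 3, 3]
print(required_bit_width(adjusted))  # 2
```

Decoding is a running sum, so the sequential dependency between values is what an optimized reader has to work around.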

Miscellaneous C++ changes

We have migrated to C++20 std::span, removing our home-grown implementation in arrow::util::span GH-48588.

A bunch of previously deprecated APIs have been removed GH-49356.

Linux Packaging Notes

Added support for Ubuntu 26.04, the next LTS GH-49341.

MATLAB Notes

No major notes for this release on MATLAB.

Python Notes

Compatibility notes

  • pyarrow.gandiva is deprecated and will be removed in a future version GH-49227

New features

  • Work on type annotations is starting to be included (GH-49102 and GH-49452)
  • Basic arithmetic on arrays and scalars is now supported GH-32007
  • Options to control writing of Parquet Bloom filters are added to parquet.write_table GH-49376
  • OpenTelemetry is enabled in PyArrow wheels GH-49382
  • AzureFileSystem is now included in the Windows wheels GH-44655

Other improvements

  • Scikit-build-core is now used as the PyArrow build system GH-36411
  • UUID objects are now inferred automatically in pa.scalar() and pa.array() without the need to specify the type explicitly GH-48241
  • Constructing an extension array via pa.array() from a list of extension-type scalars is now supported GH-48470
  • There have been some improvements in the documentation (GH-49278, GH-49269 and GH-28859)
  • CSV and JSON options have improved repr/str methods GH-47389

Relevant bug fixes

  • SparseCOOTensor.__repr__ missing f-string prefix is now fixed GH-49108
  • Pickling SubTreeFileSystem(base_path, AzureFileSystem(...)) is fixed GH-49078
  • Casting from StringArray to pandas 3.* when an element is None is fixed GH-49002
  • Dictionary key order is now preserved when inferring struct type GH-40053
  • Duplicate CSV header when table batches start with an empty batch is now fixed GH-36889

R Notes

New Features

Compatibility notes

  • Arrow no longer builds with GCS enabled on CRAN to avoid failures in their build systems. If you would like a full-featured build of Arrow, we recommend installing from R-universe; see the Using cloud storage article in the docs for more information. GH-49067

Relevant bug fixes

  • to_arrow() now retains grouping GH-40640

Ruby and C GLib Notes

  • Fixed GC-related problems.
  • GArrowListArray: Added support for returning offset buffer.
  • GArrowLargeListArray: Added support for returning offset buffer.
  • GArrowUnionArray: Added support for returning fields.
  • Deprecated Feather features.

Ruby

We've added a pure Ruby Apache Arrow writer implementation to the red-arrow-format gem.

We've marked the pure Ruby Apache Arrow reader implementation in the red-arrow-format gem as stable because it passes integration tests with other implementations, although it still has some missing features.

The red-arrow gem:

  • Added support for converting the following arrays to raw Ruby objects:
    • Arrow::LargeBinaryArray
    • Arrow::LargeUTF8Array
    • Arrow::LargeListArray
    • Arrow::FixedSizeListArray
    • Arrow::DurationArray
    • Arrow::DictionaryArray with Arrow::LargeBinaryArray or Arrow::LargeUTF8Array

C GLib

No C GLib only notes.

Java, JavaScript, Go, .NET, Swift and Rust Notes

The Java, JavaScript, Go, .NET, Swift and Rust projects have moved to separate repositories outside the main Arrow monorepo.