Apache Arrow 23.0.0 Release

Published 18 Jan 2026
By The Apache Arrow PMC (pmc)

The Apache Arrow team is pleased to announce the 23.0.0 release. This release covers over 3 months of development work and includes 336 resolved issues on 417 distinct commits from 71 distinct contributors. See the Install Page to learn how to get the libraries for your platform.

The release notes below are not exhaustive and only expose selected highlights of the release. Many other bugfixes and improvements have been made: we refer you to the complete changelog.

Community

As per our newly started tradition of rotating the PMC chair once a year, Antoine Pitrou was elected as the new PMC chair and VP, succeeding Neal Richardson.

Thanks for your contributions and participation in the project!

Arrow Flight RPC Notes

An ODBC driver for Apache Arrow Flight SQL has been completed. Currently it is not packaged for release, but can be built from source.

C++ Notes

The C++ standard has been updated to C++20 GH-45885 and the minimum GCC version has been updated to 8.

Some improvements have been made to leverage C++20 GH-48592.

Compute

Graceful error handling for decimal binary arithmetic and comparison instead of firing confusing assertions. GH-35957
Fixed an issue where the MinMax kernel was emitting -inf/inf for all-NaN input. GH-46063
Avoid ZeroCopyCastExec when casting between Binary offset types to avoid high overheads. GH-43660
Enhanced type checking for hash join residual filter in Acero. GH-48268
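
To illustrate the MinMax item above: a min/max fold that starts from +inf/-inf sentinels and relies on ordinary comparisons never updates them for NaN input, because every comparison with NaN is false. The sketch below is plain Python, not the Arrow kernel. The function names are hypothetical, and returning NaN for all-NaN input is an assumption about the intended behavior.

```python
import math


def naive_min_max(values):
    """Sentinel-based fold: comparisons with NaN are always False."""
    lo, hi = math.inf, -math.inf
    for v in values:
        if v < lo:
            lo = v
        if v > hi:
            hi = v
    return lo, hi


def nan_aware_min_max(values):
    """Skip NaNs, but report NaN when the input holds nothing else.

    Assumption: NaN/NaN is the desired result for all-NaN input.
    Empty input is not handled specially and keeps the sentinels.
    """
    lo, hi = math.inf, -math.inf
    saw_number = False
    saw_nan = False
    for v in values:
        if math.isnan(v):
            saw_nan = True
            continue
        saw_number = True
        lo = min(lo, v)
        hi = max(hi, v)
    if saw_nan and not saw_number:
        return math.nan, math.nan
    return lo, hi


all_nan = [math.nan, math.nan]
print(naive_min_max(all_nan))      # sentinels leak through
print(nan_aware_min_max(all_nan))  # NaN for both min and max
```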

Format

Clarified that empty compressed buffers can omit the length header. GH-47918
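
As a sketch of what this clarification means for a reader: in the IPC format, a compressed buffer normally starts with an 8-byte little-endian signed integer giving the uncompressed size, with -1 marking a body that is stored uncompressed. The hypothetical decode_buffer helper below treats a zero-byte buffer as empty instead of failing on the missing header. zlib is used here only as a stand-in so the example runs with the standard library; the real codecs are LZ4 frame and ZSTD.

```python
import struct
import zlib


def encode_buffer(data: bytes, compress=zlib.compress) -> bytes:
    """Prefix the compressed body with its uncompressed length (int64, LE)."""
    return struct.pack("<q", len(data)) + compress(data)


def decode_buffer(raw: bytes, decompress=zlib.decompress) -> bytes:
    """Decode one buffer; hypothetical helper, not an Arrow API."""
    if len(raw) == 0:
        # Empty buffer: the length header may be omitted entirely.
        return b""
    if len(raw) < 8:
        raise ValueError("buffer too short to hold the length header")
    (uncompressed_len,) = struct.unpack_from("<q", raw, 0)
    body = raw[8:]
    if uncompressed_len == -1:
        # -1 means the body was stored without compression.
        return body
    data = decompress(body)
    if len(data) != uncompressed_len:
        raise ValueError("uncompressed length does not match the header")
    return data


payload = b"arrow" * 10
assert decode_buffer(encode_buffer(payload)) == payload
assert decode_buffer(b"") == b""
```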

Parquet

A new setting to limit the number of rows written per page has been added. GH-47030
An arrow::Result version of parquet::arrow::FileReader::Make() has been added. GH-44810
Support for reading INT-encoded Decimal statistics as Arrow scalars. GH-47955

Several bug fixes including:

Fixed invalid Parquet files written when dictionary encoded pages are large. GH-47973
Fixed pre-1970 INT96 timestamps roundtrip. GH-48246
Fixed potential crash when reading invalid Parquet data. GH-48308
Added compatibility with non-compliant RLE streams. GH-47981
Fixed Util & Level Conversion logic on big-endian systems. GH-48218
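
For context on the INT96 item above: an INT96 timestamp stores the nanoseconds within the day in 8 bytes followed by the Julian day number in 4 bytes, both little-endian. Timestamps before 1970 are negative when expressed as nanoseconds since the Unix epoch, so the split into day and time of day must use floor division. The sketch below is plain Python with hypothetical helper names. It shows one way such a roundtrip can be kept exact and is not a description of the actual C++ fix.

```python
import struct

JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588
NANOS_PER_DAY = 86_400 * 1_000_000_000


def encode_int96(ns_since_epoch: int) -> bytes:
    """Pack nanoseconds since the Unix epoch into 12 INT96 bytes."""
    # divmod floors, so nanos_of_day stays in [0, NANOS_PER_DAY)
    # even when ns_since_epoch is negative (pre-1970).
    days, nanos_of_day = divmod(ns_since_epoch, NANOS_PER_DAY)
    return struct.pack("<qi", nanos_of_day, days + JULIAN_DAY_OF_UNIX_EPOCH)


def decode_int96(raw: bytes) -> int:
    """Unpack 12 INT96 bytes into nanoseconds since the Unix epoch."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day


# One nanosecond before the epoch lands on the previous Julian day.
one_ns_before_epoch = -1
assert decode_int96(encode_int96(one_ns_before_epoch)) == one_ns_before_epoch
```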

Encryption

Simplified nested field encryption configuration. GH-41246
Improved column encryption API. GH-48337
Better fuzzing support for encrypted files. GH-48335

Miscellaneous C++ changes

Added support for CUDA 13 GH-47677
Dropped support for the gold linker GH-45484
Leveraged the CMake 3.25 upgrade to reduce the complexity and maintenance burden of our third-party dependency management GH-48317, GH-48316, GH-48315, GH-48248, GH-48181, GH-48178, GH-48091, GH-48074

Linux Packaging Notes

Fixed a bug where the parquet-devel RPM package depended on parquet-glib-devel.

MATLAB Notes

Added support for building against MATLAB R2025b GH-48154.

Python Notes

Compatibility notes

The deprecated Array.format has been removed GH-48102.
The experimental tag has been removed from the Arrow PyCapsule Interface GH-47975.
PyWeakref_GetRef has replaced the use of PyWeakref_GET_OBJECT to support Python 3.15 GH-47823.

New features

Bindings for scatter and inverse_permutation are added GH-48167.
max_rows_per_page argument is now exposed in parquet.WriterProperties GH-48096.
External key material and rotation are now enabled for individual Parquet files GH-31869.
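
As a reminder of what scatter and inverse_permutation compute, the sketch below models their semantics on plain Python lists, with None standing in for null. These are hypothetical reference functions, not the pyarrow bindings. The max_index parameter is a simplified stand-in for the corresponding compute option, and the default output length (the input length) is an assumption based on the Arrow compute documentation.

```python
def inverse_permutation(indices, max_index=None):
    """For indices[i] == x, set output[x] = i; unreferenced slots stay None."""
    size = len(indices) if max_index is None else max_index + 1
    out = [None] * size
    for i, x in enumerate(indices):
        out[x] = i
    return out


def scatter(values, indices, max_index=None):
    """For indices[i] == x, set output[x] = values[i]; other slots stay None."""
    size = len(indices) if max_index is None else max_index + 1
    out = [None] * size
    for value, x in zip(values, indices):
        out[x] = value
    return out


print(inverse_permutation([2, 0, 1]))                # [1, 2, 0]
print(scatter(["a", "b", "c"], [2, 0, 1]))           # ['b', 'c', 'a']
print(scatter(["a", "b"], [3, 0], max_index=3))      # ['b', None, None, 'a']
```

Scattering values by a permutation and then taking them by the same permutation returns the original order, which is why the two functions are usually discussed together.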

Other improvements

Nested field encryption configuration has been simplified GH-41246.
Reading INT-encoded Decimal statistics with StatisticsAsScalars is now supported GH-47955.
Unsigned dictionary indices are now supported in pandas conversion GH-47022.
Added code examples for compute functions min, max and min_max GH-48668.
Temporal unit checking has been added to NumPyDtypeUnifier GH-48625.
Error message is improved when mixing numpy.datetime64 values with different units (e.g., datetime64[s] and datetime64[ms]) in a single array GH-48463.
The source argument is now checked in pyarrow.parquet.read_table GH-47728.

Relevant bug fixes

ipc.Message __repr__ has been corrected to use f-string GH-48608.
Failures when reading Parquet files written with non-compliant RLE encoders have been fixed in C++ by adding compatibility GH-47981.
Memory usage is now reduced when using to_pandas() with many extension array columns GH-47861.
Missing required argument error in FSSpecHandler delete_root_dir_contents has been fixed GH-47559.
Invalid RecordBatch.from_struct_array batch for sliced arrays with offset zero has been fixed in C++ GH-44318.

R Notes

Compatibility notes

GCS support has been turned off by default GH-48342.
OpenSSL 1.x builds have been removed GH-45449.

Relevant bug fixes

Fixed a segfault that could occur when concatenating tables GH-47000.

Several continuous integration fixes and minor bug fixes have also been included in the release. For a full list, check the release notes.

Ruby and C GLib Notes

All missing compute function options have been added, so all compute functions can now be used from Ruby and C GLib. This work was done by Sten Larsson.

Fixed size list array support has been added.

Ruby

An experimental pure Ruby Apache Arrow reader implementation has been added as the red-arrow-format gem.

C GLib

The .gir and .typelib file names now use Arrow-${MAJOR}.${MINOR}.{gir,typelib} instead of Arrow-1.0.{gir,typelib}. This allows multiple C GLib versions to coexist on the same system.
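
As a small illustration of the naming scheme, the sketch below derives the two file names from a release version string. The gir_file_names helper is hypothetical and exists only for this example; it is not part of any Arrow package.

```python
def gir_file_names(namespace: str, version: str) -> tuple[str, str]:
    """Build the .gir and .typelib names from a MAJOR.MINOR.PATCH version."""
    major, minor = version.split(".")[:2]
    stem = f"{namespace}-{major}.{minor}"
    return f"{stem}.gir", f"{stem}.typelib"


# For this release the names become Arrow-23.0.gir and Arrow-23.0.typelib.
print(gir_file_names("Arrow", "23.0.0"))
```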

Java, JavaScript, Go, .NET, Swift and Rust Notes

The Java, JavaScript, Go, .NET, Swift and Rust projects have moved to separate repositories outside the main Arrow monorepo.

For notes on the latest release of the Java implementation, see the latest Arrow Java changelog.
For notes on the latest release of the JavaScript implementation, see the latest Arrow JavaScript changelog.
For notes on the latest release of the Rust implementation, see the latest Arrow Rust changelog.
For notes on the latest release of the Go implementation, see the latest Arrow Go changelog.
For notes on the latest release of the .NET implementation, see the latest Arrow .NET changelog.
For notes on the latest release of the Swift implementation, see the latest Arrow Swift changelog.