Apache Arrow 23.0.0 Release
Published
18 Jan 2026
By
The Apache Arrow PMC (pmc)
The Apache Arrow team is pleased to announce the 23.0.0 release. This release covers over 3 months of development work and includes 336 resolved issues on 417 distinct commits from 71 distinct contributors. See the Install Page to learn how to get the libraries for your platform.
The release notes below are not exhaustive and only expose selected highlights of the release. Many other bugfixes and improvements have been made: we refer you to the complete changelog.
Community
As per our newly started tradition of rotating the PMC chair once a year, Antoine Pitrou was elected as the new PMC chair and VP, succeeding Neal Richardson.
Thanks for your contributions and participation in the project!
Arrow Flight RPC Notes
An ODBC driver for Apache Arrow Flight SQL has been completed. Currently it is not packaged for release, but can be built from source.
C++ Notes
The C++ standard has been updated to C++20 GH-45885 and the minimum GCC version to 8.
Some improvements have been made to leverage C++20 GH-48592.
Compute
- Graceful error handling for decimal binary arithmetic and comparison instead of firing confusing assertions. GH-35957
- Fixed an issue where the MinMax kernel was emitting -inf/inf for all-NaN input. GH-46063
- Avoid ZeroCopyCastExec when casting between Binary offset types to avoid high overheads. GH-43660
- Enhanced type checking for hash join residual filter in Acero. GH-48268
Format
- Clarified that empty compressed buffers can omit the length header. GH-47918
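As an illustration of what this clarification means for a reader, here is a minimal sketch of decoding one compressed IPC body buffer. It assumes the layout described in the Arrow columnar format documentation: an 8-byte little-endian signed length prefix, where -1 marks a buffer left uncompressed. The helper name `read_ipc_body_buffer` is hypothetical, and `zlib` is only a stand-in codec so the example runs with the standard library (Arrow IPC itself uses LZ4_FRAME or ZSTD).

```python
import struct
import zlib


def read_ipc_body_buffer(raw: bytes, decompress=zlib.decompress) -> bytes:
    """Decode one compressed body buffer (hypothetical helper, stand-in codec)."""
    # An empty buffer may carry no length header at all.
    if len(raw) == 0:
        return b""
    if len(raw) < 8:
        raise ValueError("non-empty buffer is too short for a length header")
    # The first 8 bytes hold the uncompressed length (int64, little-endian).
    (uncompressed_len,) = struct.unpack_from("<q", raw, 0)
    payload = raw[8:]
    # A length of -1 marks a buffer that was left uncompressed.
    if uncompressed_len == -1:
        return payload
    data = decompress(payload)
    if len(data) != uncompressed_len:
        raise ValueError("uncompressed length does not match header")
    return data


original = b"arrow" * 4
framed = struct.pack("<q", len(original)) + zlib.compress(original)
print(read_ipc_body_buffer(framed) == original)
print(read_ipc_body_buffer(b""))
```

Treating the zero-length case first keeps the reader tolerant of writers that either include or omit the header for empty buffers.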
Parquet
- A new setting to limit the number of rows written per page has been added. GH-47030
- An arrow::Result version of parquet::arrow::FileReader::Make() has been added. GH-44810
- Support for reading INT-encoded Decimal statistics as Arrow scalars. GH-47955
Several bug fixes including:
- Fixed invalid Parquet files written when dictionary encoded pages are large. GH-47973
- Fixed pre-1970 INT96 timestamps roundtrip. GH-48246
- Fixed potential crash when reading invalid Parquet data. GH-48308
- Added compatibility with non-compliant RLE streams. GH-47981
- Fixed Util & Level Conversion logic on big-endian systems. GH-48218
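For readers unfamiliar with INT96 timestamps, the sketch below shows why pre-1970 values are a classic roundtrip hazard. It is not the Arrow implementation and does not claim to reproduce the bug fixed in GH-48246. It assumes the conventional INT96 layout (8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian), and the function names are hypothetical.

```python
import struct

JULIAN_DAY_OF_UNIX_EPOCH = 2440588  # Julian day number of 1970-01-01
NANOS_PER_DAY = 86_400 * 1_000_000_000


def encode_int96(nanos_since_epoch: int) -> bytes:
    # divmod floors toward negative infinity, so a pre-1970 value maps to
    # an earlier day plus a non-negative nanosecond-of-day remainder.
    days, nanos_of_day = divmod(nanos_since_epoch, NANOS_PER_DAY)
    return struct.pack("<qi", nanos_of_day, days + JULIAN_DAY_OF_UNIX_EPOCH)


def decode_int96(raw: bytes) -> int:
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day


before_epoch = -1  # one nanosecond before 1970-01-01T00:00:00
print(decode_int96(encode_int96(before_epoch)) == before_epoch)
```

A conversion that truncates toward zero instead of flooring would place such a value on the wrong day or give it a negative time of day, which is why negative timestamps deserve their own roundtrip tests.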
Encryption
- Simplified nested field encryption configuration. GH-41246
- Improved column encryption API. GH-48337
- Better fuzzing support for encrypted files. GH-48335
Miscellaneous C++ changes
- Added support for CUDA 13 GH-47677
- Drop support for gold linker GH-45484
- Leverage CMake 3.25 upgrade by reducing complexity and maintenance burden on our third party dependency management GH-48317, GH-48316, GH-48315, GH-48248, GH-48181, GH-48178, GH-48091, GH-48074
Linux Packaging Notes
Fixed a bug where the parquet-devel RPM package depended on
parquet-glib-devel.
See also: GH-48044
CentOS 7 support has been dropped.
See also: GH-40735
MATLAB Notes
Added support for building against MATLAB R2025b GH-48154.
Python Notes
Compatibility notes
- Deprecated Array.format is removed GH-48102.
- Experimental tag has been removed for Arrow PyCapsule Interface GH-47975.
- PyWeakref_GetRef has replaced the use of PyWeakref_GET_OBJECT to support Python 3.15 GH-47823.
New features
- Bindings for scatter and inverse_permutation are added GH-48167.
- max_rows_per_page argument is now exposed in parquet.WriterProperties GH-48096.
- External key material and rotation is enabled for individual Parquet files GH-31869.
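To make the scatter and inverse_permutation bindings more concrete, here is a pure-Python model of their semantics as we understand them from the Arrow C++ compute documentation: scatter places values[i] at position indices[i], and inverse_permutation places i at position indices[i]. It does not call the new pyarrow bindings. The helper names and the `output_length` parameter are hypothetical, and null indices are not handled.

```python
def inverse_permutation_ref(indices, output_length=None):
    """Reference model: out[indices[i]] = i, None where nothing was written."""
    n = len(indices) if output_length is None else output_length
    out = [None] * n
    for position, target in enumerate(indices):
        out[target] = position
    return out


def scatter_ref(values, indices, output_length=None):
    """Reference model: out[indices[i]] = values[i], None where nothing was written."""
    n = len(values) if output_length is None else output_length
    out = [None] * n
    for value, target in zip(values, indices):
        out[target] = value
    return out


perm = [2, 0, 1]
print(inverse_permutation_ref(perm))          # [1, 2, 0]
print(scatter_ref(["a", "b", "c"], perm))     # ['b', 'c', 'a']
```

Scattering by a permutation and then taking by the same permutation returns the original order, which is a handy sanity check when trying the real bindings.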
Other improvements
- Nested field encryption configuration has been simplified GH-41246.
- Reading INT-encoded Decimal statistics with StatisticsAsScalars is now supported GH-47955.
- Unsigned dictionary indices are now supported in pandas conversion GH-47022.
- Added code examples for compute functions min, max and min_max GH-48668.
- Add temporal unit checking in NumPyDtypeUnifier GH-48625
- Error message is improved when mixing numpy.datetime64 values with different units (e.g., datetime64[s] and datetime64[ms]) in a single array GH-48463.
- The source argument is now checked in pyarrow.parquet.read_table GH-47728.
Relevant bug fixes
- ipc.Message __repr__ has been corrected to use f-string GH-48608.
- Failures when reading Parquet files written with non-compliant RLE encoders have been fixed in C++ by adding compatibility GH-47981.
- Memory usage is now reduced when using to_pandas() with many extension array columns GH-47861.
- Missing required argument error in FSSpecHandler delete_root_dir_contents has been fixed GH-47559.
- Invalid RecordBatch.from_struct_array batch for sliced arrays with offset zero has been fixed in C++ GH-44318.
R Notes
Compatibility notes
Relevant bug fixes
- Fixed a segfault that could be raised when concatenating tables GH-47000.
Several continuous integration fixes and minor bug fixes are also included in this release. For a full list, check the release notes.
Ruby and C GLib Notes
All missing compute function options have been added, so all compute functions can now be used from Ruby and C GLib. This was done by Sten Larsson.
Fixed size list array support has been added.
See also: GH-48362
Support for changing the thread pool configuration in Acero has been added. This was done by Sten Larsson.
Duration support has been added.
CSV writer support has been added.
See also: GH-48680
Ruby
An experimental pure Ruby Apache Arrow reader implementation has been
added as the red-arrow-format gem.
See also: GH-48132
We'll add an experimental writer implementation in the next release.
Arrow::Column#to_arrow{,_array,_chunked_array} have been added for
convenience.
See also: GH-48292
Automatic Apache Arrow type detection in Arrow::Array.new has been
improved for the nested integer list case.
See also:
Arrow::FixedSizeListArray.new(data_type, values) support has been
added.
See also: GH-48610
C GLib
We now use Arrow-${MAJOR}.${MINOR}.{gir,typelib} instead of
Arrow-1.0.{gir,typelib} for .gir and .typelib file names. This allows
multiple C GLib versions to co-exist on the same system.
See also: GH-48616
Java, JavaScript, Go, .NET, Swift and Rust Notes
The Java, JavaScript, Go, .NET, Swift and Rust projects have moved to separate repositories outside the main Arrow monorepo.
- For notes on the latest release of the Java implementation, see the latest Arrow Java changelog.
- For notes on the latest release of the JavaScript implementation, see the latest Arrow JavaScript changelog.
- For notes on the latest release of the Rust implementation, see the latest Arrow Rust changelog.
- For notes on the latest release of the Go implementation, see the latest Arrow Go changelog.
- For notes on the latest release of the .NET implementation, see the latest Arrow .NET changelog.
- For notes on the latest release of the Swift implementation, see the latest Arrow Swift changelog.