Apache Arrow 22.0.0 Release
Published 24 Oct 2025
By The Apache Arrow PMC (pmc)
The Apache Arrow team is pleased to announce the 22.0.0 release. This release covers over 3 months of development work and includes 213 resolved issues on 255 distinct commits from 60 distinct contributors. See the Install Page to learn how to get the libraries for your platform.
The release notes below are not exhaustive and only expose selected highlights of the release. Many other bugfixes and improvements have been made: we refer you to the complete changelog.
Community
Since the 21.0.0 release, Kyle Barron has been invited to be a committer.
Matthijs Brobbel, Adam Reeve and Rossi Sun have joined the Project Management Committee (PMC).
Thanks for your contributions and participation in the project!
The first Apache Arrow Summit was held on October 2nd 2025 in Paris, France as part of PyData Paris. Program details and agenda can be found here: https://www.meetup.com/pydata-paris/events/310646396/
There were around 35 attendees, of which ~20 were existing core developers or PMC members. The Summit was overwhelmingly described as a success, with a friendly atmosphere between all participants. Unfortunately, no Audio / Video recording system was available for this event.
Arrow Flight RPC Notes
Support for dictionary replacement and dictionary encoding has been added to the DoGet and DoExchange methods. (GH-45056, GH-45055 and GH-26727)
As part of supporting dictionary replacement, we have also exposed ipc::ReadStats on the FlightStreamReader to facilitate debugging. (GH-47422)
C++ Notes
Compute
Timezone aware kernels can now handle timezone offset strings. (GH-30036)
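For context, the Arrow format allows a timestamp's timezone to be either an IANA name such as Europe/Paris or a fixed offset string such as +05:30. The sketch below uses only the Python standard library, not the Arrow kernels themselves, to illustrate what a fixed offset string means for a UTC instant. The helper name parse_utc_offset is hypothetical and exists only for this illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_utc_offset(offset: str) -> timezone:
    """Parse a '+HH:MM' or '-HH:MM' string into a fixed-offset tzinfo.

    Hypothetical helper for illustration; it is not part of Arrow.
    """
    sign = {"+": 1, "-": -1}[offset[0]]
    hours, minutes = offset[1:].split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


# A timezone-aware timestamp is stored as a UTC instant; the timezone only
# affects how local fields (hour, minute, ...) are derived from it.
instant = datetime(2025, 10, 24, 12, 0, tzinfo=timezone.utc)
local_plus = instant.astimezone(parse_utc_offset("+05:30"))
local_minus = instant.astimezone(parse_utc_offset("-08:00"))
```

Unlike an IANA name, a fixed offset carries no daylight saving rules, so the same shift applies to every value in the column.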
Decimal support has been improved by introducing a MatchConstraint structure that applies an extra, optional constraint during kernel signature matching. (GH-47287, GH-41336)
The scatter function has been moved to Arrow core from Arrow Compute. (GH-47375)
Filesystems
The Request ID is now included in errors raised by the AWS client. (GH-47349)
Format
Half Float (Float16) support has received several improvements. (GH-46860, GH-46739)
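As a reminder of what the type stores, Float16 is the IEEE 754 binary16 format: 1 sign bit, 5 exponent bits and 10 mantissa bits. The following sketch is independent of Arrow and uses the standard library's struct module (format code "e") to show the bit patterns and the precision loss involved. The function names are hypothetical.

```python
import struct


def to_float16_bits(x: float) -> int:
    """Round x to the nearest binary16 value and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def from_float16_bits(bits: int) -> float:
    """Interpret a 16-bit pattern as a binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


one = to_float16_bits(1.0)             # sign 0, biased exponent 15, mantissa 0
largest = from_float16_bits(0x7BFF)    # largest finite binary16 value
roundtrip = from_float16_bits(to_float16_bits(0.1))  # 0.1 is not exactly representable
```

With only 10 mantissa bits, values such as 0.1 are rounded noticeably, which is why conversions and comparisons involving Float16 need dedicated handling.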
Parquet
Fuzzing support for Parquet has been improved, along with several related fixes. (GH-47803, GH-47740, GH-47655, GH-47597, GH-47184)
The RLE decoder has been reworked to extract an RLE parser that will drive further optimisations. (GH-47112)
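To illustrate what separating parsing from decoding can look like, here is a minimal sketch of the Parquet RLE/bit-packing hybrid encoding as described in the Parquet format specification. It is not the Arrow C++ implementation: the function names are hypothetical, and the split into a run parser and a run expander is an assumption made for illustration.

```python
def read_uleb128(buf, pos):
    """Read an unsigned LEB128 varint; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def parse_runs(buf, bit_width):
    """Split an encoded buffer into runs without materialising the values."""
    value_bytes = (bit_width + 7) // 8  # RLE values are padded to whole bytes
    pos = 0
    runs = []
    while pos < len(buf):
        header, pos = read_uleb128(buf, pos)
        if header & 1:
            # Bit-packed run: the header counts groups of 8 values, and each
            # group occupies exactly bit_width bytes.
            groups = header >> 1
            nbytes = groups * bit_width
            runs.append(("bit_packed", groups * 8, buf[pos:pos + nbytes]))
            pos += nbytes
        else:
            # RLE run: one value repeated (header >> 1) times.
            value = int.from_bytes(buf[pos:pos + value_bytes], "little")
            runs.append(("rle", header >> 1, value))
            pos += value_bytes
    return runs


def expand_runs(runs, bit_width):
    """Decode parsed runs into a flat list of integers."""
    mask = (1 << bit_width) - 1
    out = []
    for kind, count, payload in runs:
        if kind == "rle":
            out.extend([payload] * count)
        else:
            # Values are packed starting from the least significant bit.
            bits = int.from_bytes(payload, "little")
            out.extend((bits >> (i * bit_width)) & mask for i in range(count))
    return out


# Ten copies of 5, then the values 0..7 bit-packed with a bit width of 3.
encoded = bytes([0x14, 0x05, 0x03, 0x88, 0xC6, 0xFA])
runs = parse_runs(encoded, bit_width=3)
values = expand_runs(runs, bit_width=3)
```

Keeping the runs as an intermediate result lets a consumer act on a whole run at once, for example filling a repeated value, instead of decoding value by value.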
Dynamic dispatch support has been added to Byte Stream Split. (GH-46962)
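For readers unfamiliar with the encoding, BYTE_STREAM_SPLIT stores byte k of every value contiguously in stream k, which tends to help a subsequent compressor on floating-point data. The sketch below shows only that byte layout in pure Python; it does not reproduce the SIMD code paths or the dynamic dispatch mentioned above, and the function names are hypothetical.

```python
import struct


def byte_stream_split_encode(values, fmt="<f"):
    """Scatter the bytes of fixed-width values into one stream per byte position."""
    width = struct.calcsize(fmt)
    raw = b"".join(struct.pack(fmt, v) for v in values)
    # Stream k holds byte k of every value; the streams are concatenated.
    return b"".join(raw[k::width] for k in range(width))


def byte_stream_split_decode(data, fmt="<f"):
    """Gather the byte streams back into the original fixed-width values."""
    width = struct.calcsize(fmt)
    count = len(data) // width
    streams = [data[k * count:(k + 1) * count] for k in range(width)]
    # Interleave one byte from each stream to rebuild every value.
    raw = bytes(b for group in zip(*streams) for b in group)
    return [struct.unpack_from(fmt, raw, i * width)[0] for i in range(count)]


encoded = byte_stream_split_encode([1.0, 2.0])
decoded = byte_stream_split_decode(byte_stream_split_encode([1.0, 2.0, 3.5]))
```

The transform is a pure permutation of bytes, so the encoded size equals the input size and any gain comes from the compression codec applied afterwards.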
Some statistics, such as the null count, are no longer discarded when the sort order of the column is unknown. (GH-47449)
is_min_value_exact and is_max_value_exact are now exposed in Parquet Statistics if present when reading. (GH-46905)
We now reserve values correctly when reading BYTE_ARRAY and FLBA. (GH-47012)
Encryption
String-based Parquet encryption methods have been deprecated. (GH-47338)
Memory usage required by decryption buffers when reading encrypted Parquet has been reduced. (GH-46971)
Type support
Parquet Variant type support has been improved. (GH-47241, GH-47838)
Support for Decimal32 and Decimal64 has been improved. (GH-44345)
Gandiva
Support for LLVM 21.1.0 has been added. (GH-47469)
Miscellaneous C++ changes
Support for further Arrow Statistics has been added. (GH-47102, GH-47101)
Support for shared memory comparison in arrow::RecordBatch has been added. (GH-47149)
arrow::Table::Equals now allows an optional arrow::EqualOptions argument. (GH-46937)
Skyhook integration has been removed from the main repository and has been moved to its own repository. (GH-47225)
Linux Packaging Notes
Support for Debian forky has been added. (GH-47312)
MATLAB Notes
- The NumNulls property was added to arrow.array.Array and arrow.array.ChunkedArray. (GH-47263, GH-38422)
Python Notes
Compatibility notes:
- Support for Python 3.9 has been dropped (GH-47443) and support for Python 3.14, both regular and free-threaded, has been added (GH-47438).
- Cython 3.1 is now a required build-time dependency (GH-47370).
- project.optional-dependencies has been replaced with dependency-groups (GH-47137).
New features:
- CSV writer option quoting_header is now exposed (GH-47575).
Other improvements:
- Support for pandas DataFrame.attrs during conversion between a dataframe and a Parquet file has been added (GH-45382).
- A utility function to create an Arrow table instead of a pandas dataframe has been added (GH-47172).
- IPC and Flight options now have nice repr/str methods (GH-47358).
- Access to Request ID in AWS client error is now available from Python (GH-47349).
- Public Type Enums are added (GH-47123).
- The Python Development documentation section has been restructured to make it easier for contributors to build and develop PyArrow (GH-20125).
Relevant bug fixes:
- Schema is now hashable when metadata is set (GH-47602).
- MapScalar.as_py(maps_as_pydicts="strict") option now works for nested maps (GH-47380).
- FileFragment.open() no longer segfaults on file-like objects (GH-47301).
- pa.compute.fill_null regression on Windows due to a compiler bug has been fixed (GH-47234).
- Integer dictionary bitwidth preservation no longer breaks multi-file read behaviour as DatasetFactory.inspect method now accepts promote_options and fragments parameters (GH-46629).
- FileSystem.from_uri is reverted to be a staticmethod again (GH-47179).
Java, JavaScript, Go, .NET, Swift and Rust Notes
The Java, JavaScript, Go, .NET, Swift and Rust projects have moved to separate repositories outside the main Arrow monorepo.
- For notes on the latest release of the Java implementation, see the latest Arrow Java changelog.
- For notes on the latest release of the JavaScript implementation, see the latest Arrow JavaScript changelog.
- For notes on the latest release of the Rust implementation, see the latest Arrow Rust changelog.
- For notes on the latest release of the Go implementation, see the latest Arrow Go changelog.
- For notes on the latest release of the .NET implementation, see the latest Arrow .NET changelog.
- For notes on the latest release of the Swift implementation, see the latest Arrow Swift changelog.