Apache Arrow 20.0.0 Release
Published 27 Apr 2025
By The Apache Arrow PMC (pmc)
The Apache Arrow team is pleased to announce the 20.0.0 release. This release covers over 2 months of development work and includes 259 resolved issues on 327 distinct commits from 63 distinct contributors. See the Install Page to learn how to get the libraries for your platform.
The release notes below are not exhaustive and only expose selected highlights of the release. Many other bugfixes and improvements have been made: we refer you to the complete changelog.
Community
Since the 19.0.0 release, Ed Seidl, Jean-Baptiste Onofré and Matthijs Brobbel have been invited to become committers. Bryce Mecum, Ian Cook, Jacob Wujciak-Jens and Rok Mihevc have been invited to join the Project Management Committee (PMC).
Thanks for your contributions and participation in the project!
C++ Notes
Compute
We’ve added several new compute functions: inverse_permutation/scatter (GH-44393), pivot_wider/hash_pivot_wider (GH-45269), rank_normal (GH-45572), skew/kurtosis (GH-45676), and winsorize (GH-45755).
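To make the semantics of two of these functions concrete, here is a minimal pure-Python sketch of what inverse_permutation and scatter compute. It is not the Arrow implementation: the helper functions below are illustrative stand-ins that only share the names, and they ignore nulls and any options for the output length.

```python
def inverse_permutation(indices):
    """Return `inverse` such that inverse[indices[i]] == i for every i."""
    inverse = [None] * len(indices)
    for position, target in enumerate(indices):
        inverse[target] = position
    return inverse


def scatter(values, indices):
    """Place values[i] at output position indices[i]."""
    output = [None] * len(indices)
    for value, target in zip(values, indices):
        output[target] = value
    return output


indices = [2, 0, 1]
print(inverse_permutation(indices))        # [1, 2, 0]
print(scatter(["a", "b", "c"], indices))   # ['b', 'c', 'a']
```

In this sketch, scattering by a permutation gives the same result as taking values at the inverse permutation, which is why the two functions are listed as a pair.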
Acero
We've significantly improved the Hash Join in terms of overflow-safety (GH-44513, GH-45334, GH-45506), memory consumption (peak memory usage reduced by half: GH-45551), and performance (up to several dozen times faster: GH-45611, GH-45917).
Flight RPC
- The experimental Flight over UCX feature has been removed. (#43296)
C# Notes
- Added support for the Ordered and AppMetaData fields to FlightInfo (#45753)
- FlightClient can now be integrated with Grpc.Net.ClientFactory (#45451)
Linux Packaging Notes
https://apache.jfrog.io/ is still available but https://packages.apache.org/ is preferred because the latter uses the apache.org domain.
Python Notes
Compatibility notes:
- The minimum supported Cython version has been raised to 3 GH-45237.
- A subset of deprecated APIs has been removed GH-45680: PARQUET_2_0 GH-45848, use_legacy_dataset GH-44790, and the serialize/deserialize PyArrow C++ code GH-43587.
New features:
- Large variable width types are supported in NumPy conversion GH-35289.
- A biased/unbiased option is available in the skew and kurtosis compute functions GH-45733.
- Support for SAS tokens in the AzureFileSystem has been added GH-45705.
- Interchange of decimal32, decimal64 and decimal256 data type objects between Pandas and PyArrow is now supported GH-45582, GH-45570.
- pyarrow.ArrayStatistics and pyarrow.Array.statistics() are added GH-45457.
- Bindings for the JSON streaming reader are added GH-14932.
- Bindings for MemoryPool::total_bytes_allocated and MemoryPool::num_allocations are added. Allocator-specific statistics can also be printed to stderr GH-45358.
- A new maps_as_pydicts parameter is introduced to the to_pylist, to_pydict and as_py methods, enabling deserialization of maps into Python dictionaries instead of lists of tuples GH-39010.
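The difference between the two map representations mentioned in the last item can be sketched in plain Python. This is an illustration of the idea only, not PyArrow's API: the helper `entries_to_pydict` and its `strict` flag are hypothetical names introduced here, and the sketch assumes a map value arrives as a list of (key, value) tuples in which keys may repeat.

```python
def entries_to_pydict(entries, strict=False):
    """Convert a list of (key, value) tuples into a dict.

    With strict=False a repeated key keeps the last value seen;
    with strict=True a repeated key raises KeyError instead.
    """
    result = {}
    for key, value in entries:
        if strict and key in result:
            raise KeyError(f"duplicate map key: {key!r}")
        result[key] = value
    return result


entries = [("a", 1), ("b", 2)]
print(entries_to_pydict(entries))                 # {'a': 1, 'b': 2}

# A list of tuples can hold a repeated key; a dict cannot.
print(entries_to_pydict([("a", 1), ("a", 3)]))    # {'a': 3}
```

The second call shows why a list of tuples is the lossless form: converting to a dictionary is more convenient for lookups but has to decide what to do with repeated keys.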
Other improvements:
- Source (sdist) and binary distribution (wheels) are now uploaded to GitHub Releases GH-45920.
- Cython code has been cleaned up as we now require at least Cython 3.0 GH-45433.
- Building of free-threaded wheels on Windows is enabled GH-44421.
- Wheels for Alpine Linux are now provided GH-18036.
Relevant bug fixes:
- An error in the Pandas conversion roundtrip with bytes column names is fixed GH-44188.
- Exceptions are raised instead of showing segfaults when users try to instantiate internal Parquet metadata classes GH-36628.
R Notes
- Binary Arrays now inherit from blob::blob in addition to arrow_binary when converted to R objects. This change is the first step in eventually deprecating the arrow_binary class in favor of the blob class in the blob package (see GH-45709).
Ruby and C GLib Notes
Improvements
- garrow_array_validate()/Arrow::Array#validate: Added.
- garrow_array_validate_full()/Arrow::Array#validate_full: Added.
- garrow_record_batch_validate()/Arrow::RecordBatch#validate: Added.
- garrow_record_batch_validate_full()/Arrow::RecordBatch#validate_full: Added.
- garrow_table_validate()/Arrow::Table#validate: Added.
- garrow_table_validate_full()/Arrow::Table#validate_full: Added.
- GArrowArrayStatistics()/Arrow::ArrayStatistics: Added.
- Changed to require Meson 0.61.2 or later.
- GArrowBinaryViewArray()/Arrow::BinaryViewArray: Added.
- GArrowStringViewArray()/Arrow::StringViewArray: Added.
- Added support for rubygems-requirements-system
Incompatible changes
- gparquet_arrow_file_writer_new_row_group()/Parquet::ArrowFileWriter#new_row_group: Removed the chunk_size argument.
- garrow_record_batch_new()/Arrow::RecordBatch#initialize: Stopped validating automatically. If you want to validate a created record batch, call garrow_record_batch_validate()/Arrow::RecordBatch#validate explicitly.
- garrow_table_new()/Arrow::Table#initialize: Stopped validating automatically. If you want to validate a created table, call garrow_table_validate()/Arrow::Table#validate explicitly.
Java, Go, and Rust Notes
The Java, Go, and Rust projects have moved to separate repositories outside the main Arrow monorepo.
- For notes on the latest release of the Java implementation, see the latest Arrow Java changelog.
- For notes on the latest release of the Rust implementation, see the latest Arrow Rust changelog.
- For notes on the latest release of the Go implementation, see the latest Arrow Go changelog.