Apache Arrow 21.0.0 (17 July 2025)
This is a major release covering more than 2 months of development.
Download
- Source Artifacts
- Binary Artifacts
- Git tag
Contributors
This release includes 400 commits from 82 distinct contributors.
$ git shortlog -sn apache-arrow-20.0.0..apache-arrow-21.0.0
78 Sutou Kouhei
37 Raúl Cumplido
33 Hiroyuki Sato
30 William Ayd
22 Antoine Pitrou
19 Bryce Mecum
18 Nic Crane
11 Alenka Frim
10 Dewey Dunnington
9 Jacob Wujciak-Jens
9 dependabot[bot]
8 mwish
7 Jonathan Keane
7 Rossi Sun
6 Sarah Gilmore
5 Arash Andishgar
4 Dongjoon Hyun
4 takuya kodama
3 David Li
3 Eddie Chang
3 Enrico Minack
3 Ian Cook
3 Lester Fan
3 Ziy
3 abandy
2 Abhinav
2 David Sherrier
2 Krisztián Szűcs
2 Rok Mihevc
2 gitmodimo
1 Adam Reeve
1 Akum Kang
1 Alina (Xi) Li
1 Anatolii Tsyplenkov
1 Antoine Prouvost
1 Benjamin Kietzman
1 Brian Hulette
1 Bruno
1 Carsten Haubold
1 ChiLin Chiu
1 Colin
1 DenisTarasyuk
1 Eric Dinse
1 Etienne Bacher
1 Even Rouault
1 Gang Wu
1 Guilherme Martins Crocetti
1 Hadrian Reppas
1 HyunWoo Oh
1 Igor Antropov
1 JB Onofré
1 Joshua
1 Junwang Zhao
1 Kevin Gurney
1 Kevin Wilson
1 Kirill Tsyganov
1 Konstantin Malanchev
1 Kyle Hemker
1 Lukas
1 Mateusz Rzeszutek
1 Matt Topol
1 Michael
1 Michael Chirico
1 NazilaAk
1 Patrick Walsh
1 Ranjit Ranjan
1 Roman Karlstetter
1 Saurabh Singh
1 Thomas Newton
1 Tommy Hughes IV
1 Xingyu Long
1 Zihan Qi
1 bw513
1 dawg
1 koenvo
1 leopardracer
1 lriggs
1 neilechao
1 omahs
1 shu-kitamura
1 yuri@FreeBSD
1 yyossy
Patch Committers
The following Apache committers merged contributed patches to the repository.
$ git shortlog -sn --group=trailer:signed-off-by apache-arrow-20.0.0..apache-arrow-21.0.0
179 Sutou Kouhei
51 Antoine Pitrou
47 Raúl Cumplido
15 Nic Crane
13 AlenkaF
13 Bryce Mecum
12 Jacob Wujciak-Jens
7 David Li
7 Dewey Dunnington
7 mwish
6 Rossi Sun
5 Curt Hagenlocher
5 Jonathan Keane
5 Sarah Gilmore
4 Rok Mihevc
3 Gang Wu
3 Will Ayd
2 Kevin Gurney
2 Krisztian Szucs
1 Benjamin Kietzman
Changelog
Bug Fixes
- GH-32276 - [C++][FlightRPC] Add option to align RecordBatch buffers given to IPC reader (#44279)
- GH-35166 - [C++][Compute] Increase precision of decimals in sum aggregates (#44184)
- GH-39811 - [R] better documentation for col_types argument in open_delim_dataset (#45719)
- GH-40756 - [C++] Remove dead Boost urls (#46452)
- GH-43132 - [CI] Fix pre-commit Rat check (#46541)
- GH-44366 - [Python][Acero] RecordBatch.filter on expression raises error if result set is empty (#46057)
- GH-44502 - [R] Negative fractional dates must be converted to integers by floor, not trunc (#46873)
- GH-44910 - [Swift] Fix IPC stream reader and writer impl (#45029)
- GH-45292 - [Python] test_dtypes hypothesis test fails sporadically (#46029)
- GH-45532 - [C++] RunEndEncodedBuilder should clear dimensions after a Finish() call (#45533)
- GH-45534 - [C++] Test: RunEndEncodeTableColumns should update REE columns’ schema types (#45535)
- GH-45608 - [C++][Flight] Fix compilation for clang (#46264)
- GH-45716 - [R][CI] Refactor skip_on_python_older_than to not initialize reticulate (#46079)
- GH-45735 - [C++] Broken tests for extract_regex compute function (#45900)
- GH-45853 - [C++][Dev] Fix Meson compilation issues in Docker builds (#45858)
- GH-46011 - [C++] Hide DCHECK family from public headers (#46015)
- GH-46025 - [C++] Use ARROW_CUDA_EXPORT instead of ARROW_EXPORT for libarrow_cuda (#46030)
- GH-46052 - [C++][Benchmarking] Don’t build grouper benchmark without ARROW_COMPUTE=ON (#46053)
- GH-46065 - [Release] Don’t use --verify-tag for gh release upload in 02-source.sh (#46066)
- GH-46068 - [Release] Remove needless docs:rc task from 05-binary-upload.sh (#46069)
- GH-46070 - [C++] Remove duplicate storage_type in JsonExtension (#46071)
- GH-46080 - [Python][Docs] Provide guidance for tzdata related issues if installing with pip (#46591)
- GH-46084 - [C++] Always use ARROW_VCPKG to detect vcpkg mode (#46467)
- GH-46090 - [C++] Set default IPC option to enabled in Meson (#46114)
- GH-46094 - [C++][Docs] Add note to RleDecoder::Get’s doc comment (#46874)
- GH-46121 - [Python] Add missing column_index argument to ArrowReaderProperties::read_dictionary’s Cython binding (#46122)
- GH-46127 - [CI][Release] Make 02-source.sh test passable on fork (#46143)
- GH-46146 - [C++] Merge metadata in SchemaBuilder::AddMetadata (#46654)
- GH-46149 - [C++] Opening dataset fails with sshfs-3.7.3 due to F_RDADVISE error (#46346)
- GH-46157 - [C++] Move test utility RunEndEncodeTableColumns that uses REE to test_util_internal on acero instead of common gtest_util (#46161)
- GH-46174 - [Python] Failing tests in python minimal builds (#46175)
- GH-46192 - [C++] Add substrait dep to third party download script (#46191)
- GH-46197 - [C++] Tests use legacy timezones (#46201)
- GH-46214 - [C++] Improve S3 client initialization (#46723)
- GH-46224 - [C++][Acero] Fix the hang in asof join (#46300)
- GH-46231 - [C++][CMake] Fix arrow_bundled_dependencies to be externally accessible by FetchContent (#46232)
- GH-46233 - [C++] Fix missing nested braces in QueuedTask initialization (#46234)
- GH-46236 - [Release][Packaging] Fix dev/release/post-03-binary.sh errors (#46237)
- GH-46238 - [Release][Python] Use array to avoid empty argument in dev/release/post-11-python.sh (#46239)
- GH-46240 - [Release][Packaging] Fix a bug that existing APT repositories’ metadata are lost (#46287)
- GH-46242 - [Release] Don’t show gpg signature when getting release time (#46243)
- GH-46259 - [CI] Remove deprecated flag from mamba info (#46260)
- GH-46262 - [CI][Ruby] Don’t update GCC of MSYS2 (#46278)
- GH-46268 - [C++] Improve ArrayData docstrings (#46271)
- GH-46270 - [C++][Parquet] Clarify GeoStatistics docstring (#46649)
- GH-46284 - [Release][Packaging] Add missing APT metadata for .ddeb (#46288)
- GH-46296 - [Swift] Add support for reading struct (#46302)
- GH-46299 - [C++][Compute] Don’t use static inline const for default options (#46303)
- GH-46304 - [Release][Packaging] Use optimized debug build for .deb (#46392)
- GH-46306 - [C++][Parquet] Should use LoadEnumSafe for geo enum (#46307)
- GH-46314 - [C++][Parquet] Fix valgrind error when collecting parameterized tests for MakeWKBPoint (#46320)
- GH-46326 - [C++][Parquet] Fix stack overflow in rapidjson value comparison to integer (#46327)
- GH-46333 - [CI] Always pass --yes to mamba clean (#46341)
- GH-46333 - [CI] Explicitly pass --yes to mamba clean (#46334)
- GH-46343 - [CI][Python] Remove workaround for gdb packaging issue (#46848)
- GH-46343 - [CI] Avoid installing gdb 16.3 on python 3.10 jobs to fix CI (#46511)
- GH-46344 - [CI][Python] Skip doctest for s3.get_file_info to avoid bucket restrictions (#46345)
- GH-46351 - [Archery][Docs] Fix the cli argument parsing bug in docker subcommand (#46352)
- GH-46355 - [Python] Fix table.to_struct_array with an empty table (#46357)
- GH-46359 - [C++][Thirdparty] Bump Apache ORC to 2.1.2 (#46360)
- GH-46362 - [CGLib][Packaging] Use -fPIE explicitly for g-ir-scanner (#46366)
- GH-46363 - [CI][Packaging] Use mono from community repository on Alpine instead of from testing (#46364)
- GH-46394 - [C++][R] gcc-UBSAN errors on CRAN (#46397)
- GH-46395 - [C++][Statistics] Use EqualOptions for min and max in arrow::ArrayStatistics::Equals() (#46422)
- GH-46407 - [C++] Fix IPC serialization of sliced list arrays (#46408)
- GH-46414 - [C++] Fix GCS filesystem getFileInfo method (#46416)
- GH-46417 - [C++][Parquet] Fix UB in LoadEnumSafe for EdgeInterpolationAlgorithm (#46418)
- GH-46419 - [C++] Remove duplicate declaration and sync arg names on acero test_util_internal functions (#45400)
- GH-46420 - [C++][Dataset] Fix DatasetWriter deadlock on writing batch greater than max_rows_queued (#46139)
- GH-46424 - [C++][Parquet] Fix erroneous unit test skip (#46425)
- GH-46435 - [Parquet][C++] Fix uninitialized value in writer test (#46533)
- GH-46442 - [R] hms::as_hms tests fail on some of our crossbow builds (#46443)
- GH-46456 - [GLib] Add missing since: tag (#46457)
- GH-46478 - [C++] Implement recent JSON changes into Meson configuration (#46479)
- GH-46481 - [C++][Python] Allow nullable schema in FlightInfo (#46489)
- GH-46512 - [CI][C++] Install the llvm package explicitly on MSYS2 (#46525)
- GH-46516 - [CI][Python] Force Cython>3.1.1 for docs builds (#46770)
- GH-46523 - [GLib] Fix compiler warning: use gsize instead of int (#46524)
- GH-46538 - [CI][Packaging][AlmaLinux8] Ensure pip3 (#46539)
- GH-46564 - [C++] Export ARROW_VCPKG in ArrowConfig.cmake (#46565)
- GH-46576 - [C++] Suppress codecvt_utf8 deprecation warning (#46622)
- GH-46589 - [C++] Fix utf8_is_digit to support full Unicode digit range (#46590)
- GH-46593 - [CI][Integration] Disable nested log grouping (#46594)
- GH-46598 - [Dev] Use language name for alias (#46602)
- GH-46599 - [C++][Doc][Parquet] Update supported types documentation (#46620)
- GH-46605 - [CI][Release][C#] Update download URL for dotnet on verification script (#46612)
- GH-46606 - [Python] Do not require numpy when normalizing slice (#46732)
- GH-46609 - [Release][CI] Use System GTest for macos verification (#46823)
- GH-46610 - [CI][Release] Use Python 3.12 on AlmaLinux 8 (#46621)
- GH-46611 - [Python][C++] Allow building float16 arrays without numpy (#46618)
- GH-46623 - [C++][Compute] Fix the failure of large memory test in arrow-compute-row-test (#46635)
- GH-46636 - [R] Fix evaluation of external objects not in global environment in case_when() (#46667)
- GH-46659 - [C++] Fix export of extension arrays with binary view/string view storage (#46660)
- GH-46673 - [CI][R][Docs] Accept empty INSTALL_ARGS again (#46682)
- GH-46674 - [C++] Construct Array from ExtensionType Scalar (#46675)
- GH-46684 - [C++] Fix Meson configuration issue on Windows (#46685)
- GH-46688 - [Ruby] Fix a typo (#46689)
- GH-46691 - [CI][Packaging] Update platform tag on generated wheel name to match newest auditwheel naming (#46705)
- GH-46693 - [CI] Update GitHub hosted runner from deprecated windows-2019 to windows-2022 (#46694)
- GH-46704 - [C++] Fix OSS-Fuzz build failure (#46706)
- GH-46708 - [C++][Gandiva] Added zero return values for castDECIMAL_utf8 (#46709)
- GH-46710 - [C++] Fix ownership and lifetime issues in Dataset Writer (#46711)
- GH-46717 - [R][Docs] Add missing “internal” keywords for internal function (#46722)
- GH-46724 - [C++][Parquet] OSSFuzz: Prevent from Bad-cast in handling statistics (#46725)
- GH-46729 - [Python] Allow constructing InMemoryDataset from RecordBatchReader (#46731)
- GH-46736 - [CI] Disable Parquet in conan-minimum (#46744)
- GH-46761 - [C++] Add executable detection on FreeBSD (#46759)
- GH-46764 - [C++][Gandiva] Fix wrong .bc depends (#46765)
- GH-46777 - [C++] Use SimplifyIsIn only when the value_set of the expression is lower than a threshold (#46859)
- GH-46782 - [Docs] Link to same version of docs from Implementations page
- GH-46805 - [CI][Dev] Fix caching for R hooks in lint job (#46812)
- GH-46809 - [CI][Packaging] Stop trying to add headers from arrow/compu… (#46810)
- GH-46811 - [C++][Python] Fix crash on FileReaderImpl::GetRecordBatchReader (#46931)
- GH-46816 - [Docs] Fix links to Swift docs and source (#46817)
- GH-46827 - [C++] Update Meson Configuration for compute shared lib (#46839)
- GH-46831 - [C++][R] Remove some pending references to CMake < 3.25 (docs + minor CMake references) (#46834)
- GH-46841 - [C++][Gandiva] Fix date trunc edge case (#46842)
- GH-46863 - [CI][C++] Suppress a false positive UBSAN error in AWS SDK for C++ (#46870)
- GH-46871 - [C++][Parquet] Restore implementation of 3 arrow::FileReader::GetRecordBatchReader() functions (#46868)
- GH-46879 - [CI][Packaging][Linux] Don’t check example build with old CMake (#46880)
- GH-46888 - [C++] Remove override of default buildtype in Meson config (#46919)
- GH-46915 - [C++][Compute] Initialize Compute kernels on benchmarks that require extra kernels (#46922)
- GH-46916 - [R] Test for negative fractional dates fails on older R versions due to change in base R as.Date() (#46917)
- GH-46920 - [FlightRPC] Fix Flight SQL ColumnMetadata retrieval (#46921)
- GH-46934 - [C++][Parquet] Trying to fix ub in AttachStatistics (#46940)
- GH-46947 - [R][Packaging] Add src/arrow/flight/sql/odbc to source excludes (#46948)
- GH-46964 - [CI][Packaging][Conan] Ensure using upper case for config suffix (#46967)
- GH-46986 - [CI][C++] Fix a build error with C++20 (#46987)
- GH-46988 - [C++][Parquet] Fix FLBA DecodeArrow multiply overflow (#46991)
- GH-46989 - [CI][R] Use Ubuntu 20.04 instead of OpenSUSE for R 4.1 (#46990)
- GH-46995 - [CI][R][C++] Use system memory allocator in sanitizer jobs (#47007)
- GH-46998 - [C++] Fix mockfs.cc compiling error with C++23 (#46999)
- GH-47015 - [CI][C++] Use mold on conda-cpp to work around issues with GNU ld (#47028)
- GH-47033 - [C++][Compute] Never use custom gtest main with MSVC (#47049)
- GH-47037 - [CI][C++] Fix Fedora 39 CI jobs (#47038)
- GH-47061 - [Release] Fix wrong variable name for signing (#47062)
- GH-47063 - [Release] Define missing RELEASE_TARBALL (#47064)
- GH-47065 - [Release] Fix timeout key in verify_rc.yml (#47066)
- GH-47067 - [Release] Fix wrong GitHub Actions context in verify_rc.yml (#47068)
- GH-47069 - [Release] Add missing “needs: target” (#47070)
- GH-47071 - [Release] Dereference all hard links in source archive (#47072)
- GH-47074 - [Release] Use reproducible mtime for csharp/ in source archive (#47076)
- GH-47078 - [Release] Ensure using cloned apache/arrow for reproducible check (#47079)
- GH-47092 - [Release] Binary verification CI jobs are failing
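GH-44502 in the list above concerns converting negative fractional dates by floor rather than trunc. The following is a minimal Python sketch of why the two differ for dates before the epoch; it is an illustration only, not the R package's code, and the helper name days_to_date is hypothetical.

```python
import math
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def days_to_date(days: float, rounding=math.floor) -> date:
    """Convert fractional days since the Unix epoch to a calendar date.

    Hypothetical helper for illustration; `rounding` maps the fractional
    day count to a whole number of days.
    """
    return EPOCH + timedelta(days=rounding(days))


# -1.5 days is noon on 1969-12-30, so the calendar date is 1969-12-30.
floored = days_to_date(-1.5)                # floor(-1.5) == -2
truncated = days_to_date(-1.5, math.trunc)  # trunc(-1.5) == -1, one day late

# For non-negative inputs both rounding modes agree.
same_for_positive = days_to_date(1.5) == days_to_date(1.5, math.trunc)
```

Truncation rounds toward zero, so it only diverges from floor for negative fractional values, which is why the problem shows up for dates before 1970-01-01.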
New Features and Improvements
- GH-25025 - [C++] Move non core compute kernels into separate shared library (#46261)
- GH-26818 - [C++][Python] Preserve order when writing dataset multi-threaded (#44470)
- GH-35419 - [GLib] Add GArrowFixedShapeTensorDataType (#46305)
- GH-35644 - [MATLAB] Add tests verifying arrow.array.<Type>Array.fromMATLAB() throws an exception if given an array with the wrong type. (#47020)
- GH-36753 - [C++] Properly pretty-print and diff HalfFloatArrays (#46857)
- GH-37027 - [C++] Add float16 kernels to if-else and vector-replace functions (#46446)
- GH-37561 - [Ruby] Add empty chunked array tests for Arrow::Table#each_raw_records (#46862)
- GH-37577 - [MATLAB] Create a superclass for DateType-related MATLAB tests (#46923)
- GH-37677 - [C++][FlightRPC] Allow FlightInfo.schema to be nullable
- GH-37891 - [C++][Parquet] Refine several classes in Parquet encryption (#46202)
- GH-37891 - [C++] Followup Buffer change to use sptr move (#46027)
- GH-38214 - [MATLAB] Add a common arrow.tabular.Tabular MATLAB interface (#47014)
- GH-38369 - [MATLAB] Create utility functions for simplifying management of Proxy instances for Arrays (#46907)
- GH-38903 - [R][Docs] Improve documentation of col_types (#46145)
- GH-38914 - [Python] Add EncryptionConfiguration.uniform_encryption (#46347)
- GH-39294 - [C++][Python] DLPack on Tensor class (#42118)
- GH-39759 - [Docs] Update pydata-sphinx-theme to 0.16.1 (#46943)
- GH-40278 - [C++] Support casting string to duration in CSV converter (#46035)
- GH-40343 - [C++] Move S3FileSystem to the registry (#41559)
- GH-40754 - [Python] Expose tls_ca_file_path to S3FileSystem (#45881)
- GH-41496 - [Python][Azure][Docs] Turn on azure on debian-docs (#46892)
- GH-41672 - [Python][Doc] Clarify docstring of FixedSizeListArray.values that it ignores the offset (#46144)
- GH-41973 - Expose new S3 option check_directory_existence_before_creation - manual rebase (#46619)
- GH-42012 - [Python] Add Schema with_field or set_field method (#46348)
- GH-43041 - [C++][Python] Read/write Parquet BYTE_ARRAY as Large/View types directly (#46532)
- GH-43170 - [Swift] Add StructArray support to ArrowWriter (#43439)
- GH-43623 - [R] remove libarrow backwards compatibility enforcement (#46491)
- GH-43807 - [C++][Python] Add UUID extension type conversion support to/from Parquet (#45866)
- GH-43891 - [C++][Parquet] Faster reading of FIXED_LEN_BYTE_ARRAY data (#46886)
- GH-44208 - [R] Adding test to ensure bit64’s new semantic works with arrow (#46651)
- GH-44435 - [GLib] Add distinct count support to GArrowArrayStatistics (#46894)
- GH-44500 - [Python][Parquet] Map Parquet logical types to Arrow extension types by default (#46772)
- GH-44900 - [Python] Support explicit fsspec+{protocol} and hf:// filesystem URIs (#45089)
- GH-44953 - [R] Add R bindings for new compute functions (#44971)
- GH-45028 - [C++][Compute] Allow cast to reorder struct fields (#45246)
- GH-45083 - [C++] Add HalfFloat kernels for is_nan, is_inf, is_finite, negate, negate_checked, sign (#46866)
- GH-45195 - [C++] Update bundled AWS SDK for C++ to 1.11.587 (#45306)
- GH-45229 - [Python] Migrate from scipy.spmatrix to scipy.sparray (#46423)
- GH-45229 - [Python] skip scipy.sparse roundtrip tests for float16 (#46413)
- GH-45290 - [Docs][Release] Change show_version_warning_banner substitution (#46883)
- GH-45522 - [Parquet][C++] Parquet GEOMETRY and GEOGRAPHY logical type implementations (#45459)
- GH-45531 - [Python] Add the dim_names argument to from_numpy_ndarray (#46170)
- GH-45619 - [Python] Use f-string instead of string.format (#45629)
- GH-45643 - [R] Implement hms functions to create and manipulate time of day variables (#46206)
- GH-45653 - [Python] Scalar subclasses should implement Python protocols (#45818)
- GH-45664 - [C++] Allow LargeString,LargeBinary,FixedSizeBinary,StringView and BinaryView for RecordBatch::MakeStatisticsArray() (#46031)
- GH-45713 - [GLib] Add garrow_chunked_array_(import export)() (#46876)
- GH-45750 - [C++][Python][Parquet] Implement Content-Defined Chunking for the Parquet writer (#45360)
- GH-45794 - [C++] Add array directory to Meson configuration (#45795)
- GH-45796 - [C++] Add integration directory to Meson configuration (#45797)
- GH-45798 - [C++] Add extension directory to Meson (#45799)
- GH-45800 - [C++] Implement util configuration in Meson (#45824)
- GH-45829 - [C++] Add compute directory to Meson configuration (#45830)
- GH-45833 - [C++] Add JSON directory to Meson configuration (#45834)
- GH-45865 - [C++] Create dedicated benchmark dependency in Meson (#45909)
- GH-45908 - [C++][Docs] Rename and expose basic {Array,…}FromJSON helpers as public APIs (#46180)
- GH-45957 - [C++][Python] Expose allow_delayed_open on S3FileSystem (#46078)
- GH-45978 - [C++] Bump bundled mimalloc version (#45979)
- GH-45991 - [C++] Bump bundled nlohmann_json to v3.12.0 (#46112)
- GH-45992 - [C++] Bump bundled utf8proc version to 2.10.0 (#46032)
- GH-46019 - [Python] Raise TypeError on feather read_table if columns is not a Sequence (#46038)
- GH-46054 - [Python][Packaging] Re-enable pandas on Windows free-threaded wheel (#46109)
- GH-46058 - [Python] Run Python in AppVeyor outside of source directory (#46059)
- GH-46087 - [FlightSQL] Allow returning column remarks in FlightSQL’s CommandGetTables (#46110)
- GH-46091 - [C++] Use feature options in Meson configuration (#46204)
- GH-46092 - [C++] Add filesystem related options to Meson (#46101)
- GH-46104 - GH-45937: [C++][Parquet] Logical type definition for variant
- GH-46115 - [C++] Implement compression libraries in Meson (#46358)
- GH-46116 - [C++] Implement IPC directory in Meson (#46117)
- GH-46118 - [C++] Add tensor directory to Meson (#46119)
- GH-46130 - [Python] Remove use_legacy_format in favour of setting IpcWriteOptions (#46131)
- GH-46132 - [C++][Parquet] Remove deprecated parquet APIs from 19.0.0 (#46133)
- GH-46141 - [C++] Add flight directory to Meson configuration (#46142)
- GH-46153 - [C++] Implement acero directory in Meson (#46154)
- GH-46155 - [C++] Implement Tensorflow directory in Meson (#46156)
- GH-46163 - [C++] Add vendored directory to Meson (#46164)
- GH-46189 - [C#] Use pooled buffers in ArrowStreamWriter (#46190)
- GH-46196 - [C++] Remove ARROW_USE_PRECOMPILED_HEADERS and related logic (#46200)
- GH-46198 - [Python] Remove deprecated PyExtensionType (#46199)
- GH-46207 - [C++] Rename arrow::util::StringBuilder and move to internal namespace (#46813)
- GH-46209 - [Documentation][C++][Compute] Add cpp developer documentation for row table (#46210)
- GH-46215 - [C++][Docs] Add README for Meson subprojects directory (#46216)
- GH-46217 - [C++][Parquet] Update the timestamp of parquet::encryption::TwoLevelCacheWithExpiration correctly (#46283)
- GH-46219 - [C++][Parquet] Remove PARQUET_MINIMAL_DEPENDENCY option (#46274)
- GH-46222 - [Python] Allow to specify footer metadata when opening IPC file for writing (#46354)
- GH-46241 - [Release][Packaging] Add support for regenerating metadata of APT repositories (#46277)
- GH-46245 - [Swift] Upgrade FlatBuffers to v25.2.10 (#46246)
- GH-46250 - [Swift] Update swift-tools-version to 5.10 (#46252)
- GH-46285 - [C++] Add support for Decimal32/64 and HalfFloat to run_end_encode/run_end_decode (#46286)
- GH-46289 - [Release][Packaging] Verify APT/Yum repositories keeps working for old versions (#46292)
- GH-46290 - [Swift] Upgrade grpc-swift to 1.25.0 and swift-protobuf to 1.29.0 (#46291)
- GH-46318 - [Docs][C++] Add Extension Array/Type documents (#46319)
- GH-46321 - [C++][Doc] Better explain ArrayData IsValid and GetNullCount (#46332)
- GH-46336 - [Release][Packaging] Add support for Reproducible Builds for source archive (#46342)
- GH-46338 - [C++] Add compile step for Meson in cpp_build.sh (#46339)
- GH-46349 - [Python] Move parquet definitions to pyarrow/includes/libparquet.pxd (#46437)
- GH-46367 - [C++] Prevent Meson from using git info if built as subproject (#46368)
- GH-46373 - [Python] Exercise fallback case on tests for parquet.read_table in case dataset is not available (#46550)
- GH-46376 - [Docs] Replace Xitter link with BlueSky link (#46402)
- GH-46378 - [Docs] Remove references to autotune from the docs (#46379)
- GH-46380 - [GLib] Add GArrowFixedShapeDataType#shape (#46381)
- GH-46386 - [C++] Ensure using our CMake packages not Find*.cmake (#46387)
- GH-46388 - [C++] Check Snappy::snappy{,-static} in FindSnappyAlt.cmake (#46389)
- GH-46396 - [C++][Documentation][Statistics] Revise the documentation to clarify that arrow::ArrayStatistics is ignored during arrow::Array comparisons (#46470)
- GH-46398 - [GLib] Add GArrowFixedShapeTensorDataType#n_dimensions (#46399)
- GH-46400 - [GLib] Add GArrowFixedShapeDataType#permutation (#46401)
- GH-46403 - [C++] Add support for limiting element size when printing data (#46536)
- GH-46433 - [GLib] Add GArrowFixedShapeDataType#dim_names (#46434)
- GH-46439 - [C++] Use result pattern for all FromJSONString Helpers (#46696)
- GH-46439 - [C++] Rename internal Converter class in from_string.cc (#46697)
- GH-46439 - [C++] Remove unneeded namespace prefix in test_util_internal.h (#46695)
- GH-46444 - [Documentation][C++][Acero] Move internal Swiss table doc into public C++ developer doc (#46445)
- GH-46450 - [GLib] Add GArrowFixedShapeDataType#strides (#46451)
- GH-46459 - [C++] Make some arrow/util headers internal (#46721)
- GH-46462 - [C++][Parquet] Expose currently thrown EncodedStatistics when checking is_stats_set (#46463)
- GH-46473 - [C++][Docs] Fix typos in decimal comments (#46474)
- GH-46475 - [Documentation][C++][Compute] Consolidate Acero developer docs (#46476)
- GH-46477 - [C++] Use vendored flatbuffers in Meson configuration (#46484)
- GH-46482 - [CI][Dev] Add shellcheck files without change (#46483)
- GH-46487 - [C++] Refactor lz4 from ExternalProject to FetchContent (#46390)
- GH-46490 - [CI][Dev] Add shellcheck ci/scripts/install_ccache.sh (#46492)
- GH-46494 - [CI][Dev] Add shellcheck files without change (#46495)
- GH-46496 - [CI][Dev] Fix shellcheck SC2086 errors in ci/scripts directory (#46497)
- GH-46499 - [CI][Crossbow][C++] Use apache/arrow for Meson (#46501)
- GH-46500 - [CI][Java] Remove CI scripts for Java (#46502)
- GH-46508 - [C++] Upgrade OpenTelemetry cpp to avoid build error on recent Clang (#46509)
- GH-46520 - [Docs] Fix variety of warnings and errors in the docs build (#46521)
- GH-46522 - [C++][FlightRPC] Add Arrow Flight SQL ODBC driver (#40939)
- GH-46526 - [CI][Dev] Fix shellcheck SC2086 and SC2223 errors ci/scripts directory (#46527)
- GH-46528 - [CI][Dev] Remove “archery lint” (#46686)
- GH-46529 - [C++] Convert static inline type trait functions to constexpr (#46559)
- GH-46537 - [Docs][C++] Add RunEndEncodedArray, FlatArray, and PrimitiveArray API Docs (#46540)
- GH-46544 - [CI][Dev][Python] Use pre-commit for autopep8 (#46552)
- GH-46545 - [CI][Dev][Python] Update pre-commit for cython-lint (#46580)
- GH-46546 - [CI][Dev][Python] Use pre-commit for numpydoc (#46595)
- GH-46547 - [CI][Dev][R] Use pre-commit for lintr (#46581)
- GH-46548 - [CI][Dev][R] Use pre-commit for cpplint (#46549)
- GH-46551 - [C++] Use std::string_view for type schema API (#46553)
- GH-46556 - [GLib] Add GArrowUUIDDataType (#46558)
- GH-46569 - [CI][Integration] Use apache/arrow-js for JS (#46570)
- GH-46572 - [Python] expose filter option to python for join (#46566)
- GH-46585 - [JS][Dev] Remove dependabot configuration for JS (#46586)
- GH-46587 - [CI][JS] Remove JS related test CI (#46588)
- GH-46603 - [JS][Release] Remove JavaScript related release code (#46604)
- GH-46613 - [GLib] Add GArrowBaseListDataType (#46615)
- GH-46632 - [R][Docs] Add docs for arrow::one (#46648)
- GH-46633 - [Docs][C++][Python] Update CombineChunks documentation to specify that binary columns can be combined into multiple chunks (#46638)
- GH-46642 - [Format] Add footnote clarifying REE layout has O(log n) random access (#46643)
- GH-46645 - [CI][Dev][R] Use pre-commit for styler (#46664)
- GH-46652 - [Python][Docs] Update language for row_group_size parameter (#46653)
- GH-46656 - [CI][Dev] Fix shellcheck SC2034 and SC2086 errors in ci/scripts directory (#46657)
- GH-46662 - [CI][Dev] Fix shellcheck SC2148 errors in ci/scripts directory (#46663)
- GH-46665 - [CI][Crossbow][C++] Use apache/arrow for Alpine Linux (#46666)
- GH-46676 - [C++][Python][Parquet] Allow reading Parquet LIST data as LargeList directly (#46678)
- GH-46679 - [C++][Meson] Use WrapDB entry for gflags instead of CMake wrapper (#46680)
- GH-46683 - [C++][Python] Add utf8_zero_fill compute function for sign-aware zero padding (#46815)
- GH-46699 - [CI][Dev] fix shellcheck errors in the ci/scripts/cpp_test.sh (#46700)
- GH-46702 - [JS] Remove js/ (#46703)
- GH-46714 - [C++] Use hidden symbol visibility in Meson configuration (#46715)
- GH-46719 - [R] Add 32 and 64 bit Decimal types (#46720)
- GH-46726 - [CI][Dev] fix shellcheck errors in the ci/scripts/conan_build.sh (#46727)
- GH-46740 - [C++] Update bundled Thrift
- GH-46745 - [C++] Update bundled Boost to 1.88.0 and Apache Thrift to 0.22.0 (#46912)
- GH-46746 - [C++] Assume AWS SDK >= 1.11.0 (#46742)
- GH-46748 - [C++] Initial port on AIX (#46749)
- GH-46757 - [CI][Packaging][Conan] Synchronize upstream conan (#46758)
- GH-46763 - [CI][Dev] fix shellcheck errors in the ci/scripts/ccache_setup.sh (#46766)
- GH-46767 - [C++] Enable EqualOptions::use_atol_ for arrow::Array, arrow::Scalar, arrow::RecordBatch, and arrow::ChunkedArray (#46779)
- GH-46771 - [Python][C++] Implement pa.arange function to generate array sequences (#46778)
- GH-46773 - [GLib] Add GArrowFixedSizeListDataType (#46774)
- GH-46775 - [Docs] Fix navigation issues (#46784)
- GH-46785 - [CI][Dev][C++] Suppress needless outputs of cpplint with pre-commit (#46786)
- GH-46787 - [CI][Integration] Use Node.js 20 (#46790)
- GH-46788 - [C++][Parquet] Enable SIMD for byte stream split with 2 streams (#46789)
- GH-46791 - [C++] Add Status::OrElse, IntoStatus<T> and ToStatus (#46792)
- GH-46794 - [CI][Dev] Fix shellcheck errors in the ci/scripts/csharp_test.sh (#46795)
- GH-46798 - [CI][Dev] Add support for pre-commit 2.17.0 (#46799)
- GH-46801 - [Dev] Remove some leftovers for Java, Go, JS and Swift on some config files (#46802)
- GH-46803 - [Swift] Remove swift implementation from apache/arrow after migration to new repository (#46804)
- GH-46806 - [CI][Dev][Swift] Remove Swift related settings (#46807)
- GH-46820 - [CI][Integration] Use Node.js 20 by default (#46821)
- GH-46833 - [Python] Expose ConfigureManagedIdentityCredential and ConfigureClientSecretCredential to AzureFileSystem on PyArrow (#46837)
- GH-46843 - [C++] Don’t use unity build for bundled AWS SDK for C++ (#46845)
- GH-46846 - [CI][Dev] Fix shellcheck errors in the ci/scripts/install_dask.sh (#46847)
- GH-46854 - [CI][MATLAB][Packaging] Add support for MATLAB R2025a in CI and crossbow packaging workflows (#46855)
- GH-46864 - [C++] Add half-float test for ArrayFromJSONString (#46865)
- GH-46869 - [C++][Parquet] Deprecate arrow::Status parquet::arrow::FileReader::GetRecordBatchReader() (#46932)
- GH-46877 - [MATLAB] Add arrow.tabular.Table.fromRecordBatches static method (#46885)
- GH-46881 - [CI][Dev] Fix shellcheck errors in the ci/scripts/install_gcs_testbench.sh (#46882)
- GH-46895 - [CI][Dev] Fix shellcheck errors in the ci/scripts/install_minio.sh (#46896)
- GH-46899 - [CI][Dev] Fix shellcheck errors in the ci/scripts/install_numba.sh (#46900)
- GH-46909 - [CI][Dev] Fix shellcheck errors in the ci/scripts/install_sccache.sh (#46910)
- GH-46911 - [Packaging] Add support for AlmaLinux 10 (#46933)
- GH-46952 - [Packaging] Drop support for CentOS Stream 8 (#46953)
- GH-46959 - [Python][Packaging] Drop support for manylinux2014 (#46965)
- GH-46968 - [CI][Packaging] Synchronize conan files for 20.0.0 (#46966)
- GH-46974 - [Integration][Archery] Add support for ARROW_JS_ROOT (#46975)
- GH-47025 - [C++][Docs] Increase minimum gcc for building from 7.1 to 9 (#47026)
- GH-47081 - [Release] Revisit reproducible source archive verification
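GH-46642 in the list above adds a footnote noting that the run-end encoded (REE) layout has O(log n) random access. As a rough sketch of where the logarithm comes from, a lookup can binary-search the run ends for the first run that ends after the requested logical index. The code below uses plain Python lists and a hypothetical ree_get helper; it is not an Arrow API.

```python
from bisect import bisect_right


def ree_get(run_ends, values, i):
    """Return the logical element i of a run-end encoded sequence.

    run_ends holds the cumulative (exclusive) end index of each run and
    values holds one value per run. Hypothetical helper, not an Arrow API.
    """
    length = run_ends[-1] if run_ends else 0
    if not 0 <= i < length:
        raise IndexError(i)
    # The first run whose end is strictly greater than i contains index i;
    # bisect_right finds it in O(log n) comparisons.
    return values[bisect_right(run_ends, i)]


# Encodes the logical sequence a a a b b c c c c.
run_ends = [3, 5, 9]
values = ["a", "b", "c"]
decoded = [ree_get(run_ends, values, i) for i in range(9)]
```

A plain array gives O(1) access by offset, so the binary search is the cost REE pays for storing each run only once.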