Implementation Status
The following tables summarize the features available in the various official Arrow libraries. All libraries currently follow version 1.0.0 of the Arrow format; see versioning for details. Unless otherwise stated, the Python, R, Ruby and C/GLib libraries follow the C++ Arrow library.
Data Types
| Data type (primitive) | C++ | Java | Go | JavaScript | C# | Rust | Julia |
|---|---|---|---|---|---|---|---|
| Null | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Boolean | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Int8/16/32/64 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| UInt8/16/32/64 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Float16 | ✓ | ✓ | | | | | |
| Float32/64 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Decimal128 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Decimal256 | ✓ | ✓ | ✓ | ✓ | | | |
| Date32/64 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Time32/64 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Timestamp | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Duration | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Interval | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Fixed Size Binary | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Binary | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Large Binary | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Utf8 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Large Utf8 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |

| Data type (nested) | C++ | Java | Go | JavaScript | C# | Rust | Julia |
|---|---|---|---|---|---|---|---|
| Fixed Size List | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| List | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Large List | ✓ | ✓ | ✓ | ✓ | | | |
| Struct | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Map | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Dense Union | ✓ | ✓ | ✓ | | | | |
| Sparse Union | ✓ | ✓ | ✓ | | | | |

| Data type (special) | C++ | Java | Go | JavaScript | C# | Rust | Julia |
|---|---|---|---|---|---|---|---|
| Dictionary | ✓ | ✓ (1) | ✓ | ✓ (1) | ✓ (1) | ✓ (1) | ✓ |
| Extension | ✓ | ✓ | ✓ | ✓ | | | |

Notes:
(1) Nested dictionaries not supported
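
For readers unfamiliar with the Dictionary type listed above, the following is a minimal pure-Python sketch of the general idea of dictionary encoding: repeated values are stored once in a dictionary and the column itself holds integer indices into it. It does not use any Arrow library, and the helper names `dictionary_encode` and `dictionary_decode` are hypothetical, chosen only for this illustration.

```python
def dictionary_encode(values):
    """Return (dictionary, indices) for a list of hashable values."""
    dictionary = []   # distinct values, in first-seen order
    positions = {}    # value -> its index in `dictionary`
    indices = []      # one index per input value
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original values from a dictionary and its indices."""
    return [dictionary[i] for i in indices]


column = ["a", "b", "a", "c", "b"]
dictionary, indices = dictionary_encode(column)
```

Decoding with `dictionary_decode(dictionary, indices)` gives back the original column, which is the property any dictionary-encoded representation has to preserve.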
See also
The Arrow Columnar Format specification.
IPC Format
| IPC Feature | C++ | Java | Go | JavaScript | C# | Rust | Julia |
|---|---|---|---|---|---|---|---|
| Arrow stream format | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Arrow file format | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Record batches | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Dictionaries | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Replacement dictionaries | ✓ | ✓ | ✓ | ✓ | | | |
| Delta dictionaries | ✓ (1) | ✓ (1) | ✓ | ✓ | | | |
| Tensors | ✓ | | | | | | |
| Sparse tensors | ✓ | | | | | | |
| Buffer compression | ✓ | ✓ (3) | ✓ | ✓ | | | |
| Endianness conversion | ✓ (2) | | | | | | |
| Custom schema metadata | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |

Notes:
(1) Delta dictionaries not supported on nested dictionaries
(2) Data with non-native endianness can be byte-swapped automatically when reading.
(3) LZ4 Codec currently is quite inefficient. ARROW-11901 tracks improving performance.
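
To make note (2) concrete, here is a rough standard-library Python sketch of why byte order matters when reading a buffer of 32-bit integers. It is only an illustration of the concept, not how any Arrow library performs the conversion; the helper name `read_int32_buffer` is hypothetical.

```python
import struct


def read_int32_buffer(raw, byteorder):
    """Decode `raw` as int32 values stored in the given byte order.

    `byteorder` is "little" or "big".
    """
    prefix = "<" if byteorder == "little" else ">"
    count = len(raw) // 4
    return list(struct.unpack(f"{prefix}{count}i", raw))


# Three int32 values as a big-endian producer would write them.
big_endian_bytes = struct.pack(">3i", 1, 2, 3)

# Reading with the wrong byte order silently yields different numbers.
wrong = read_int32_buffer(big_endian_bytes, "little")

# Honouring the producer's byte order recovers the original values.
right = read_int32_buffer(big_endian_bytes, "big")
```

A reader that byte-swaps automatically spares the caller from having to make this choice by hand.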
See also
The Serialization and Interprocess Communication (IPC) specification.
Flight RPC
Note
Flight RPC is still experimental.
| Flight RPC Transport | C++ | Java | Go | JavaScript | C# | Rust | Julia |
|---|---|---|---|---|---|---|---|
| gRPC transport (grpc:, grpc+tcp:) | ✓ | ✓ | ✓ | ✓ | | | |
| gRPC domain socket transport (grpc+unix:) | ✓ | ✓ | ✓ | ✓ | | | |
| gRPC + TLS transport (grpc+tls:) | ✓ | ✓ | ✓ | ✓ | | | |
| UCX transport (ucx:) | ✓ | | | | | | |

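
The URI schemes in the transport table identify which transport a Flight location uses. The following sketch shows one way to map a location string to its transport using only the Python standard library. The `TRANSPORTS` table and the `transport_for` function are hypothetical names for illustration and are not part of any Arrow API; the example host names and ports are likewise made up.

```python
from urllib.parse import urlsplit

# Scheme -> transport name, taken from the table of Flight RPC transports.
TRANSPORTS = {
    "grpc": "gRPC transport",
    "grpc+tcp": "gRPC transport",
    "grpc+unix": "gRPC domain socket transport",
    "grpc+tls": "gRPC + TLS transport",
    "ucx": "UCX transport",
}


def transport_for(location):
    """Return the transport name implied by a Flight location URI."""
    scheme = urlsplit(location).scheme
    try:
        return TRANSPORTS[scheme]
    except KeyError:
        raise ValueError(f"unsupported Flight URI scheme: {scheme!r}") from None


tls_transport = transport_for("grpc+tls://example.com:443")
unix_transport = transport_for("grpc+unix:///tmp/flight.sock")
```

Whether a given library accepts a scheme still depends on the support matrix above; the lookup only classifies the URI.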
Supported features in the gRPC transport:
| Flight RPC Feature | C++ | Java | Go | JavaScript | C# | Rust | Julia |
|---|---|---|---|---|---|---|---|
| All RPC methods | ✓ | ✓ | ✓ | × (1) | ✓ | | |
| Authentication handlers | ✓ | ✓ | ✓ | ✓ (2) | ✓ | | |
| Call timeouts | ✓ | ✓ | ✓ | ✓ | | | |
| Call cancellation | ✓ | ✓ | ✓ | ✓ | | | |
| Concurrent client calls (3) | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Custom middleware | ✓ | ✓ | ✓ | ✓ | | | |
| RPC error codes | ✓ | ✓ | ✓ | ✓ | ✓ | | |

Supported features in the UCX transport:
| Flight RPC Feature | C++ | Java | Go | JavaScript | C# | Rust | Julia |
|---|---|---|---|---|---|---|---|
| All RPC methods | × (4) | | | | | | |
| Authentication handlers | | | | | | | |
| Call timeouts | | | | | | | |
| Call cancellation | | | | | | | |
| Concurrent client calls | ✓ (5) | | | | | | |
| Custom middleware | | | | | | | |
| RPC error codes | ✓ | | | | | | |

Notes:
(1) No support for handshake or DoExchange.
(2) Support using AspNetCore authentication handlers.
(3) Whether a single client can support multiple concurrent calls.
(4) Only support for DoExchange, DoGet, DoPut, and GetFlightInfo.
(5) Each concurrent call is a separate connection to the server (unlike gRPC where concurrent calls are multiplexed over a single connection). This will generally provide better throughput but consumes more resources both on the server and the client.
See also
The Arrow Flight RPC specification.
C Data Interface
| Feature | C++ | Python | R | Rust | Go | Java | C/GLib | Ruby |
|---|---|---|---|---|---|---|---|---|
| Schema export | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Array export | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Schema import | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Array import | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |

See also
The C Data Interface specification.
C Stream Interface (experimental)
| Feature | C++ | Python | Go | C/GLib | Ruby |
|---|---|---|---|---|---|
| Stream export | ✓ | ✓ | ✓ | ✓ | |
| Stream import | ✓ | ✓ | ✓ | ✓ | ✓ |

See also
The C Stream Interface specification.
Third-Party Data Formats
| Format | C++ | Java | Go | JavaScript | C# | Rust | Julia |
|---|---|---|---|---|---|---|---|
| Avro | R | | | | | | |
| CSV | R/W | R/W | R/W | R/W | | | |
| ORC | R/W | R (2) | | | | | |
| Parquet | R/W | R (3) | R/W | R/W (1) | | | |

Notes:
R = Read supported
W = Write supported
(1) Nested read/write not supported.
(2) Through JNI bindings. (Provided by org.apache.arrow.orc:arrow-orc)
(3) Through JNI bindings to Arrow C++ Datasets. (Provided by org.apache.arrow:arrow-dataset)