C Data Interface
Arrow supports exchanging data within the same process, without copying or serialization, through the Arrow C Data Interface. This works even between different language runtimes.
Java to Python
See Integrating PyArrow with Java to implement Java to Python communication using the C Data Interface.
Java to C++
Example: Share an Int64 array from C++ to Java:
C++ Side
See Building Arrow C++ to build the Arrow C++ libraries:
$ git clone https://github.com/apache/arrow.git
$ cd arrow/cpp
$ mkdir build # from inside the `cpp` subdirectory
$ cd build
$ cmake .. --preset ninja-debug-minimal
$ cmake --build .
$ tree debug/
debug/
├── libarrow.800.0.0.dylib
├── libarrow.800.dylib -> libarrow.800.0.0.dylib
└── libarrow.dylib -> libarrow.800.dylib
Implement a function in CDataCppBridge.h that exports an array via the C Data Interface:
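#include <cstdint>
#include <memory>
#include <arrow/api.h>
#include <arrow/c/bridge.h>

// Builds an Int64 array and exports it through the C Data Interface.
// Both arguments are addresses of structs allocated by the caller
// (here, the Java side): an ArrowSchema and an ArrowArray.
void FillInt64Array(const uintptr_t c_schema_ptr, const uintptr_t c_array_ptr) {
    // The Status returned by each Append call is ignored to keep the example short.
    arrow::Int64Builder builder;
    builder.Append(1);
    builder.Append(2);
    builder.Append(3);
    builder.AppendNull();
    builder.Append(5);
    builder.Append(6);
    builder.Append(7);
    builder.Append(8);
    builder.Append(9);
    builder.Append(10);
    std::shared_ptr<arrow::Array> array = *builder.Finish();

    // Export the data type into the caller's ArrowSchema struct.
    struct ArrowSchema* c_schema = reinterpret_cast<struct ArrowSchema*>(c_schema_ptr);
    auto c_schema_status = arrow::ExportType(*array->type(), c_schema);
    if (!c_schema_status.ok()) c_schema_status.Abort();

    // Export the array data into the caller's ArrowArray struct.
    struct ArrowArray* c_array = reinterpret_cast<struct ArrowArray*>(c_array_ptr);
    auto c_array_status = arrow::ExportArray(*array, c_array);
    if (!c_array_status.ok()) c_array_status.Abort();
}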
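For orientation: per the Arrow columnar format, the array built above consists of a validity bitmap and a buffer of 64-bit values. `ExportArray` hands pointers to these buffers to the consumer rather than copying them. The following plain-Java sketch (no Arrow dependency) reproduces that layout for the same ten values. The class and method names are hypothetical, and writing zero into the null slot is an assumption of this sketch, since the format leaves that value unspecified.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Arrow.
class Int64LayoutSketch {
    // Validity bitmap: bit i (least-significant bit first) is 1 when slot i is non-null.
    static byte[] validityBitmap(Long[] values) {
        byte[] bitmap = new byte[(values.length + 7) / 8];
        for (int i = 0; i < values.length; i++) {
            if (values[i] != null) {
                bitmap[i / 8] |= (byte) (1 << (i % 8));
            }
        }
        return bitmap;
    }

    // Data buffer: one little-endian 64-bit slot per value, including null slots.
    static ByteBuffer dataBuffer(Long[] values) {
        ByteBuffer data = ByteBuffer.allocate(values.length * Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (Long v : values) {
            data.putLong(v == null ? 0L : v); // assumption: zero in null slots
        }
        data.flip();
        return data;
    }

    // Reads the values back the way a consumer would: check the bitmap, then the slot.
    static List<Long> decode(byte[] bitmap, ByteBuffer data, int length) {
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            boolean valid = ((bitmap[i / 8] >> (i % 8)) & 1) == 1;
            out.add(valid ? data.getLong(i * Long.BYTES) : null);
        }
        return out;
    }

    public static void main(String[] args) {
        Long[] values = {1L, 2L, 3L, null, 5L, 6L, 7L, 8L, 9L, 10L};
        byte[] bitmap = validityBitmap(values);
        System.out.println(String.format("validity bitmap: %02x %02x",
                bitmap[0] & 0xFF, bitmap[1] & 0xFF));
        System.out.println(decode(bitmap, dataBuffer(values), values.length));
    }
}
```

Running `main` prints the bitmap bytes `f7 03` (slot 3 is the only cleared bit) followed by the decoded values.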
Java Side
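For this example, we will use JavaCPP to call our C++ function from Java without writing JNI bindings ourselves. The pom.xml below declares the JavaCPP and Arrow dependencies, and the CDataJavaConfig class after it tells JavaCPP which header to wrap and which library to link against.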
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.example</groupId>
    <artifactId>java-cdata-example</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <arrow.version>8.0.0</arrow.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.bytedeco</groupId>
            <artifactId>javacpp</artifactId>
            <version>1.5.7</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-c-data</artifactId>
            <version>${arrow.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-vector</artifactId>
            <version>${arrow.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-memory-core</artifactId>
            <version>${arrow.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-memory-netty</artifactId>
            <version>${arrow.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-format</artifactId>
            <version>${arrow.version}</version>
        </dependency>
    </dependencies>
</project>
import org.bytedeco.javacpp.annotation.Platform;
import org.bytedeco.javacpp.annotation.Properties;
import org.bytedeco.javacpp.tools.InfoMap;
import org.bytedeco.javacpp.tools.InfoMapper;

@Properties(
        target = "CDataJavaToCppExample",
        value = @Platform(
                include = {
                        "CDataCppBridge.h"
                },
                compiler = {"cpp11"},
                linkpath = {"/arrow/cpp/build/debug/"},
                link = {"arrow"}
        )
)
public class CDataJavaConfig implements InfoMapper {

    @Override
    public void map(InfoMap infoMap) {
    }
}
# Compile our Java code
$ javac -cp javacpp-1.5.7.jar CDataJavaConfig.java
# Generate CDataJavaToCppExample.java
$ java -jar javacpp-1.5.7.jar CDataJavaConfig.java
# Generate libjniCDataJavaToCppExample.dylib
$ java -jar javacpp-1.5.7.jar CDataJavaToCppExample.java
# Validate that libjniCDataJavaToCppExample.dylib was created
$ otool -L macosx-x86_64/libjniCDataJavaToCppExample.dylib
macosx-x86_64/libjniCDataJavaToCppExample.dylib:
libjniCDataJavaToCppExample.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libarrow.800.dylib (compatibility version 800.0.0, current version 800.0.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1200.3.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
Java Test
Let’s create a Java class to test our bridge:
import org.apache.arrow.c.ArrowArray;
import org.apache.arrow.c.ArrowSchema;
import org.apache.arrow.c.Data;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.BigIntVector;

public class TestCDataInterface {
    public static void main(String[] args) {
        try (
            BufferAllocator allocator = new RootAllocator();
            ArrowSchema arrowSchema = ArrowSchema.allocateNew(allocator);
            ArrowArray arrowArray = ArrowArray.allocateNew(allocator)
        ) {
            CDataJavaToCppExample.FillInt64Array(
                    arrowSchema.memoryAddress(), arrowArray.memoryAddress());
            try (
                BigIntVector bigIntVector = (BigIntVector) Data.importVector(
                        allocator, arrowArray, arrowSchema, null)
            ) {
                System.out.println("C++-allocated array: " + bigIntVector);
            }
        }
    }
}
C++-allocated array: [1, 2, 3, null, 5, 6, 7, 8, 9, 10]
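The test class relies on nested try-with-resources blocks, and Java closes such resources in reverse order of declaration. The imported vector is therefore closed first and the allocator last. That order is presumably what the example wants, since an allocator is expected to be closed only after everything allocated from it has been released. The sketch below demonstrates the close order with hypothetical stand-in resources that only record their names. It does not use Arrow itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Arrow.
class CloseOrderSketch {
    // Stand-in for an Arrow resource: it only logs its name when closed.
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name);
        }
    }

    // Mirrors the resource structure of the test class and returns the close order.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (
            Tracked allocator = new Tracked("allocator", log);
            Tracked arrowSchema = new Tracked("arrowSchema", log);
            Tracked arrowArray = new Tracked("arrowArray", log)
        ) {
            try (Tracked bigIntVector = new Tracked("bigIntVector", log)) {
                // The imported vector would be used here.
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Running `main` prints `[bigIntVector, arrowArray, arrowSchema, allocator]`.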