The Gandiva Expression Compiler

Gandiva is a runtime expression compiler that uses LLVM to generate efficient native code for computations on Arrow record batches. Gandiva handles only projections and filters; for other transformations, see Compute Functions.

Gandiva was designed to take advantage of the Arrow memory format and modern hardware. Because Arrow arrays store values and validity bitmaps in separate buffers, values and their null status can often be processed independently, which allows for better instruction pipelining. Compiling expressions with LLVM lets the generated code be optimized for the local runtime environment and hardware, including any available SIMD instructions. To reduce optimization overhead, many Gandiva functions are precompiled into LLVM IR (intermediate representation).
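The benefit of separate value and validity buffers can be illustrated with a minimal sketch in plain C++. It is not Gandiva code and does not use the Arrow library: `Int32Column`, `IsValid`, and `Add` are hypothetical names chosen for illustration, and the bitmap is assumed to use least-significant-bit ordering as in the Arrow format. The point is that the value loop contains no null checks, so a compiler is free to vectorize it, while the result's validity is derived in a separate pass.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal column: contiguous values plus a validity bitmap.
// Bit i (least-significant bit first) is set when slot i is non-null.
struct Int32Column {
  std::vector<int32_t> values;
  std::vector<uint8_t> validity;
};

// Returns true when slot i is non-null.
inline bool IsValid(const Int32Column& col, std::size_t i) {
  return ((col.validity[i / 8] >> (i % 8)) & 1) != 0;
}

// Adds two columns of equal length (assumes the sums do not overflow).
inline Int32Column Add(const Int32Column& a, const Int32Column& b) {
  Int32Column out;
  out.values.resize(a.values.size());
  out.validity.resize(a.validity.size());

  // Pass 1: values only. Null slots are computed as well; their results
  // are simply ignored, which keeps the loop free of branches.
  for (std::size_t i = 0; i < a.values.size(); ++i) {
    out.values[i] = a.values[i] + b.values[i];
  }

  // Pass 2: validity only. A result slot is valid when both inputs are,
  // so the bitmaps are combined one byte (eight slots) at a time.
  for (std::size_t i = 0; i < a.validity.size(); ++i) {
    out.validity[i] = static_cast<uint8_t>(a.validity[i] & b.validity[i]);
  }
  return out;
}
```

With inputs whose bitmaps are `0x0B` (slot 2 null) and `0x0D` (slot 1 null), the result bitmap is `0x09`, so only slots 0 and 3 are valid in the output.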

Expression, Projector and Filter

To use Gandiva, you construct expression trees with TreeExprBuilder, which creates function nodes, if-else logic, and boolean expressions. You then evaluate these expressions with the Projector or Filter execution kernels. See Gandiva Expression, Projector, and Filter for more details.
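To make the roles of the expression tree, the projector, and the filter concrete, here is a toy model in plain C++. It is a conceptual sketch only and is not Gandiva's API: every name in it (`Node`, `MakeField`, `MakeLiteral`, `MakeFunction`, `MakeIf`, `Project`, `FilterIndices`) is hypothetical, it works on a single int32 column without nulls, and it interprets the tree row by row, whereas Gandiva compiles the tree to native code. Boolean results are represented as 0 or 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy expression tree node, evaluated against one row value x.
struct Node {
  virtual ~Node() = default;
  virtual int32_t Eval(int32_t x) const = 0;
};
using NodePtr = std::shared_ptr<Node>;
using BinaryFn = std::function<int32_t(int32_t, int32_t)>;

struct FieldNode : Node {  // reads the input column
  int32_t Eval(int32_t x) const override { return x; }
};
struct LiteralNode : Node {  // constant value
  explicit LiteralNode(int32_t v) : value(v) {}
  int32_t Eval(int32_t) const override { return value; }
  int32_t value;
};
struct FunctionNode : Node {  // binary function over two child nodes
  FunctionNode(BinaryFn f, NodePtr l, NodePtr r)
      : fn(std::move(f)), lhs(std::move(l)), rhs(std::move(r)) {}
  int32_t Eval(int32_t x) const override {
    return fn(lhs->Eval(x), rhs->Eval(x));
  }
  BinaryFn fn;
  NodePtr lhs, rhs;
};
struct IfNode : Node {  // if-else: nonzero condition selects then_node
  IfNode(NodePtr c, NodePtr t, NodePtr e)
      : cond(std::move(c)), then_node(std::move(t)), else_node(std::move(e)) {}
  int32_t Eval(int32_t x) const override {
    return cond->Eval(x) != 0 ? then_node->Eval(x) : else_node->Eval(x);
  }
  NodePtr cond, then_node, else_node;
};

inline NodePtr MakeField() { return std::make_shared<FieldNode>(); }
inline NodePtr MakeLiteral(int32_t v) { return std::make_shared<LiteralNode>(v); }
inline NodePtr MakeFunction(BinaryFn f, NodePtr l, NodePtr r) {
  return std::make_shared<FunctionNode>(std::move(f), std::move(l), std::move(r));
}
inline NodePtr MakeIf(NodePtr c, NodePtr t, NodePtr e) {
  return std::make_shared<IfNode>(std::move(c), std::move(t), std::move(e));
}

// Projection: evaluates the expression for every row and returns a new column.
inline std::vector<int32_t> Project(const NodePtr& expr,
                                    const std::vector<int32_t>& column) {
  std::vector<int32_t> out;
  out.reserve(column.size());
  for (int32_t x : column) out.push_back(expr->Eval(x));
  return out;
}

// Filter: returns the indices of the rows for which the condition is nonzero.
inline std::vector<std::size_t> FilterIndices(const NodePtr& condition,
                                              const std::vector<int32_t>& column) {
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < column.size(); ++i) {
    if (condition->Eval(column[i]) != 0) selected.push_back(i);
  }
  return selected;
}
```

In this model, the expression "if x > 3 then x * 2 else x + 1" projected over the column {1, 5, 3, 8} yields {2, 10, 4, 16}, while filtering the same column on "x > 3" yields the row indices {1, 3}. The distinction mirrors the two kernels described above: a projection produces new values for every row, and a filter produces a selection of rows.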

External Functions Development

Gandiva supports integrating external functions, both C functions and IR functions, which broadens the set of functions available within Gandiva expressions. Developers can write and register their own external functions to tailor Gandiva to their needs. See Gandiva External Functions Development Guide for more details.