Configuration Settings

The following configuration options can be passed to SessionConfig to control various aspects of query execution.

For applications which do not expose SessionConfig, like datafusion-cli, these options may also be set via environment variables. To construct a session with options from the environment, use SessionConfig::from_env. The name of the environment variable is the option’s key, converted to uppercase with periods replaced by underscores. For example, to configure datafusion.execution.batch_size you would set the DATAFUSION_EXECUTION_BATCH_SIZE environment variable. Values are parsed according to the same rules used in casts from Utf8. If the value in the environment variable cannot be cast to the type of the configuration option, the default value is used instead and a warning is emitted. Environment variables are read during SessionConfig initialisation, so they must be set beforehand and will not affect running sessions.
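The naming rule and the fall-back-to-default behaviour described above can be sketched in plain Rust. This is only an illustration under stated assumptions, not DataFusion's implementation: `env_var_name`, `parse_u64_or_default`, and `u64_from_env` are hypothetical helpers, and the standard library's `str::parse` stands in for the Utf8 cast rules.

```rust
use std::env;

/// Derive the environment variable name for a configuration key:
/// uppercase the key and replace periods with underscores.
/// (Hypothetical helper, not part of DataFusion.)
fn env_var_name(key: &str) -> String {
    key.to_uppercase().replace('.', "_")
}

/// Parse an optional raw value as a UInt64 option.
/// Returns `default` when the variable is unset, and also when the value
/// cannot be parsed, in which case a warning is printed to stderr.
/// Assumption: `str::parse` approximates the cast from Utf8.
fn parse_u64_or_default(name: &str, raw: Option<&str>, default: u64) -> u64 {
    match raw {
        None => default,
        Some(text) => text.parse::<u64>().unwrap_or_else(|_| {
            eprintln!("warning: cannot parse {name}={text:?} as UInt64, using default {default}");
            default
        }),
    }
}

/// Look up a UInt64 option in the process environment by its configuration key.
fn u64_from_env(key: &str, default: u64) -> u64 {
    let name = env_var_name(key);
    let raw = env::var(&name).ok();
    parse_u64_or_default(&name, raw.as_deref(), default)
}
```

Calling `u64_from_env("datafusion.execution.batch_size", 8192)` would read DATAFUSION_EXECUTION_BATCH_SIZE and return 8192 if the variable is unset or not a valid unsigned integer.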

| key | type | default | description |
| --- | --- | --- | --- |
| datafusion.execution.batch_size | UInt64 | 8192 | Default batch size when creating new batches. This is especially useful for batches buffered in memory, since creating tiny batches would result in too much metadata memory consumption. |
| datafusion.execution.coalesce_batches | Boolean | true | When set to true, record batches will be examined between each operator and small batches will be coalesced into larger batches. This is helpful when there are highly selective filters or joins that could produce tiny output batches. The target batch size is determined by the configuration setting ‘datafusion.execution.coalesce_target_batch_size’. |
| datafusion.execution.coalesce_target_batch_size | UInt64 | 4096 | Target batch size when coalescing batches. Used in conjunction with the configuration setting ‘datafusion.execution.coalesce_batches’. |
| datafusion.optimizer.filter_null_join_keys | Boolean | false | When set to true, the optimizer will insert filters before a join between a nullable and non-nullable column to filter out nulls on the nullable side. This filter can add additional overhead when the file format does not fully support predicate push down. |
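To make the interaction between datafusion.execution.coalesce_batches and datafusion.execution.coalesce_target_batch_size more concrete, here is a simplified sketch of batch coalescing. It is an assumption-laden illustration rather than DataFusion's actual operator: the `Batch` alias (a plain vector of row values) stands in for a record batch, and the `coalesce` function is hypothetical.

```rust
/// Stand-in for a record batch: just a vector of row values.
/// (Assumption for illustration; real batches are columnar.)
type Batch = Vec<i64>;

/// Combine small batches until each output batch holds at least `target` rows.
/// A batch that already meets the target while nothing is buffered is passed
/// through unchanged; any leftover rows are emitted as a final, smaller batch.
fn coalesce(batches: Vec<Batch>, target: usize) -> Vec<Batch> {
    let mut output = Vec::new();
    let mut buffer: Batch = Vec::new();

    for batch in batches {
        if buffer.is_empty() && batch.len() >= target {
            // Large enough already and nothing pending: no copy needed.
            output.push(batch);
            continue;
        }
        // Otherwise accumulate rows until the target size is reached.
        buffer.extend(batch);
        if buffer.len() >= target {
            output.push(std::mem::take(&mut buffer));
        }
    }

    // Flush whatever is left at the end of the input.
    if !buffer.is_empty() {
        output.push(buffer);
    }
    output
}
```

With a target of 4, the input batches `[1]`, `[2, 3]`, `[4, 5, 6, 7]`, `[8]` come out as two batches, `[1, 2, 3, 4, 5, 6, 7]` and `[8]`: the first three are merged because rows were already buffered when the larger batch arrived, and the trailing row is flushed on its own.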