pyarrow.acero.Declaration

class pyarrow.acero.Declaration(factory_name, ExecNodeOptions options, inputs=None)

Bases: _Weakrefable

Helper class for declaring the nodes of an ExecPlan.

A Declaration represents an unconstructed ExecNode. It may represent more than one node, since its inputs may themselves be Declarations, for example when it is constructed with from_sequence.

Each available ExecNode is registered under a name, the “factory name”. A node is specified by this name together with an instance of the corresponding ExecNodeOptions subclass.

Parameters:
factory_name : str

The ExecNode factory name, such as “table_source”, “filter”, or “project”. See the ExecNodeOptions subclasses for the exact factory names to use.

options : ExecNodeOptions

An instance of the ExecNodeOptions subclass that matches the factory name.

inputs : list of Declaration, optional

Input nodes for this declaration. May be omitted if the node is a source node, or if the declaration is later combined with others using from_sequence.

Returns:
Declaration
__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

from_sequence(decls)

Convenience factory for the common case of a simple sequence of nodes.

to_reader(self, bool use_threads=True)

Run the declaration and return results as a RecordBatchReader.

to_table(self, bool use_threads=True)

Run the declaration and collect the results into a table.

static from_sequence(decls)

Convenience factory for the common case of a simple sequence of nodes.

Each declaration is appended to the inputs of the declaration that follows it in the list, and the final declaration, modified in this way, is returned.

Parameters:
decls : list of Declaration
Returns:
Declaration
to_reader(self, bool use_threads=True)

Run the declaration and return results as a RecordBatchReader.

For details about the parameters, see to_table.

Returns:
pyarrow.RecordBatchReader
to_table(self, bool use_threads=True)

Run the declaration and collect the results into a table.

This method will implicitly add a sink node to the declaration to collect results into a table. It will then create an ExecPlan from the declaration, start the exec plan, block until the plan has finished, and return the created table.

Parameters:
use_threads : bool, default True

If set to False, then all CPU work will be done on the calling thread. I/O tasks will still happen on the I/O executor and may be multi-threaded (but should not use significant CPU resources).

Returns:
pyarrow.Table