pyarrow.Column

class pyarrow.Column

Bases: object

Named vector of elements of equal type.

Warning

Do not call this class’s constructor directly.

__init__()

Initialize self. See help(type(self)) for accurate signature.

Methods

cast(self, target_type, bool safe=True) – Cast column values to another data type
dictionary_encode(self) – Compute dictionary-encoded representation of array
equals(self, Column other) – Check if contents of two columns are equal
flatten(self, MemoryPool memory_pool=None) – Flatten this Column.
from_array(*args)
length(self)
to_pandas(self, …) – Convert this Column to a pandas.Series
to_pylist(self) – Convert to a list of native Python objects.
unique(self) – Compute distinct elements in array

Attributes

data – The underlying data
field
name – Label of the column
null_count – Number of null entries
shape – Dimensions of this column
type – Type information for this column
cast(self, target_type, bool safe=True)

Cast column values to another data type

Parameters:
  • target_type (DataType) – Type to cast to
  • safe (boolean, default True) – Check for overflows or other unsafe conversions
Returns: casted (Column)

data

The underlying data

Returns: pyarrow.ChunkedArray
dictionary_encode(self)

Compute dictionary-encoded representation of array

Returns: pyarrow.Column – Same chunking as the input; all chunks share a common dictionary.
equals(self, Column other)

Check if contents of two columns are equal

Parameters: other (pyarrow.Column) – Column to compare against
Returns: are_equal (boolean)
field
flatten(self, MemoryPool memory_pool=None)

Flatten this Column. If it has a struct type, the column is flattened into one column per struct field.

Parameters: memory_pool (MemoryPool, default None) – Pool to use for memory allocations, if any are required; the default pool is used when None
Returns: result (List[Column])
static from_array(*args)
length(self)
name

Label of the column

Returns: str
null_count

Number of null entries

Returns: int
shape

Dimensions of this column

Returns: (int,)
to_pandas(self, bool strings_to_categorical=False, bool zero_copy_only=False, bool integer_object_nulls=False, bool date_as_object=False)

Convert this Column to a pandas.Series

Parameters:
  • strings_to_categorical (boolean, default False) – Encode string (UTF8) and binary types to pandas.Categorical
  • zero_copy_only (boolean, default False) – Raise an ArrowException if this function call would require copying the underlying data
  • integer_object_nulls (boolean, default False) – Cast integers with nulls to objects
  • date_as_object (boolean, default False) – Cast dates to objects
Returns: pandas.Series

to_pylist(self)

Convert to a list of native Python objects.

type

Type information for this column

Returns: pyarrow.DataType
unique(self)

Compute distinct elements in array

Returns: pyarrow.Array