pyarrow.PyExtensionType
- class pyarrow.PyExtensionType(DataType storage_type)
Bases: ExtensionType
Concrete base class for Python-defined extension types based on pickle for (de)serialization.
- Parameters:
  - storage_type : DataType
    The storage type for which the extension is built.
Examples
Define a UuidType extension type subclassing PyExtensionType:
>>> import pyarrow as pa
>>> class UuidType(pa.PyExtensionType):
...     def __init__(self):
...         pa.PyExtensionType.__init__(self, pa.binary(16))
...     def __reduce__(self):
...         return UuidType, ()
...
Create an instance of UuidType extension type:
>>> uuid_type = UuidType()
>>> uuid_type
UuidType(FixedSizeBinaryType(fixed_size_binary[16]))
Inspect the extension type:
>>> uuid_type.extension_name
'arrow.py_extension_type'
>>> uuid_type.storage_type
FixedSizeBinaryType(fixed_size_binary[16])
Wrap an array as an extension array:
>>> import uuid
>>> storage_array = pa.array([uuid.uuid4().bytes for _ in range(4)],
...                          pa.binary(16))
>>> uuid_type.wrap_array(storage_array)
<pyarrow.lib.ExtensionArray object at ...>
[
  ...
]
Or do the same by creating an ExtensionArray from the storage array:
>>> pa.ExtensionArray.from_storage(uuid_type,
...                                storage_array)
<pyarrow.lib.ExtensionArray object at ...>
[
  ...
]
- __init__(*args, **kwargs)
Methods
- __init__(*args, **kwargs)
- equals(self, other, *[, check_metadata]): Return true if type is equivalent to passed value.
- field(self, i)
- to_pandas_dtype(self): Return the equivalent NumPy / Pandas dtype.
- wrap_array(self, storage): Wrap the given storage array as an extension array.
Attributes
- bit_width: Bit width for fixed width type.
- extension_name: The extension type name.
- id
- num_buffers: Number of data buffers required to construct Array type excluding children.
- num_fields: The number of child fields.
- storage_type: The underlying storage type.
- bit_width
Bit width for fixed width type.
Examples
>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().bit_width
64
- equals(self, other, *, check_metadata=False)
Return true if type is equivalent to passed value.
- Parameters:
  - other
  - check_metadata : bool, default False
- Returns:
  - is_equal : bool
Examples
>>> import pyarrow as pa
>>> pa.int64().equals(pa.string())
False
>>> pa.int64().equals(pa.int64())
True
- extension_name
The extension type name.
- id
- num_buffers
Number of data buffers required to construct Array type excluding children.
Examples
>>> import pyarrow as pa
>>> pa.int64().num_buffers
2
>>> pa.string().num_buffers
3
- num_fields
The number of child fields.
Examples
>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().num_fields
0
>>> pa.list_(pa.string())
ListType(list<item: string>)
>>> pa.list_(pa.string()).num_fields
1
>>> struct = pa.struct({'x': pa.int32(), 'y': pa.string()})
>>> struct.num_fields
2
- storage_type
The underlying storage type.
- to_pandas_dtype(self)
Return the equivalent NumPy / Pandas dtype.
Examples
>>> import pyarrow as pa
>>> pa.int64().to_pandas_dtype()
<class 'numpy.int64'>
- wrap_array(self, storage)
Wrap the given storage array as an extension array.
- Parameters:
  - storage : Array or ChunkedArray
- Returns:
  - array : Array or ChunkedArray
    Extension array wrapping the storage array