Data Types and Schemas

Factory Functions

Use these functions to create Arrow data types and schemas.

null()

Create instance of null type.

bool_()

Create instance of boolean type.

int8()

Create instance of signed int8 type.

int16()

Create instance of signed int16 type.

int32()

Create instance of signed int32 type.

int64()

Create instance of signed int64 type.

uint8()

Create instance of unsigned int8 type.

uint16()

Create instance of unsigned int16 type.

uint32()

Create instance of unsigned int32 type.

uint64()

Create instance of unsigned int64 type.

float16()

Create half-precision floating point type.

float32()

Create single-precision floating point type.

float64()

Create double-precision floating point type.

time32(unit)

Create instance of 32-bit time (time of day) type with unit resolution.

time64(unit)

Create instance of 64-bit time (time of day) type with unit resolution.

timestamp(unit[, tz])

Create instance of timestamp type with resolution and optional time zone.

date32()

Create instance of 32-bit date (days since UNIX epoch 1970-01-01).

date64()

Create instance of 64-bit date (milliseconds since UNIX epoch 1970-01-01).

duration(unit)

Create instance of a duration type with unit resolution.

month_day_nano_interval()

Create instance of an interval type representing months, days and nanoseconds between two dates.

binary(int length=-1)

Create variable-length binary type, or fixed-size binary type if length is given.

string()

Create UTF8 variable-length string type.

utf8()

Alias for string().

large_binary()

Create large variable-length binary type.

large_string()

Create large UTF8 variable-length string type.

large_utf8()

Alias for large_string().

decimal128(int precision, int scale=0)

Create decimal type with precision and scale and 128-bit width.

list_(value_type, int list_size=-1)

Create ListType instance from child data type or field, or FixedSizeListType instance if list_size is given.

large_list(value_type)

Create LargeListType instance from child data type or field.

map_(key_type, item_type[, keys_sorted])

Create MapType instance from key and item data types or fields.

struct(fields)

Create StructType instance from fields.

dictionary(index_type, value_type, ...)

Create dictionary (categorical, or simply encoded) type.

field(name, type, bool nullable=True[, metadata])

Create a pyarrow.Field instance.

schema(fields[, metadata])

Construct pyarrow.Schema from collection of fields.

from_numpy_dtype(dtype)

Convert NumPy dtype to pyarrow.DataType.

Utility Functions

unify_schemas(schemas)

Unify schemas by merging fields by name.

Type Classes

Do not instantiate these classes directly. Instead, call one of the factory functions above.

DataType()

Base class of all Arrow data types.

DictionaryType

Concrete class for dictionary data types.

ListType

Concrete class for list data types.

MapType

Concrete class for map data types.

StructType

Concrete class for struct data types.

UnionType

Base class for union data types.

TimestampType

Concrete class for timestamp data types.

Time32Type

Concrete class for time32 data types.

Time64Type

Concrete class for time64 data types.

FixedSizeBinaryType

Concrete class for fixed-size binary data types.

Decimal128Type

Concrete class for decimal128 data types.

Field()

A named field, with a data type, nullability, and optional metadata.

Schema()

A named collection of types, a.k.a. a schema.

The following classes and functions are specific to extension types.

ExtensionType(DataType storage_type, ...)

Concrete base class for Python-defined extension types.

PyExtensionType(DataType storage_type)

Concrete base class for Python-defined extension types based on pickle for (de)serialization.

register_extension_type(ext_type)

Register a Python extension type.

unregister_extension_type(type_name)

Unregister a Python extension type.

Type Checking

These functions are predicates to check whether a DataType instance represents a given data type (such as int32) or general category (such as “is a signed integer”).

is_boolean(t)

Return True if value is an instance of a boolean type.

is_integer(t)

Return True if value is an instance of any integer type.

is_signed_integer(t)

Return True if value is an instance of any signed integer type.

is_unsigned_integer(t)

Return True if value is an instance of any unsigned integer type.

is_int8(t)

Return True if value is an instance of an int8 type.

is_int16(t)

Return True if value is an instance of an int16 type.

is_int32(t)

Return True if value is an instance of an int32 type.

is_int64(t)

Return True if value is an instance of an int64 type.

is_uint8(t)

Return True if value is an instance of a uint8 type.

is_uint16(t)

Return True if value is an instance of a uint16 type.

is_uint32(t)

Return True if value is an instance of a uint32 type.

is_uint64(t)

Return True if value is an instance of a uint64 type.

is_floating(t)

Return True if value is an instance of a floating point numeric type.

is_float16(t)

Return True if value is an instance of a float16 (half-precision) type.

is_float32(t)

Return True if value is an instance of a float32 (single precision) type.

is_float64(t)

Return True if value is an instance of a float64 (double precision) type.

is_decimal(t)

Return True if value is an instance of a decimal type.

is_list(t)

Return True if value is an instance of a list type.

is_large_list(t)

Return True if value is an instance of a large list type.

is_struct(t)

Return True if value is an instance of a struct type.

is_union(t)

Return True if value is an instance of a union type.

is_nested(t)

Return True if value is an instance of a nested type.

is_temporal(t)

Return True if value is an instance of date, time, timestamp or duration.

is_timestamp(t)

Return True if value is an instance of a timestamp type.

is_date(t)

Return True if value is an instance of a date type.

is_date32(t)

Return True if value is an instance of a date32 (days) type.

is_date64(t)

Return True if value is an instance of a date64 (milliseconds) type.

is_time(t)

Return True if value is an instance of a time type.

is_time32(t)

Return True if value is an instance of a time32 type.

is_time64(t)

Return True if value is an instance of a time64 type.

is_null(t)

Return True if value is an instance of a null type.

is_binary(t)

Return True if value is an instance of a variable-length binary type.

is_unicode(t)

Alias for is_string.

is_string(t)

Return True if value is an instance of string (utf8 unicode) type.

is_large_binary(t)

Return True if value is an instance of a large variable-length binary type.

is_large_unicode(t)

Alias for is_large_string.

is_large_string(t)

Return True if value is an instance of large string (utf8 unicode) type.

is_fixed_size_binary(t)

Return True if value is an instance of a fixed size binary type.

is_map(t)

Return True if value is an instance of a map logical type.

is_dictionary(t)

Return True if value is an instance of a dictionary-encoded type.