Schema/Data Type Objects#
- class ExtensionAccessor(schema)#
Accessor for extension type parameters
- property metadata: bytes | None#
Extension metadata for this extension type if present
- property name: str#
Extension name for this extension type
- property storage#
Storage type for this extension type
- class Schema(obj, *, name=None, nullable=None, metadata=None, fields=None, **params)#
Create a nanoarrow Schema
The Schema is nanoarrow’s high-level data type representation, encompassing the role of PyArrow’s Schema, Field, and DataType. This scope maps to that of the ArrowSchema in the Arrow C Data interface.
Parameters#
- obj :
A Type specifier or a schema-like object. A schema-like object includes:
* A pyarrow.Schema, pyarrow.Field, or pyarrow.DataType
* A nanoarrow Schema, CSchema, or Type
* Any object implementing the Arrow PyCapsule interface protocol method.
- name: str, optional
An optional name to bind to this field.
- nullable: bool, optional
Explicitly specify field nullability. Fields are nullable by default.
- metadata: mapping, optional
Explicitly specify field metadata.
- params :
Type-specific parameters when obj is a Type.
Examples#
>>> import nanoarrow as na
>>> import pyarrow as pa
>>> na.Schema(na.Type.INT32)
<Schema> int32
>>> na.Schema(na.Type.DURATION, unit=na.TimeUnit.SECOND)
<Schema> duration('s')
>>> na.Schema(pa.int32())
<Schema> int32
- property byte_width: int | None#
Element byte width for fixed-size binary type
Returns None for types for which this property is not relevant.
>>> import nanoarrow as na
>>> na.fixed_size_binary(123).byte_width
123
- property dictionary_ordered: bool | None#
Dictionary ordering
For dictionary types, returns True if the order of dictionary values is meaningful.
>>> import nanoarrow as na
>>> na.dictionary(na.int32(), na.string()).dictionary_ordered
False
- property extension: ExtensionAccessor | None#
Access extension type attributes
>>> import nanoarrow as na
>>> schema = na.extension_type(na.int32(), "arrow.example", b"{}")
>>> schema.extension.name
'arrow.example'
>>> schema.extension.metadata
b'{}'
- field(i) Schema #
Extract a child Schema
>>> import nanoarrow as na
>>> schema = na.struct({"col1": na.int32()})
>>> schema.field(0)
<Schema> 'col1': int32
- property fields: List[Schema]#
Iterate over child Schemas
>>> import nanoarrow as na
>>> schema = na.struct({"col1": na.int32()})
>>> for field in schema.fields:
...     print(field.name)
...
col1
- property index_type: Schema | None#
Dictionary index type
For dictionary types, the type corresponding to the indices. See also value_type.
>>> import nanoarrow as na
>>> na.dictionary(na.int32(), na.string()).index_type
<Schema> int32
- property key_type: Schema | None#
Map key type
>>> import nanoarrow as na
>>> na.map_(na.int32(), na.string()).key_type
<Schema> 'key': non-nullable int32
- property list_size: int | None#
Fixed-size list element size
>>> import nanoarrow as na
>>> na.fixed_size_list(na.int32(), 123).list_size
123
- property metadata: Mapping[bytes, bytes]#
Access field metadata of this field
>>> import nanoarrow as na
>>> schema = na.Schema(na.int32(), metadata={"key": "value"})
>>> dict(schema.metadata.items())
{b'key': b'value'}
- property n_fields: int#
Number of child Schemas
>>> import nanoarrow as na
>>> schema = na.struct({"col1": na.int32()})
>>> schema.n_fields
1
- property name: str | None#
Field name of this Schema
>>> import nanoarrow as na
>>> schema = na.struct({"col1": na.int32()})
>>> schema.field(0).name
'col1'
- property nullable: bool#
Nullability of this field
>>> import nanoarrow as na
>>> na.int32().nullable
True
>>> na.int32(nullable=False).nullable
False
- property params: Mapping#
Get parameter names and values for this type
Returns a dictionary of parameters that can be used to reconstruct this type together with its type identifier.
>>> import nanoarrow as na
>>> na.fixed_size_binary(123).params
{'byte_width': 123}
- property precision: int#
Decimal precision
>>> import nanoarrow as na
>>> na.decimal128(10, 3).precision
10
- property scale: int#
Decimal scale
>>> import nanoarrow as na
>>> na.decimal128(10, 3).scale
3
- serialize(dst=None) bytes | None #
Write this Schema into dst as an encapsulated IPC message
Parameters#
- dst: file-like, optional
If present, a file-like object into which the schema should be serialized. If omitted, this will create an io.BytesIO() and return the serialized result.
- property timezone: str | None#
Timezone for timestamp types
Returns None for types for which this property is not relevant or for timestamp types for which the timezone is not set.
>>> import nanoarrow as na
>>> na.timestamp(na.TimeUnit.SECOND, timezone="America/Halifax").timezone
'America/Halifax'
- property type: Type#
Type enumerator value of this Schema
>>> import nanoarrow as na
>>> na.int32().type
<Type.INT32: 8>
- property unit: TimeUnit | None#
TimeUnit for timestamp, time, and duration types
Returns None for types for which this property is not relevant.
>>> import nanoarrow as na
>>> na.timestamp(na.TimeUnit.SECOND).unit
<TimeUnit.SECOND: 0>
- property value_type: Schema | None#
Dictionary, map, or list value type
>>> import nanoarrow as na
>>> na.list_(na.int32()).value_type
<Schema> 'item': int32
>>> na.map_(na.int32(), na.string()).value_type
<Schema> 'value': string
>>> na.dictionary(na.int32(), na.string()).value_type
<Schema> string
- class TimeUnit(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)#
Unit enumerator for timestamp, duration, and time types.
- static create(obj)#
Create a TimeUnit from parameter input.
This constructor will accept the abbreviations “s”, “ms”, “us”, and “ns” and return the appropriate enumerator value.
>>> import nanoarrow as na
>>> na.TimeUnit.create("s")
<TimeUnit.SECOND: 0>
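To make the four abbreviations concrete, here is a minimal standard-library sketch of how a stored integer changes when the same duration is expressed in a different unit. It does not call nanoarrow; the `TICKS_PER_SECOND` table and the `rescale` helper are hypothetical names used only for illustration.

```python
from fractions import Fraction

# Ticks per second for each abbreviation accepted by TimeUnit.create().
# This table is an illustrative stand-in, not part of nanoarrow.
TICKS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}


def rescale(value: int, from_unit: str, to_unit: str) -> Fraction:
    """Express `value` ticks of `from_unit` as ticks of `to_unit`."""
    return Fraction(value * TICKS_PER_SECOND[to_unit], TICKS_PER_SECOND[from_unit])


# 1500 ms is a whole number of nanoseconds but not a whole number of seconds.
as_ns = rescale(1500, "ms", "ns")
as_s = rescale(1500, "ms", "s")
```

A `Fraction` is returned so that a conversion to a coarser unit makes any loss of precision visible instead of silently truncating.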
- class Type(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)#
The Type enumerator provides a means by which the various type categories can be identified. Type values can be used in place of Schema instances in most places for parameter-free types.
- binary(nullable: bool = True) Schema #
Create an instance of a variable-length binary type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.binary()
<Schema> binary
- binary_view(nullable: bool = True) Schema #
Create an instance of a binary view type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.binary_view()
<Schema> binary_view
- bool_(nullable: bool = True) Schema #
Create an instance of a boolean type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.bool_()
<Schema> bool
- date32(nullable: bool = True) Schema #
Create an instance of a 32-bit date type (days since 1970-01-01).
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.date32()
<Schema> date32
- date64(nullable: bool = True) Schema #
Create an instance of a 64-bit date type (milliseconds since 1970-01-01).
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.date64()
<Schema> date64
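The two date types differ only in the integer they store for a given calendar date. The following standard-library sketch computes both representations for one date; the helper names `to_date32` and `to_date64` are hypothetical and are not nanoarrow functions.

```python
from datetime import date

EPOCH = date(1970, 1, 1)
MS_PER_DAY = 86_400_000


def to_date32(d: date) -> int:
    """Days since 1970-01-01, as stored by a date32 value."""
    return (d - EPOCH).days


def to_date64(d: date) -> int:
    """Milliseconds since 1970-01-01, as stored by a date64 value."""
    return to_date32(d) * MS_PER_DAY


new_year = date(2024, 1, 1)
days = to_date32(new_year)
millis = to_date64(new_year)
```

Dates before the epoch produce negative values in both representations.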
- decimal128(precision: int, scale: int, nullable: bool = True) Schema #
Create an instance of a 128-bit decimal type.
Parameters#
- precision: int
The number of significant digits representable by this type. Must be between 1 and 38.
- scale: int
The number of digits after the decimal point for values of this type.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.decimal128(10, 3)
<Schema> decimal128(10, 3)
- decimal256(precision: int, scale: int, nullable: bool = True) Schema #
Create an instance of a 256-bit decimal type.
Parameters#
- precision: int
The number of significant digits representable by this type. Must be between 1 and 76.
- scale: int
The number of digits after the decimal point for values of this type.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.decimal256(10, 3)
<Schema> decimal256(10, 3)
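As a rough illustration of what precision and scale mean, and of where the limits of 38 and 76 digits come from, the sketch below uses only Python's `decimal` module and integer arithmetic. The helper `max_decimal` is a hypothetical name, and the code assumes the unscaled value is held in a signed 128-bit or 256-bit integer.

```python
from decimal import Decimal


def max_decimal(precision: int, scale: int) -> Decimal:
    """Largest value representable with the given precision and scale."""
    # The unscaled integer has `precision` nines; the scale shifts the point.
    return Decimal(10**precision - 1).scaleb(-scale)


# decimal128(10, 3): ten significant digits, three of them after the point.
largest = max_decimal(10, 3)

# The unscaled integer must fit in a signed 128-bit or 256-bit integer.
fits_128 = 10**38 - 1 <= 2**127 - 1
overflows_128 = 10**39 - 1 > 2**127 - 1
fits_256 = 10**76 - 1 <= 2**255 - 1
overflows_256 = 10**77 - 1 > 2**255 - 1
```

Under that assumption, 38 and 76 are the largest digit counts for which every value of that length fits in the respective integer width.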
- dictionary(index_type, value_type, dictionary_ordered: bool = False) Schema #
Create a type representing dictionary-encoded values
Parameters#
- index_type: schema-like
The data type of the indices. Must be an integral type.
- value_type: schema-like
The type of the dictionary array.
- dictionary_ordered: bool, optional
Use True if the order of values in the dictionary array is meaningful.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.dictionary(na.int32(), na.string())
<Schema> dictionary(int32)<string>
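For readers unfamiliar with dictionary encoding, the following plain-Python sketch shows the idea behind the index and value types: repeated values are stored once in a dictionary and referenced by integer index. The function `dictionary_encode` is a hypothetical helper and is not part of nanoarrow.

```python
def dictionary_encode(values):
    """Return (indices, dictionary) for a list of hashable values."""
    dictionary = []  # unique values, in order of first appearance
    positions = {}   # value -> index into `dictionary`
    indices = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


indices, dictionary = dictionary_encode(["red", "green", "red", "blue", "green"])

# Decoding looks each index up in the dictionary again.
decoded = [dictionary[i] for i in indices]
```

Here the indices would correspond to the index type (for example int32) and the dictionary to the value type (for example string).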
- duration(unit, nullable: bool = True) Schema #
Create an instance of a duration type.
Parameters#
- unit: str or TimeUnit
The unit of values stored by this type.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.duration("s")
<Schema> duration('s')
- extension_type(storage_schema, extension_name: str, extension_metadata: str | bytes | None = None, nullable: bool = True) Schema #
Create an Arrow extension type
Parameters#
- extension_name: str
The extension name to associate with this type.
- extension_metadata: str or bytes, optional
Extension metadata containing extension parameters associated with this extension type.
- nullable: bool, optional
Use False to mark this field as non-nullable.
- fixed_size_binary(byte_width: int, nullable: bool = True) Schema #
Create an instance of a fixed-width binary type.
Parameters#
- byte_width: int
The width of each element in bytes.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.fixed_size_binary(123)
<Schema> fixed_size_binary(123)
- fixed_size_list(value_type, list_size: int, nullable: bool = True) Schema #
Create a type representing a fixed-size list of some other type.
Parameters#
- value_type: schema-like
The type of values in each list element.
- list_size: int
The number of values in each list element.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.fixed_size_list(na.int32(), 123)
<Schema> fixed_size_list(123)<item: int32>
- float16(nullable: bool = True) Schema #
Create an instance of a 16-bit floating-point type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.float16()
<Schema> half_float
- float32(nullable: bool = True) Schema #
Create an instance of a 32-bit floating-point type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.float32()
<Schema> float
- float64(nullable: bool = True) Schema #
Create an instance of a 64-bit floating-point type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.float64()
<Schema> double
- int16(nullable: bool = True) Schema #
Create an instance of a signed 16-bit integer type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.int16()
<Schema> int16
- int32(nullable: bool = True) Schema #
Create an instance of a signed 32-bit integer type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.int32()
<Schema> int32
- int64(nullable: bool = True) Schema #
Create an instance of a signed 64-bit integer type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.int64()
<Schema> int64
- int8(nullable: bool = True) Schema #
Create an instance of a signed 8-bit integer type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.int8()
<Schema> int8
- interval_day_time(nullable: bool = True) Schema #
Create an instance of an interval type measured as a day/time pair.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.interval_day_time()
<Schema> interval_day_time
- interval_month_day_nano(nullable: bool = True) Schema #
Create an instance of an interval type measured as a month/day/nanosecond tuple.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.interval_month_day_nano()
<Schema> interval_month_day_nano
- interval_months(nullable: bool = True) Schema #
Create an instance of an interval type measured in months.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.interval_months()
<Schema> interval_months
- large_binary(nullable: bool = True) Schema #
Create an instance of a variable-length binary type that uses 64-bit offsets.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.large_binary()
<Schema> large_binary
- large_list(value_type, nullable: bool = True) Schema #
Create a type representing a variable-size list of some other type.
Unlike list_(), large_list() can accommodate arrays with more than 2 ** 31 - 1 items in the values array.
Parameters#
- value_type: schema-like
The type of values in each list element.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.large_list(na.int32())
<Schema> large_list<item: int32>
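The 2 ** 31 - 1 limit comes from the width of the offsets that index into the values array. As a small sketch of that arithmetic, the hypothetical helper `list_flavor` below (not a nanoarrow function) picks a type name from the lengths of the list elements, assuming signed 32-bit offsets for list_() and 64-bit offsets for large_list().

```python
INT32_MAX = 2**31 - 1  # largest offset a signed 32-bit integer can hold


def list_flavor(lengths):
    """Pick "list" or "large_list" from the length of each list element."""
    total_values = sum(lengths)
    return "list" if total_values <= INT32_MAX else "large_list"


small = list_flavor([3, 0, 2])
big = list_flavor([2**30, 2**30])  # 2**31 values in total
```

Only the total number of items in the values array matters here, not the number of list elements.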
- large_string(nullable: bool = True) Schema #
Create an instance of a variable-length UTF-8 encoded string type that uses 64-bit offsets.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.large_string()
<Schema> large_string
- list_(value_type, nullable: bool = True) Schema #
Create a type representing a variable-size list of some other type.
Parameters#
- value_type: schema-like
The type of values in each list element.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.list_(na.int32())
<Schema> list<item: int32>
- map_(key_type, value_type, keys_sorted: bool = False, nullable: bool = True)#
Create a type representing a list of key/value mappings
Note that each element in the list contains potentially many key/value pairs (and that a map array contains potentially many individual mappings).
Parameters#
- key_type: schema-like
The type of keys in each map element.
- value_type: schema-like
The type of values in each map element.
- keys_sorted: bool, optional
True if keys within each map element are sorted.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.map_(na.int32(), na.string())
<Schema> map<entries: struct<key: int32, value: string>>
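The note that each element holds potentially many key/value pairs can be pictured with plain Python. In this sketch every row is one mapping, and the rows are flattened into a single run of key/value entries plus offsets marking where each row starts and ends, mirroring the `entries: struct<key, value>` shape shown above. The function `flatten_maps` is a hypothetical helper, not a nanoarrow function.

```python
def flatten_maps(rows):
    """Flatten a list of dicts into (offsets, keys, values)."""
    offsets, keys, values = [0], [], []
    for row in rows:
        for key, value in row.items():
            keys.append(key)
            values.append(value)
        # Each offset marks where the next row's entries begin.
        offsets.append(len(keys))
    return offsets, keys, values


rows = [{1: "a", 2: "b"}, {}, {3: "c"}]
offsets, keys, values = flatten_maps(rows)
```

Row `i` owns the entries from `offsets[i]` up to, but not including, `offsets[i + 1]`, so the empty second row contributes a repeated offset.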
- null(nullable: bool = True) Schema #
Create an instance of a null type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.null()
<Schema> na
- schema(obj, **kwargs) Schema #
Alias for the Schema class constructor. The use of nanoarrow.Schema() is preferred over nanoarrow.schema().
- string(nullable: bool = True) Schema #
Create an instance of a variable-length UTF-8 encoded string type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.string()
<Schema> string
- string_view(nullable: bool = True) Schema #
Create an instance of a string view type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.string_view()
<Schema> string_view
- struct(fields, nullable: bool = True) Schema #
Create a type representing a named sequence of fields.
Parameters#
- fields :
One of:
* A dictionary whose keys are field names and whose values are schema-like objects.
* An iterable whose items are schema-like objects, where each field name is inherited from the schema-like object.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.struct([na.int32()])
<Schema> struct<: int32>
>>> na.struct({"col1": na.int32()})
<Schema> struct<col1: int32>
- time32(unit: str | TimeUnit, nullable: bool = True) Schema #
Create an instance of a 32-bit time of day type.
Parameters#
- unit: str or TimeUnit
The unit of values stored by this type.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.time32("s")
<Schema> time32('s')
- time64(unit: str | TimeUnit, nullable: bool = True) Schema #
Create an instance of a 64-bit time of day type.
Parameters#
- unit: str or TimeUnit
The unit of values stored by this type.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.time64("us")
<Schema> time64('us')
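In the Arrow format, 32-bit time of day is paired with second and millisecond units and 64-bit time of day with microsecond and nanosecond units. The standard-library sketch below shows the arithmetic behind that split; it does not call nanoarrow, its names are illustrative only, and whether nanoarrow validates the unit at construction time is not stated here.

```python
INT32_MAX = 2**31 - 1
SECONDS_PER_DAY = 24 * 60 * 60
TICKS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}

# Largest time-of-day value (the last tick before midnight) for each unit.
max_tick = {unit: SECONDS_PER_DAY * n - 1 for unit, n in TICKS_PER_SECOND.items()}

# Whether that largest value fits in a signed 32-bit integer.
fits_in_32_bits = {unit: tick <= INT32_MAX for unit, tick in max_tick.items()}
```

A full day of microseconds or nanoseconds exceeds the 32-bit range, which is why those units need the 64-bit type.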
- timestamp(unit: str | TimeUnit, timezone: str | None = None, nullable: bool = True) Schema #
Create an instance of a timestamp type.
Parameters#
- unit: str or TimeUnit
The unit of values stored by this type.
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.timestamp("s")
<Schema> timestamp('s', '')
>>> na.timestamp("s", timezone="America/Halifax")
<Schema> timestamp('s', 'America/Halifax')
- uint16(nullable: bool = True) Schema #
Create an instance of an unsigned 16-bit integer type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.uint16()
<Schema> uint16
- uint32(nullable: bool = True) Schema #
Create an instance of an unsigned 32-bit integer type.
Parameters#
- nullable: bool, optional
Use False to mark this field as non-nullable.
Examples#
>>> import nanoarrow as na
>>> na.uint32()
<Schema> uint32