pyarrow.Schema

class pyarrow.Schema

Bases: pyarrow.lib._Weakrefable

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

add_metadata(self, metadata)

append(self, Field field)

Append a field at the end of the schema.

empty_table(self)

Provide an empty table according to the schema.

equals(self, Schema other, ...)

Test if this schema is equal to the other

field(self, i)

Select a field by its column name or numeric index.

field_by_name(self, name)

Access a field by its name rather than the column index.

from_pandas(type cls, df[, preserve_index])

Return the schema implied by a pandas DataFrame.

get_all_field_indices(self, name)

Return sorted list of indices for fields with the given name

get_field_index(self, name)

Return index of field with given unique name.

insert(self, int i, Field field)

Add a field at position i to the schema.

remove(self, int i)

Remove the field at index i from the schema.

remove_metadata(self)

Create new schema without metadata, if any

serialize(self[, memory_pool])

Write Schema to Buffer as encapsulated IPC message

set(self, int i, Field field)

Replace a field at position i in the schema.

to_string(self[, truncate_metadata, ...])

Return human-readable representation of Schema

with_metadata(self, metadata)

Add metadata as dict of string keys and values to Schema

Attributes

metadata

names

The schema's field names.

pandas_metadata

Return deserialized-from-JSON pandas metadata field (if it exists)

types

The schema's field types.

add_metadata(self, metadata)
append(self, Field field)

Append a field at the end of the schema.

Unlike Python’s list.append(), this method returns a new object and leaves the original Schema unmodified.

Parameters
field: Field
Returns
schema: Schema

New object with appended field.

empty_table(self)

Provide an empty table according to the schema.

Returns
table: pyarrow.Table
equals(self, Schema other, bool check_metadata=False)

Test if this schema is equal to the other

Parameters
other: pyarrow.Schema
check_metadata: bool, default False

Key/value metadata must be equal too

Returns
is_equal: bool
field(self, i)

Select a field by its column name or numeric index.

Parameters
i: int or str
Returns
pyarrow.Field
field_by_name(self, name)

Access a field by its name rather than the column index.

Parameters
name: str
Returns
field: pyarrow.Field
from_pandas(type cls, df, preserve_index=None)

Return the schema implied by a pandas DataFrame.

Parameters
df: pandas.DataFrame
preserve_index: bool, optional, default None

Whether to store the index as an additional column (or columns, for MultiIndex) in the resulting schema. The default of None stores the index as a column, except for a RangeIndex, which is stored as metadata only. Use preserve_index=True to force it to be stored as a column.

Returns
pyarrow.Schema

Examples

>>> import pandas as pd
>>> import pyarrow as pa
>>> df = pd.DataFrame({
...     'int': [1, 2],
...     'str': ['a', 'b']
... })
>>> pa.Schema.from_pandas(df, preserve_index=True)
int: int64
str: string
__index_level_0__: int64
get_all_field_indices(self, name)

Return sorted list of indices for fields with the given name

get_field_index(self, name)

Return the index of the field with the given name. Returns -1 if the name is not found or is duplicated.

insert(self, int i, Field field)

Add a field at position i to the schema.

Parameters
i: int
field: Field
Returns
schema: Schema
metadata
names

The schema’s field names.

Returns
list of str
pandas_metadata

Return deserialized-from-JSON pandas metadata field (if it exists)

remove(self, int i)

Remove the field at index i from the schema.

Parameters
i: int
Returns
schema: Schema
remove_metadata(self)

Create new schema without metadata, if any

Returns
schema: pyarrow.Schema
serialize(self, memory_pool=None)

Write Schema to Buffer as encapsulated IPC message

Parameters
memory_pool: MemoryPool, default None

Uses default memory pool if not specified

Returns
serialized: Buffer
set(self, int i, Field field)

Replace a field at position i in the schema.

Parameters
i: int
field: Field
Returns
schema: Schema
to_string(self, truncate_metadata=True, show_field_metadata=True, show_schema_metadata=True)

Return human-readable representation of Schema

Parameters
truncate_metadata: bool, default True

Limit metadata key/value display to a single line of about 80 characters or fewer

show_field_metadata: bool, default True

Display Field-level KeyValueMetadata

show_schema_metadata: bool, default True

Display Schema-level KeyValueMetadata

Returns
str: the formatted output
types

The schema’s field types.

Returns
list of DataType
with_metadata(self, metadata)

Add metadata as dict of string keys and values to Schema

Parameters
metadata: dict

Keys and values must be string-like / coercible to bytes

Returns
schema: pyarrow.Schema