pyarrow.Schema
- class pyarrow.Schema
Bases: pyarrow.lib._Weakrefable
- __init__(*args, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(*args, **kwargs): Initialize self.
add_metadata(self, metadata)
append(self, Field field): Append a field at the end of the schema.
empty_table(self): Provide an empty table according to the schema.
equals(self, Schema other, …): Test if this schema is equal to the other.
field(self, i): Select a field by its column name or numeric index.
field_by_name(self, name): Access a field by its name rather than the column index.
from_pandas(type cls, df[, preserve_index]): Returns implied schema from dataframe.
get_all_field_indices(self, name): Return sorted list of indices for fields with the given name.
get_field_index(self, name): Return index of field with given unique name.
insert(self, int i, Field field): Add a field at position i to the schema.
remove(self, int i): Remove the field at index i from the schema.
remove_metadata(self): Create new schema without metadata, if any.
serialize(self[, memory_pool]): Write Schema to Buffer as encapsulated IPC message.
set(self, int i, Field field): Replace a field at position i in the schema.
to_string(self[, truncate_metadata, …]): Return human-readable representation of Schema.
with_metadata(self, metadata): Add metadata as dict of string keys and values to Schema.
Attributes
metadata
names: The schema’s field names.
pandas_metadata: Return deserialized-from-JSON pandas metadata field (if it exists).
types: The schema’s field types.
- add_metadata(self, metadata)
- append(self, Field field)
Append a field at the end of the schema.
In contrast to Python’s list.append(), it returns a new object, leaving the original Schema unmodified.
- Parameters
field (Field) –
- Returns
schema (Schema) – New object with appended field.
- empty_table(self)
Provide an empty table according to the schema.
- Returns
table (pyarrow.Table)
- equals(self, Schema other, bool check_metadata=False)
Test if this schema is equal to the other.
- Parameters
other (pyarrow.Schema) –
check_metadata (bool, default False) – Key/value metadata must be equal too
- Returns
is_equal (bool)
- field(self, i)
Select a field by its column name or numeric index.
- Parameters
i (int or string) –
- Returns
pyarrow.Field
- field_by_name(self, name)
Access a field by its name rather than the column index.
- Parameters
name (str) –
- Returns
field (pyarrow.Field)
- from_pandas(type cls, df, preserve_index=None)
Returns implied schema from dataframe
- Parameters
df (pandas.DataFrame) –
preserve_index (bool, default None) – Whether to store the index as an additional column (or columns, for MultiIndex) in the resulting Table. The default of None will store the index as a column, except for RangeIndex which is stored as metadata only. Use preserve_index=True to force it to be stored as a column.
- Returns
pyarrow.Schema
Examples
>>> import pandas as pd
>>> import pyarrow as pa
>>> df = pd.DataFrame({
...     'int': [1, 2],
...     'str': ['a', 'b']
... })
>>> pa.Schema.from_pandas(df)
int: int64
str: string
__index_level_0__: int64
- get_all_field_indices(self, name)
Return sorted list of indices for fields with the given name
- get_field_index(self, name)
Return index of field with given unique name. Returns -1 if the name is not found or is duplicated.
- insert(self, int i, Field field)
Add a field at position i to the schema.
- Parameters
i (int) –
field (Field) –
- Returns
schema (Schema)
- metadata
- names
The schema’s field names.
- Returns
list of str
- pandas_metadata
Return deserialized-from-JSON pandas metadata field (if it exists)
- remove(self, int i)
Remove the field at index i from the schema.
- Parameters
i (int) –
- Returns
schema (Schema)
- remove_metadata(self)
Create new schema without metadata, if any
- Returns
schema (pyarrow.Schema)
- serialize(self, memory_pool=None)
Write Schema to Buffer as encapsulated IPC message
- Parameters
memory_pool (MemoryPool, default None) – Uses default memory pool if not specified
- Returns
serialized (Buffer)
- set(self, int i, Field field)
Replace a field at position i in the schema.
- Parameters
i (int) –
field (Field) –
- Returns
schema (Schema)
- to_string(self, truncate_metadata=True, show_field_metadata=True, show_schema_metadata=True)
Return human-readable representation of Schema
- Parameters
truncate_metadata (boolean, default True) – Limit metadata key/value display to a single line of ~80 characters or less
show_field_metadata (boolean, default True) – Display Field-level KeyValueMetadata
show_schema_metadata (boolean, default True) – Display Schema-level KeyValueMetadata
- Returns
str (the formatted output)
- types
The schema’s field types.
- Returns
list of DataType
- with_metadata(self, metadata)
Add metadata as dict of string keys and values to Schema
- Parameters
metadata (dict) – Keys and values must be string-like / coercible to bytes
- Returns
schema (pyarrow.Schema)