A Schema is a list of Fields, which map names to Arrow data types. Create a Schema when you convert an R data.frame to Arrow and don't want to rely on the default mapping of R types to Arrow types, for example to choose a specific numeric precision. A Schema is also useful when creating a Dataset, to enforce a specific schema rather than inferring it from the various files.

Many Arrow objects, including Table and Dataset, have a $schema method (active binding) that lets you access their schema.

schema(...)

Arguments

...

named list of data types

Methods

  • $ToString(): convert to a string

  • $field(i): returns the field at index i (0-based)

  • $GetFieldByName(x): returns the field with name x

  • $WithMetadata(metadata): returns a new Schema with the key-value metadata set. Note that all list elements in metadata will be coerced to character.
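The methods listed above can be sketched as follows. This is a minimal illustration, not a definitive reference: the field names (col1, col2) and the metadata keys (source, version) are made up for the example, and it assumes the arrow package is installed.

```r
library(arrow)

# Illustrative schema; the field names are arbitrary
sch <- schema(col1 = int8(), col2 = float32())

# Human-readable description of the schema
sch_string <- sch$ToString()

# $field() uses 0-based indexing, so 0 is the first field
first <- sch$field(0)
first$name

# Look a field up by name and inspect its Arrow type
col2 <- sch$GetFieldByName("col2")
col2$type

# $WithMetadata() returns a new Schema; non-character values
# (here the number 2) are coerced to character
sch2 <- sch$WithMetadata(list(source = "example", version = 2))
sch2$metadata$version
```

Because $WithMetadata() returns a new Schema, the result has to be assigned to a variable (sch2 here) to be kept.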

Active bindings

  • $names: returns the field names (called in names(Schema))

  • $num_fields: returns the number of fields (called in length(Schema))

  • $fields: returns the list of Fields in the Schema, suitable for iterating over

  • $HasMetadata: logical: does this Schema have extra metadata?

  • $metadata: returns the key-value metadata as a named list. Modify or replace by assigning in (sch$metadata <- new_metadata). All list elements are coerced to string.
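A short sketch of the active bindings above, under the same assumptions as before (arrow installed; the field names and the metadata keys owner and revision are hypothetical and chosen only for illustration):

```r
library(arrow)

sch <- schema(col1 = int8(), col2 = float32())

# Field names and field count; names() and length() dispatch to these
sch$names
sch$num_fields
length(sch)

# $fields is a list of Field objects, so it can be iterated over directly
for (f in sch$fields) {
  cat(f$name, ":", f$type$ToString(), "\n")
}

# Replace the key-value metadata by assignment; values are coerced to character
sch$metadata <- list(owner = "analytics", revision = 3)
sch$HasMetadata
sch$metadata
```

Unlike $WithMetadata(), assigning to $metadata modifies the existing Schema object in place.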

Examples

df <- data.frame(col1 = 2:4, col2 = c(0.1, 0.3, 0.5))
tab1 <- Table$create(df)
tab1$schema
#> Schema
#> col1: int32
#> col2: double
#>
#> See $metadata for additional Schema metadata
tab2 <- Table$create(df, schema = schema(col1 = int8(), col2 = float32()))
tab2$schema
#> Schema
#> col1: int8
#> col2: float
#>
#> See $metadata for additional Schema metadata