-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description:
static org.apache.avro.Schema createAvroSchema(List<Field> arrowFields)
    Overload provided for convenience, sets name = GENERIC_RECORD_TYPE_NAME.
static org.apache.avro.Schema createAvroSchema(List<Field> arrowFields, String typeName)
    Overload provided for convenience, sets namespace = null.
static org.apache.avro.Schema createAvroSchema(List<Field> arrowFields, String typeName, String namespace)
    Create an Avro record schema for a given list of Arrow fields.
static CompositeAvroProducer createCompositeProducer(List<FieldVector> vectors)
    Create a composite Avro producer for a set of field vectors (typically the root set of a VectorSchemaRoot, or VSR).
-
Field Details
-
GENERIC_RECORD_TYPE_NAME
-
Constructor Details
-
ArrowToAvroUtils
public ArrowToAvroUtils()
-
-
Method Details
-
createAvroSchema
public static org.apache.avro.Schema createAvroSchema(List<Field> arrowFields, String typeName, String namespace)
Create an Avro record schema for a given list of Arrow fields.
This method currently maps Arrow data types to Avro encodings as follows.

| Arrow type | Avro encoding |
| --- | --- |
| ArrowType.Null | NULL |
| ArrowType.Bool | BOOLEAN |
| ArrowType.Int (64 bit, unsigned 32 bit) | LONG |
| ArrowType.Int (signed 32 bit, < 32 bit) | INT |
| ArrowType.FloatingPoint (double) | DOUBLE |
| ArrowType.FloatingPoint (single, half) | FLOAT |
| ArrowType.Utf8 | STRING |
| ArrowType.LargeUtf8 | STRING |
| ArrowType.Binary | BYTES |
| ArrowType.LargeBinary | BYTES |
| ArrowType.FixedSizeBinary | FIXED |
| ArrowType.Decimal | decimal (FIXED) |
| ArrowType.Date | date (INT) |
| ArrowType.Time (SEC \| MILLI) | time-millis (INT) |
| ArrowType.Time (MICRO \| NANO) | time-micros (LONG) |
| ArrowType.Timestamp (NANOSECONDS, TZ != NULL) | timestamp-nanos (LONG) |
| ArrowType.Timestamp (MICROSECONDS, TZ != NULL) | timestamp-micros (LONG) |
| ArrowType.Timestamp (MILLISECONDS \| SECONDS, TZ != NULL) | timestamp-millis (LONG) |
| ArrowType.Timestamp (NANOSECONDS, TZ == NULL) | local-timestamp-nanos (LONG) |
| ArrowType.Timestamp (MICROSECONDS, TZ == NULL) | local-timestamp-micros (LONG) |
| ArrowType.Timestamp (MILLISECONDS \| SECONDS, TZ == NULL) | local-timestamp-millis (LONG) |
| ArrowType.Duration | duration (FIXED) |
| ArrowType.Interval | duration (FIXED) |
| ArrowType.Struct | record |
| ArrowType.List | array |
| ArrowType.LargeList | array |
| ArrowType.FixedSizeList | array |
| ArrowType.Map | map |
| ArrowType.Union | union |

Nullable fields are represented as a union of [base-type | null]. Unions are treated specially: a union is considered nullable if any of its child fields is nullable. The schema for a nullable union always contains a null type as its first member, and none of its child types are nullable.
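The union nullability rule can be illustrated with a minimal, library-free sketch. The `UnionNullabilitySketch` class, its `Child` type, and the `unionMembers` method are hypothetical names invented for this illustration; they are not part of ArrowToAvroUtils, and the real implementation may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: models the documented rule, not the library's actual code.
final class UnionNullabilitySketch {

    /** Hypothetical stand-in for one union child: an Avro type name plus a nullability flag. */
    static final class Child {
        final String avroType;
        final boolean nullable;

        Child(String avroType, boolean nullable) {
            this.avroType = avroType;
            this.nullable = nullable;
        }
    }

    /** Returns the member type names of the resulting Avro union, in order. */
    static List<String> unionMembers(List<Child> children) {
        // The union is nullable if any child is nullable.
        boolean anyNullable = false;
        for (Child child : children) {
            anyNullable |= child.nullable;
        }

        List<String> members = new ArrayList<>();
        if (anyNullable) {
            members.add("null"); // null is always the first member of a nullable union
        }
        for (Child child : children) {
            members.add(child.avroType); // children appear as plain, non-nullable types
        }
        return members;
    }
}
```

Under these assumptions, a union with a non-nullable `int` child and a nullable `string` child yields the members `null`, `int`, `string`, rather than nesting a `[string | null]` union inside the outer union.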
List fields must contain precisely one child field, which may be nullable.
Map fields are represented as a list of structs, where the struct fields are "key" and "value". The key field must always be of type STRING (Utf8) and cannot be nullable. The value can be of any type and may be nullable.
Record types must contain at least one child field and cannot contain multiple fields with the same name.
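The structural constraints above can be expressed as simple predicates. This is a hedged sketch: `StructureChecksSketch` and its methods are hypothetical helpers written for illustration, and this page does not say how the library reports a violation (for example, which exception it throws), so the sketch only returns booleans.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: restates the documented constraints, not the library's validation code.
final class StructureChecksSketch {

    /** A list field must have exactly one child field. */
    static boolean isValidList(int childCount) {
        return childCount == 1;
    }

    /** A map key must be of type Utf8 and must not be nullable. */
    static boolean isValidMapKey(boolean keyIsUtf8, boolean keyIsNullable) {
        return keyIsUtf8 && !keyIsNullable;
    }

    /** A record needs at least one child field, and child names must be unique. */
    static boolean isValidRecord(List<String> childNames) {
        if (childNames.isEmpty()) {
            return false;
        }
        Set<String> seen = new HashSet<>();
        for (String name : childNames) {
            if (!seen.add(name)) {
                return false; // add() returns false when the name was already present
            }
        }
        return true;
    }
}
```

Checking these conditions before calling createAvroSchema is one way to surface schema problems early, assuming the field metadata is available in this simplified form.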
- Parameters:
arrowFields - The Arrow fields used to generate the Avro schema
typeName - Name of the top-level Avro record type
namespace - Namespace of the top-level Avro record type
- Returns:
An Avro record schema for the given list of fields, with the specified name and namespace
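The integer, floating-point, and time entries of the type mapping for createAvroSchema can be sketched as plain functions. Everything here is an assumption made for illustration: `PrimitiveMappingSketch`, its `TimeUnit` enum, and the method names are hypothetical and do not use the Arrow or Avro classes; the returned strings simply echo the encodings listed in the mapping.

```java
// Illustration only: mirrors the documented mapping, not the library's actual code.
final class PrimitiveMappingSketch {

    /** Hypothetical stand-in for the unit of an Arrow time type. */
    enum TimeUnit { SECOND, MILLISECOND, MICROSECOND, NANOSECOND }

    /** 64-bit and unsigned 32-bit integers map to LONG; all narrower cases map to INT. */
    static String avroTypeForInt(int bitWidth, boolean signed) {
        if (bitWidth == 64 || (bitWidth == 32 && !signed)) {
            return "LONG";
        }
        return "INT";
    }

    /** Double precision maps to DOUBLE; single and half precision map to FLOAT. */
    static String avroTypeForFloat(boolean isDoublePrecision) {
        return isDoublePrecision ? "DOUBLE" : "FLOAT";
    }

    /** Second and millisecond times use time-millis; finer units use time-micros. */
    static String avroTypeForTime(TimeUnit unit) {
        switch (unit) {
            case SECOND:
            case MILLISECOND:
                return "time-millis (INT)";
            default:
                return "time-micros (LONG)";
        }
    }
}
```

Note that, per the mapping, an unsigned 32-bit integer widens to LONG because its full range does not fit in a signed 32-bit Avro INT.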
-
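createAvroSchema
public static org.apache.avro.Schema createAvroSchema(List<Field> arrowFields, String typeName)
Overload provided for convenience, sets namespace = null.
-
createAvroSchema
public static org.apache.avro.Schema createAvroSchema(List<Field> arrowFields)
Overload provided for convenience, sets name = GENERIC_RECORD_TYPE_NAME.
-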
createCompositeProducer
public static CompositeAvroProducer createCompositeProducer(List<FieldVector> vectors)
Create a composite Avro producer for a set of field vectors (typically the root set of a VectorSchemaRoot, or VSR).
- Parameters:
vectors - The vectors that will be used to produce Avro data
- Returns:
The resulting composite Avro producer
-