Data Types

DataFusion uses Arrow, and thus the Arrow type system, for query execution. The SQL types from sqlparser-rs are mapped to Arrow data types according to the following table. This mapping occurs when defining the schema in a CREATE EXTERNAL TABLE command or when performing a SQL CAST operation.

You can see the corresponding Arrow type for any SQL expression using the arrow_typeof function. For example:

select arrow_typeof(interval '1 month');
+---------------------------------------------------------------------+
| arrow_typeof(IntervalMonthDayNano("79228162514264337593543950336")) |
+---------------------------------------------------------------------+
| Interval(MonthDayNano)                                              |
+---------------------------------------------------------------------+

You can cast a SQL expression to a specific Arrow type using the arrow_cast function For example, to cast the output of now() to a Timestamp with second precision:

select arrow_cast(now(), 'Timestamp(Second, None)');
+---------------------+
| now()               |
+---------------------+
| 2023-03-03T17:19:21 |
+---------------------+

Character Types

SQL DataType

Arrow DataType

CHAR

Utf8

VARCHAR

Utf8

TEXT

Utf8

STRING

Utf8

Numeric Types

SQL DataType

Arrow DataType

Notes

TINYINT

Int8

SMALLINT

Int16

INT or INTEGER

Int32

BIGINT

Int64

TINYINT UNSIGNED

UInt8

SMALLINT UNSIGNED

UInt16

INT UNSIGNED or INTEGER UNSIGNED

UInt32

BIGINT UNSIGNED

UInt64

FLOAT

Float32

REAL

Float32

DOUBLE

Float64

DECIMAL(precision, scale)

Decimal128(precision, scale)

Decimal support is currently experimental (#3523)

Date/Time Types

SQL DataType

Arrow DataType

DATE

Date32

TIME

Time64(Nanosecond)

TIMESTAMP

Timestamp(Nanosecond, None)

INTERVAL

Interval(IntervalMonthDayNano)

Boolean Types

SQL DataType

Arrow DataType

BOOLEAN

Boolean

Binary Types

SQL DataType

Arrow DataType

BYTEA

Binary

You can create binary literals using a hex string literal such as X'1234 to create a Binary value of two bytes, 0x12 and 0x34.

Unsupported SQL Types

SQL Data Type

Arrow DataType

UUID

Not yet supported

BLOB

Not yet supported

CLOB

Not yet supported

BINARY

Not yet supported

VARBINARY

Not yet supported

REGCLASS

Not yet supported

NVARCHAR

Not yet supported

CUSTOM

Not yet supported

ARRAY

Not yet supported

ENUM

Not yet supported

SET

Not yet supported

DATETIME

Not yet supported

Supported Arrow Types

The following types are supported by the arrow_typeof function:

Arrow Type

Null

Boolean

Int8

Int16

Int32

Int64

UInt8

UInt16

UInt32

UInt64

Float16

Float32

Float64

Utf8

LargeUtf8

Binary

Timestamp(Second, None)

Timestamp(Millisecond, None)

Timestamp(Microsecond, None)

Timestamp(Nanosecond, None)

Time32

Time64

Duration(Second)

Duration(Millisecond)

Duration(Microsecond)

Duration(Nanosecond)

Interval(YearMonth)

Interval(DayTime)

Interval(MonthDayNano)

FixedSizeBinary(<len>) (e.g. FixedSizeBinary(16))

Decimal128(<precision>, <scale>) e.g. Decimal128(3, 10)

Decimal256(<precision>, <scale>) e.g. Decimal256(3, 10)