SQLite Driver
Available for: C/C++, GLib/Ruby, Go, Python, R
The SQLite driver provides access to SQLite databases.
This driver is essentially a “reference” driver that was used during ADBC development. It supports most ADBC features but has not been optimized.
Installation
C/C++:
For conda-forge users:
mamba install libadbc-driver-sqlite
Go:
Install the C/C++ package and use the Go driver manager. Requires CGO.
go get github.com/apache/arrow-adbc/go/adbc/drivermgr
Python:
# For conda-forge
mamba install adbc-driver-sqlite
# For pip
pip install adbc_driver_sqlite
R:
# install.packages("pak")
pak::pak("apache/arrow-adbc/r/adbcsqlite")
Usage
To connect to a database, supply the “uri” parameter when constructing the AdbcDatabase. This should be a filename or URI filename. If omitted, it will default to an in-memory database, but one that is shared across all connections.
#include "adbc.h"
// Ignoring error handling
struct AdbcDatabase database;
AdbcDatabaseNew(&database, nullptr);
AdbcDatabaseSetOption(&database, "driver", "adbc_driver_sqlite", nullptr);
AdbcDatabaseSetOption(&database, "uri", "<sqlite uri>", nullptr);
AdbcDatabaseInit(&database, nullptr);
import adbc_driver_sqlite.dbapi
with adbc_driver_sqlite.dbapi.connect() as conn:
    pass
library(adbcdrivermanager)
# Use the driver manager to connect to a database
db <- adbc_database_init(adbcsqlite::adbcsqlite(), uri = ":memory:")
con <- adbc_connection_init(db)
You must have libadbc_driver_sqlite.so on your LD_LIBRARY_PATH, or in the same directory as the executable, when you run this. This requires CGO and loads the C++ ADBC SQLite driver.
package main
import (
    "context"
    "github.com/apache/arrow-adbc/go/adbc"
    "github.com/apache/arrow-adbc/go/adbc/drivermgr"
)
func main() {
    var drv drivermgr.Driver
    db, err := drv.NewDatabase(map[string]string{
        "driver":          "adbc_driver_sqlite",
        adbc.OptionKeyURI: "<sqlite uri>",
    })
    if err != nil {
        // handle error
    }
    cnxn, err := db.Open(context.Background())
    if err != nil {
        // handle error
    }
    defer cnxn.Close()
}
Supported Features
Bulk Ingestion
Bulk ingestion is supported. The mapping from Arrow types to SQLite types is the same as for bound parameters, described under Type Inference/Type Support below.
Partitioned Result Sets
Partitioned result sets are not supported.
Run-Time Loadable Extensions
ADBC allows loading SQLite extensions. For details on extensions themselves, see “Run-Time Loadable Extensions” in the SQLite documentation.
To load an extension, three things are necessary:
1. Enable extension loading by setting adbc.sqlite.load_extension.enabled
2. Set the path
3. Set the entrypoint
These options can only be set after the connection is fully initialized with AdbcConnectionInit().
Options
adbc.sqlite.load_extension.enabled
Whether to enable (“true”) or disable (“false”) extension loading. The default is disabled.
adbc.sqlite.load_extension.path
To load an extension, first set this option to the path to the extension to load. This will not load the extension yet.
adbc.sqlite.load_extension.entrypoint
After setting the path, set this option to the entrypoint in the extension (or NULL) to actually load the extension.
Example
// TODO
# TODO
import adbc_driver_sqlite.dbapi as dbapi
with dbapi.connect() as conn:
    conn.enable_load_extension(True)
    conn.load_extension("path/to/extension.so")
The driver implements the same API as the Python standard library sqlite3 module, so packages built for it should also work. For example, sqlite-zstd:
import adbc_driver_sqlite.dbapi as dbapi
import sqlite_zstd
with dbapi.connect() as conn:
    conn.enable_load_extension(True)
    sqlite_zstd.load(conn)
# TODO
Transactions
Transactions are supported.
Type Inference/Type Support
SQLite does not enforce that values in a column have the same type. The SQLite driver will attempt to infer the best Arrow type for a column as the result set is read. When reading the first batch of data, the driver will be in “type promotion” mode. The inferred type of each column begins as INT64, and will convert to DOUBLE, then STRING, if needed. After that, reading more batches will attempt to convert to the inferred types. An error will be raised if this is not possible (e.g. if a string value is read but the column was inferred to be of type INT64).
In the future, other behaviors may also be supported.
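The promotion rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the driver's implementation: it uses the standard library sqlite3 module rather than ADBC, all function names are hypothetical, NULLs are assumed not to affect the inferred type, and any value wider than the inferred type is treated as an error in later batches (the driver's actual conversion rules may differ).

```python
import sqlite3

# Promotion order described above: INT64, then DOUBLE, then STRING.
PROMOTION_ORDER = ["int64", "double", "string"]


def value_type(value):
    """Return the narrowest type in PROMOTION_ORDER that holds `value`."""
    if value is None:
        return None  # assumption: NULL does not affect the inferred type
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"value type not modelled in this sketch: {type(value).__name__}")


def infer_type(first_batch):
    """Type promotion mode: start at INT64 and widen as values require."""
    inferred = "int64"
    for value in first_batch:
        seen = value_type(value)
        if seen is not None and PROMOTION_ORDER.index(seen) > PROMOTION_ORDER.index(inferred):
            inferred = seen
    return inferred


def check_later_batch(batch, inferred):
    """After the first batch the type is fixed; wider values are rejected."""
    for value in batch:
        seen = value_type(value)
        if seen is not None and PROMOTION_ORDER.index(seen) > PROMOTION_ORDER.index(inferred):
            raise ValueError(f"cannot read {value!r} into a column inferred as {inferred}")


def infer_from_sqlite(batch_rows):
    """Read one untyped SQLite column and infer its type from the first batch."""
    conn = sqlite3.connect(":memory:")
    try:
        # The column has no declared type, so SQLite stores each value as given.
        conn.execute("CREATE TABLE t (x)")
        conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), ("three",)])
        values = [row[0] for row in conn.execute("SELECT x FROM t ORDER BY rowid")]
    finally:
        conn.close()
    inferred = infer_type(values[:batch_rows])
    check_later_batch(values[batch_rows:], inferred)
    return inferred
```

With these three rows, infer_from_sqlite(3) sees the string in the first batch and infers a string column, while infer_from_sqlite(2) infers INT64 from the first two rows and then raises an error on the third.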
Bound parameters will be translated to SQLite’s integer, floating-point, or text types as appropriate. Supported Arrow types are: signed and unsigned integers, (large) strings, float, and double.
Driver-specific options:
adbc.sqlite.query.batch_rows
The size of batches to read. Hence, this also controls how many rows are read to infer the Arrow type.