PostgreSQL Driver¶
Available for: C/C++, GLib/Ruby, Go, Python, R
The PostgreSQL driver provides access to any database that supports the PostgreSQL wire format. It wraps libpq, the client library for PostgreSQL. The project owes credit to 0x0L’s pgeon for the overall approach.
Note
This project is not affiliated with PostgreSQL in any way.
Note
The PostgreSQL driver is in beta. Performance/optimization and support for complex types and different ADBC features is still ongoing.
Installation¶
For conda-forge users:
mamba install libadbc-driver-postgresql
Install the C/C++ package and use the Go driver manager. Requires CGO.
go get github.com/apache/arrow-adbc/go/adbc/drivermgr
# For conda-forge
mamba install adbc-driver-postgresql
# For pip
pip install adbc_driver_postgresql
install.packages("adbcpostgresql")
Usage¶
To connect to a database, supply the “uri” parameter when constructing
the AdbcDatabase
. This should be a connection URI.
#include "arrow-adbc/adbc.h"
// Ignoring error handling
struct AdbcDatabase database;
AdbcDatabaseNew(&database, nullptr);
AdbcDatabaseSetOption(&database, "uri", "postgresql://localhost:5433", nullptr);
AdbcDatabaseInit(&database, nullptr);
You must have libadbc_driver_postgresql.so on your LD_LIBRARY_PATH, or in the same directory as the executable when you run this. This requires CGO and loads the C++ ADBC postgresql driver.
import (
"context"
"github.com/apache/arrow-adbc/go/adbc"
"github.com/apache/arrow-adbc/go/adbc/drivermgr"
)
func main() {
var drv drivermgr.Driver
db, err := drv.NewDatabase(map[string]string{
"driver": "adbc_driver_postgresql",
adbc.OptionKeyURI: "postgresql://user:pass@localhost:5433/postgres",
})
if err != nil {
// handle error
}
defer db.Close()
cnxn, err := db.Open(context.Background())
if err != nil {
// handle error
}
defer cnxn.Close()
}
import adbc_driver_postgresql.dbapi
uri = "postgresql://user:pass@localhost:5433/postgres"
with adbc_driver_postgresql.dbapi.connect(uri) as conn:
pass
For more examples, see PostgreSQL Recipes.
library(adbcdrivermanager)
# Use the driver manager to connect to a database
uri <- Sys.getenv("ADBC_POSTGRESQL_TEST_URI")
db <- adbc_database_init(adbcpostgresql::adbcpostgresql(), uri = uri)
con <- adbc_connection_init(db)
Supported Features¶
The PostgreSQL driver mostly supports features defined in the ADBC API specification 1.0.0, but not all cases are fully implemented (particularly around bind parameters and prepared statements).
Bind Parameters and Prepared Statements¶
The PostgreSQL driver only supports executing prepared statements with parameters that do not return result sets (basically, an INSERT with parameters). Queries that return result sets are difficult with prepared statements because the driver is built around using COPY for best performance, which is not supported in this context.
Bulk Ingestion¶
Bulk ingestion is supported. The mapping from Arrow types to PostgreSQL types is the same as below.
Partitioned Result Sets¶
Partitioned result sets are not supported.
Transactions¶
Transactions are supported.
Type Support¶
PostgreSQL allows defining new types at runtime, so the driver must build a mapping of available types. This is currently done once at startup.
Type support is currently limited depending on the type and whether it is being read or written.
Arrow Type |
As Bind Parameter |
In Bulk Ingestion [1] |
---|---|---|
binary |
BYTEA |
BYTEA |
bool |
BOOLEAN |
BOOLEAN |
date32 |
DATE |
DATE |
date64 |
❌ |
❌ |
dictionary |
(as unpacked type) |
(as unpacked type, only for binary/string) |
duration |
INTERVAL |
INTERVAL |
float32 |
REAL |
REAL |
float64 |
DOUBLE PRECISION |
DOUBLE PRECISION |
int8 |
SMALLINT |
SMALLINT |
int16 |
SMALLINT |
SMALLINT |
int32 |
INTEGER |
INTEGER |
int64 |
BIGINT |
BIGINT |
large_binary |
❌ |
❌ |
large_string |
TEXT |
TEXT |
month_day_nano_interval |
INTERVAL |
INTERVAL |
string |
TEXT |
TEXT |
timestamp |
TIMESTAMP [3] |
TIMESTAMP/TIMESTAMP WITH TIMEZONE |
PostgreSQL Type |
In Result Set |
---|---|
ARRAY |
list |
BIGINT |
int64 |
BINARY |
binary |
BOOLEAN |
bool |
CHAR |
utf8 |
DATE |
date32 |
DOUBLE PRECISION |
float64 |
INTEGER |
int32 |
INTERVAL |
month_day_nano_interval |
NUMERIC |
utf8 [2] |
REAL |
float32 |
SMALLINT |
int16 |
TEXT |
utf8 |
TIME |
time64 |
TIMESTAMP WITH TIME ZONE |
timestamp[unit, UTC] |
TIMESTAMP WITHOUT TIME ZONE |
timestamp[unit] |
VARCHAR |
utf8 |
Unknown Types¶
Types without direct Arrow equivalents can still be returned by the driver. In this case, the Arrow type will be binary, and the contents will be the raw bytes as provided by the PostgreSQL wire protocol.
For Arrow implementations that support the Opaque canonical extension type, the extension type metadata is also always present. This helps differentiate when the driver intentionally returned a binary column from when it returned a binary column as a fallback.
Warning
Currently, the driver also attaches a metadata key named
ADBC:posgresql:typname
to the schema field of the unknown
column, but this has been deprecated in favor of the Opaque type
and you should not rely on this key continuing to exist.
Software Versions¶
For Python wheels, the shipped version of the PostgreSQL client libraries is 15.2. For conda-forge packages, the version of libpq is the same as the version of libpq in your Conda environment.
The PostgreSQL driver is tested against PostgreSQL versions 11 through 16.