PostgreSQL Driver¶
Language: C/C++ Status: Stable
The PostgreSQL driver provides access to any database that supports the PostgreSQL wire format. It wraps libpq, the client library for PostgreSQL. The project owes credit to 0x0L’s pgeon for the overall approach.
Note
This project is not affiliated with PostgreSQL in any way.
Note
This driver has experimental support for Amazon Redshift. As Redshift does not support reading or writing COPY in PostgreSQL binary format, however, the optimizations that accelerate queries are not enabled when connecting to Redshift. There may also be other differences in functionality; please file a bug report if problems are encountered.
Installation¶
Install libadbc-driver-postgresql from conda-forge:
mamba install libadbc-driver-postgresql
Install the C/C++ driver, then use the Go driver manager. Requires CGO.
go get github.com/apache/arrow-adbc/go/adbc/drivermgr
Install adbc-driver-postgresql from conda-forge:
mamba install adbc-driver-postgresql
Install adbc-driver-postgresql from PyPI:
pip install adbc-driver-postgresql
Install adbcpostgresql from CRAN:
install.packages("adbcpostgresql")
Additionally, the driver may be used from C/C++, C#, GLib, Go, R, Ruby, and Rust via the driver manager.
Usage¶
To connect to a database, supply the “uri” parameter when constructing
the AdbcDatabase.  This should be a connection URI.
#include "arrow-adbc/adbc.h"
// Ignoring error handling
struct AdbcDatabase database;
AdbcDatabaseNew(&database, nullptr);
AdbcDatabaseSetOption(&database, "uri", "postgresql://localhost:5433", nullptr);
AdbcDatabaseInit(&database, nullptr);
You must have libadbc_driver_postgresql.so on your LD_LIBRARY_PATH, or in the same directory as the executable when you run this. This requires CGO and loads the C++ ADBC postgresql driver.
import (
   "context"
   "github.com/apache/arrow-adbc/go/adbc"
   "github.com/apache/arrow-adbc/go/adbc/drivermgr"
)
func main() {
   var drv drivermgr.Driver
   db, err := drv.NewDatabase(map[string]string{
      "driver": "adbc_driver_postgresql",
      adbc.OptionKeyURI: "postgresql://user:pass@localhost:5433/postgres",
   })
   if err != nil {
      // handle error
   }
   defer db.Close()
   cnxn, err := db.Open(context.Background())
   if err != nil {
      // handle error
   }
   defer cnxn.Close()
}
import adbc_driver_postgresql.dbapi
uri = "postgresql://user:pass@localhost:5433/postgres"
with adbc_driver_postgresql.dbapi.connect(uri) as conn:
    pass
For more examples, see PostgreSQL Recipes.
library(adbcdrivermanager)
# Use the driver manager to connect to a database
uri <- Sys.getenv("ADBC_POSTGRESQL_TEST_URI")
db <- adbc_database_init(adbcpostgresql::adbcpostgresql(), uri = uri)
con <- adbc_connection_init(db)
Supported Features¶
The PostgreSQL driver supports features defined in the ADBC API specification 1.0.0.
COPY query execution¶
The PostgreSQL driver executes queries with COPY for best performance.
PostgreSQL does not support this for all queries, however (such as SHOW).
The optimization can be disabled by the statement option
adbc.postgresql.use_copy.  For an example, see
Execute a statement without COPY.
Bulk Ingestion¶
Bulk ingestion is supported. The mapping from Arrow types to PostgreSQL types is the same as below.
Partitioned Result Sets¶
Partitioned result sets are not supported.
Transactions¶
Transactions are supported.
Type Support¶
PostgreSQL allows defining new types at runtime, so the driver must build a mapping of available types. This is currently done once at startup.
Type support is currently limited depending on the type and whether it is being read or written.
| Arrow Type | As Bind Parameter | In Bulk Ingestion [1] | 
|---|---|---|
| binary | BYTEA | BYTEA | 
| bool | BOOLEAN | BOOLEAN | 
| date32 | DATE | DATE | 
| date64 | ❌ | ❌ | 
| dictionary | (as unpacked type) | (as unpacked type, only for binary/string) | 
| duration | INTERVAL | INTERVAL | 
| float32 | REAL | REAL | 
| float64 | DOUBLE PRECISION | DOUBLE PRECISION | 
| int8 | SMALLINT | SMALLINT | 
| int16 | SMALLINT | SMALLINT | 
| int32 | INTEGER | INTEGER | 
| int64 | BIGINT | BIGINT | 
| large_binary | ❌ | ❌ | 
| large_string | TEXT | TEXT | 
| month_day_nano_interval | INTERVAL | INTERVAL | 
| string | TEXT | TEXT | 
| timestamp | TIMESTAMP [3] | TIMESTAMP/TIMESTAMP WITH TIMEZONE | 
| PostgreSQL Type | In Result Set | 
|---|---|
| ARRAY | list | 
| BIGINT | int64 | 
| BINARY | binary | 
| BOOLEAN | bool | 
| CHAR | utf8 | 
| DATE | date32 | 
| DOUBLE PRECISION | float64 | 
| INTEGER | int32 | 
| INTERVAL | month_day_nano_interval | 
| NUMERIC | utf8 [2] | 
| REAL | float32 | 
| SMALLINT | int16 | 
| TEXT | utf8 | 
| TIME | time64 | 
| TIMESTAMP WITH TIME ZONE | timestamp[unit, UTC] | 
| TIMESTAMP WITHOUT TIME ZONE | timestamp[unit] | 
| VARCHAR | utf8 | 
Unknown Types¶
Types without direct Arrow equivalents can still be returned by the driver. In this case, the Arrow type will be binary, and the contents will be the raw bytes as provided by the PostgreSQL wire protocol.
For Arrow implementations that support the Opaque canonical extension type, the extension type metadata is also always present. This helps differentiate when the driver intentionally returned a binary column from when it returned a binary column as a fallback.
Warning
Currently, the driver also attaches a metadata key named
ADBC:postgresql:typname to the schema field of the unknown
column, but this has been deprecated in favor of the Opaque type
and you should not rely on this key continuing to exist.
Software Versions¶
For Python wheels, the shipped version of the PostgreSQL client libraries is 15.2. For conda-forge packages, the version of libpq is the same as the version of libpq in your Conda environment.
The PostgreSQL driver is tested against PostgreSQL versions 11 through 16.