adbcdrivermanager provides a low-level, developer-facing interface to Arrow Database Connectivity (ADBC). It is intended for driver development, testing, and support of user-facing packages that rely on ADBC drivers.
Installation
You can install the released version of adbcdrivermanager from CRAN with:
install.packages("adbcdrivermanager")
You can install the development version of adbcdrivermanager from GitHub with:
# install.packages("pak")
pak::pak("apache/arrow-adbc/r/adbcdrivermanager")
ADBC drivers for R use a relatively new feature of pkgbuild to enable installation from GitHub via pak. Depending on when you installed pak, you may need to update its internal version of pkgbuild and clear the pak cache:
install.packages("pkgbuild", pak:::private_lib_dir())
pak::cache_clean()
Example
This basic example walks through the full lifecycle of an ADBC query (database, connection, statement, result stream) using the monkey driver bundled with the package:
library(adbcdrivermanager)
# Get a reference to a database using a driver. The adbcdrivermanager package
# contains a few drivers useful for illustration and testing.
db <- adbc_database_init(adbc_driver_monkey())
# Open a new connection to a database
con <- adbc_connection_init(db)
# Initialize a new statement from a connection
stmt <- adbc_statement_init(con)
# The monkey driver allows you to specify the data for a query
# in advance for testing purposes
adbc_statement_bind_stream(stmt, nycflights13::flights)
# Set the query
adbc_statement_set_sql_query(stmt, "SELECT * FROM flights")
# Start executing the query. Results in ADBC are ArrowArrayStream objects,
# which can be materialized using as.data.frame(), as_tibble(),
# or converted to an arrow::RecordBatchReader using
# arrow::as_record_batch_reader()
stream <- nanoarrow::nanoarrow_allocate_array_stream()
adbc_statement_execute_query(stmt, stream)
#> [1] -1
# The return value is the number of rows affected; -1 means it is unknown
# Materialize the whole query as a tibble
tibble::as_tibble(stream)
#> # A tibble: 336,776 × 19
#> year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
#> <int> <int> <int> <int> <int> <dbl> <int> <int>
#> 1 2013 1 1 517 515 2 830 819
#> 2 2013 1 1 533 529 4 850 830
#> 3 2013 1 1 542 540 2 923 850
#> 4 2013 1 1 544 545 -1 1004 1022
#> 5 2013 1 1 554 600 -6 812 837
#> 6 2013 1 1 554 558 -4 740 728
#> 7 2013 1 1 555 600 -5 913 854
#> 8 2013 1 1 557 600 -3 709 723
#> 9 2013 1 1 557 600 -3 838 846
#> 10 2013 1 1 558 600 -2 753 745
#> # ℹ 336,766 more rows
#> # ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
#> # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
#> # hour <dbl>, minute <dbl>, time_hour <dttm>
# Clean up!
adbc_statement_release(stmt)
adbc_connection_release(con)
adbc_database_release(db)
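If a step in the middle of the example fails, the release calls at the end never run. One way to guard against that, sketched here with base R's `on.exit()`, is to register each release as soon as the resource is created. The helper name `run_monkey_query()` and the query text are hypothetical and not part of adbcdrivermanager; only the `adbc_*()` calls shown in the example above are used.

```r
library(adbcdrivermanager)

# Hypothetical helper (not part of adbcdrivermanager): runs a query against
# the monkey driver and releases every resource, even if a step fails.
run_monkey_query <- function(data, query) {
  db <- adbc_database_init(adbc_driver_monkey())
  # after = FALSE puts each new handler first, so resources are released
  # in reverse order of creation: statement, connection, database.
  on.exit(adbc_database_release(db), add = TRUE, after = FALSE)

  con <- adbc_connection_init(db)
  on.exit(adbc_connection_release(con), add = TRUE, after = FALSE)

  stmt <- adbc_statement_init(con)
  on.exit(adbc_statement_release(stmt), add = TRUE, after = FALSE)

  # The monkey driver returns the bound data regardless of the query text
  adbc_statement_bind_stream(stmt, data)
  adbc_statement_set_sql_query(stmt, query)

  stream <- nanoarrow::nanoarrow_allocate_array_stream()
  adbc_statement_execute_query(stmt, stream)

  # Materialize the result before the exit handlers release the statement
  as.data.frame(stream)
}

result <- run_monkey_query(data.frame(x = 1:3), "SELECT * FROM example")
```

The result is materialized inside the function on purpose: the stream is fully consumed before the statement, connection, and database are released.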