The Arrow R package uses several additional development tools:
-
lintrfor code analysis -
stylerfor code styling -
pkgdownfor building the website -
roxygen2for documenting the package- the R documentation uses the
@examplesIftag introduced inroxygen2version 7.1.2
- the R documentation uses the
You can install all these additional dependencies by running:
install.packages(c("lintr", "styler", "pkgdown", "roxygen2"))The arrow/r directory contains a Makefile
to help with some common tasks from the command line
(e.g. make test, make doc,
make clean, etc.).
Rebuilding the documentation
The R documentation uses the @examplesIf
tag introduced in roxygen2 version 7.1.2.
remotes::install_github("r-lib/roxygen2")You can use devtools::document() and
pkgdown::build_site() to rebuild the documentation and
preview the results.
# Update roxygen documentation
devtools::document()
# To preview the documentation website
pkgdown::build_site(preview=TRUE)Styling and linting
Styling and linting can be set up and performed entirely with the pre-commit tool:
pre-commit run --show-diff-on-failure --color=always --all-files rSee also the following subsections our styling and lint details for R and C++ codes.
R code
The R code in the package follows the tidyverse style. On PR submission (and on pushes) our CI will run linting and will flag possible errors on the pull request with annotations.
You can automatically change the formatting of the code in the package using the styler package.
The styler package will fix many styling errors, thought not all
lintr errors are automatically fixable with styler. The list of files we
intentionally do not style is in r/.styler_excludes.R.
Linting and styling with pre-commit as described above is the best way to ensure your changes are being checked properly but you can also run the tools individually if you prefer:
lintr::lint_package() # for linting
styler::style_pkg() # for stylingNote: To run lintr, we require the cyclocomp package to
be installed first.
C++ code
The arrow package uses some customized tools on top of cpp11 to prepare its C++ code in
src/. This is because there are some features that are only
enabled and built conditionally during build time. If you change C++
code in the R package, you will need to set the ARROW_R_DEV
environment variable to true (optionally, add it to your
~/.Renviron file to persist across sessions) so that the
data-raw/codegen.R file is used for code generation. The
Makefile commands also handles this automatically.
We use Google C++ style in our C++ code. The easiest way to
accomplish this is use an editors/IDE that formats your code for you.
Many popular editors/IDEs have support for running
clang-format on C++ files when you save them.
Installing/enabling the appropriate plugin may save you much
frustration.
Running tests
Tests can be run either using devtools::test() or the
Makefile alternative.
# Run the test suite, optionally filtering file names
devtools::test(filter="^regexp$")# or the Makefile alternative from the arrow/r directory in a shell:
make test file=regexpSome tests are conditionally enabled based on the availability of certain features in the package build (S3 support, compression libraries, etc.). Others are generally skipped by default but can be enabled with environment variables or other settings:
All tests are skipped on Linux if the package builds without the C++ libarrow. To make the build fail if libarrow is not available (as in, to test that the C++ build was successful), set
TEST_R_WITH_ARROW=trueSome tests are disabled unless
ARROW_R_DEV=trueTests that require allocating >2GB of memory to test Large types are disabled unless
ARROW_LARGE_MEMORY_TESTS=trueIntegration tests against a real S3 bucket are disabled unless credentials are set in
AWS_ACCESS_KEY_IDandAWS_SECRET_ACCESS_KEY; these are available on requestS3 tests using MinIO locally are enabled if the
minio serverprocess is found running. If you’re running MinIO with custom settings, you can setMINIO_ACCESS_KEY,MINIO_SECRET_KEY, andMINIO_PORTto override the defaults.
Running checks
You can run package checks by using devtools::check()
and check test coverage with covr::package_coverage().
# All package checks
devtools::check()
# See test coverage statistics
covr::report()
covr::package_coverage()For full package validation, you can run the following commands from a terminal.
R CMD build .
R CMD check arrow_*.tar.gz --as-cranRunning extended CI checks
On a pull request, there are some actions you can trigger by commenting on the PR. These extended CI checks are run nightly and can also be requested on-demand using an internal tool called crossbow. A few important GitHub comment commands are shown below.
Run all extended R CI tasks
@github-actions crossbow submit -g r
This runs each of the R-related CI tasks.
Run a specific task
@github-actions crossbow submit {task-name}
See the r: group definition near the beginning of the crossbow
configuration for a list of glob expression patterns that match
names of items in the tasks: list below it.