Roadmap
Apache Arrow nanoarrow is a relatively new library and is under active development. Maintaining the balance between useful and minimal is difficult; however, there are a number of features that fit comfortably within the scope of nanoarrow that have not yet been scheduled for implementation.
C library
Type coverage: The C library currently provides support for all types that are available via the Arrow C Data interface except string view/list view types. Support for these should be added in nanoarrow as well (#583, #616, #510).
Remove Arrow C++ dependency for tests: The C library and IPC extension rely on Arrow C++ for some test code that was written early in the library’s development. These tests are valuable to ensure compatibility between nanoarrow and Arrow C++; however, including them in the default test suite complicates release verification for some users and prevents testing in environments where Arrow C++ does not currently build (e.g., WASM, compilers without C++17 support) (#619).
Test verbosity: Tests for the C library were written before testing utilities in the nanoarrow_testing library were available (and before there was a nanoarrow_testing library in which to put new ones). As a result, some of them are very verbose and can be difficult to read, which can and should be improved (#577, #566).
C++ integration: The existing C++ integration is intentionally minimal; however, there are likely improvements that could be made to better integrate nanoarrow into existing C++ projects (#599).
Documentation: As the C library and its user base evolve, documentation needs to be refined and expanded to support the current set of use cases (#187, #497).
IPC Dictionary support: The IPC extension does not currently support reading dictionary messages from an IPC stream (#622).
IPC Compression support: The IPC extension does not currently support compressed streams using per-buffer compression, although streams can be compressed outside the nanoarrow library (e.g., gzip compression of the entire stream) (#621).
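As a rough illustration of the whole-stream workaround mentioned in the IPC compression item, the sketch below gzips a byte stream and then exposes the decompressed bytes as a file-like object that a stream reader could consume. It uses only the Python standard library. The payload is placeholder bytes, not a real IPC stream, and the helper names `compress_stream` and `open_decompressed` are hypothetical and are not part of nanoarrow.

```python
import gzip
import io

# Placeholder bytes standing in for an Arrow IPC stream (hypothetical payload;
# a real stream would come from an IPC writer).
ipc_stream_bytes = b"\xff\xff\xff\xff" + b"example-ipc-payload" * 64


def compress_stream(raw: bytes) -> bytes:
    """Gzip-compress an entire byte stream in one pass (hypothetical helper)."""
    return gzip.compress(raw)


def open_decompressed(compressed: bytes) -> gzip.GzipFile:
    """Return a file-like object that yields the original, uncompressed bytes."""
    return gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb")


compressed = compress_stream(ipc_stream_bytes)

# Decompression happens outside the IPC reader; the reader would only ever
# see the uncompressed bytes produced by this file-like object.
with open_decompressed(compressed) as stream:
    restored = stream.read()
```

Because compression wraps the whole stream, the reader needs no knowledge of it. The trade-off is that individual buffers cannot be decompressed selectively, which is what per-buffer compression would allow.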
R bindings
Conversion internals: The initial conversion from Arrow data to R vectors was implemented in C, and its verbosity makes it difficult to add support for new types. The internals should be refactored to make the conversion code easier to understand for new developers (#392).
Type support: The R bindings currently rely on the arrow R package for conversion of some R types (e.g., list_of), and some types are supported by neither nanoarrow nor the arrow R package (e.g., run-end encoding, list view, and string/binary view) (#617).
ALTREP support: A recent R release added enhanced ALTREP support such that types that convert to list() can defer materialization cost/allocation. Arrow sources that arrive in chunks (e.g., from a Table or ChunkedArray) currently can’t be converted via any ALTREP mechanism and support could be added (#219).