Apache Arrow nanoarrow is a relatively new library and is under active development. Balancing usefulness against minimalism is difficult; however, a number of features fit comfortably within the scope of nanoarrow and have not yet been scheduled for implementation.
Type coverage: The C library currently provides support for all types that are available via the Arrow C Data interface. When the recently-added run-end encoded (REE) types and potentially forthcoming string view/list view types are available via the Arrow C Data interface, support should be added in nanoarrow as well.
Array append: The ArrowArrayAppend*() family of functions provides a means to build arrays incrementally; however, there is no built-in way to append an ArrowArrayView, which could append multiple values at once and is potentially more efficient. Among other things, this would provide a route to an unoptimized filter/take implementation.
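The link between appending and filter/take can be sketched in a few lines. The example below is a language-neutral illustration written in Python, not nanoarrow code: the helpers take() and filter_values() are hypothetical, a plain list stands in for an array, and None stands in for a null. In the C library, each out.append() would correspond to one ArrowArrayAppend*() call.

```python
from typing import List, Optional, Sequence


def take(values: Sequence[Optional[int]],
         indices: Sequence[int]) -> List[Optional[int]]:
    """Unoptimized take: append one selected element at a time.

    Hypothetical helper; a list stands in for an Arrow array and
    None stands in for a null element.
    """
    out: List[Optional[int]] = []
    for i in indices:
        # One append per selected element, nulls included.
        out.append(values[i])
    return out


def filter_values(values: Sequence[Optional[int]],
                  mask: Sequence[bool]) -> List[Optional[int]]:
    """Filter expressed as a take over the positions where mask is true."""
    indices = [i for i, keep in enumerate(mask) if keep]
    return take(values, indices)
```

Because every selected element costs one append call, this approach is simple but slow for long runs of consecutive indices, which is where appending a whole range at once would help.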
Remove Arrow C++ dependency for tests: The C library and IPC extension rely on Arrow C++ for some test code that was written early in the library’s development. These tests are valuable to ensure compatibility between nanoarrow and Arrow C++; however, including them in the default test suite complicates release verification for some users and prevents testing in environments where Arrow C++ does not currently build (e.g., WASM, compilers without C++17 support).
C++ integration: The existing C++ integration is intentionally minimal; however, there are likely improvements that could be made to better integrate nanoarrow into existing C++ projects.
Documentation: As the C library and its user base evolve, documentation needs to be refined and expanded to support the current set of use cases.
Write support: The IPC extension currently provides support for reading IPC streams but not writing them.
Dictionary support: The IPC extension does not currently support reading dictionary messages from an IPC stream.
Compression: The IPC extension does not currently support compressed streams.
This entire extension is currently experimental and awaiting use-cases that will drive future development.
Type support: The R bindings currently do not provide support for extension types and rely on Arrow C++ for some dictionary-encoded types.
ALTREP support: A recent R release added enhanced ALTREP support such that types that convert to list() can defer materialization cost/allocation. Arrow sources that arrive in chunks (e.g., from a ChunkedArray) currently can’t be converted via any ALTREP mechanism; support for this could be added.
IPC support: The IPC reader is not currently exposed in the R bindings.
Packaging: The Python bindings are currently unpublished (on neither PyPI nor conda) and are not included in release verification.
Element conversion: There is currently no mechanism to extract an element of an ArrowArrayView as a Python object (e.g., an
NumPy/pandas conversion: The Python bindings currently expose the ArrowArrayView but do not provide a means by which to convert to popular packages such as NumPy or pandas.
Creating arrays: The Python bindings do not currently provide a means by which to create an ArrowArray from buffers or incrementally.
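To make concrete what creating an array from buffers and extracting an element involve, here is a minimal pure-Python sketch of the Arrow layout for an int32 array: a validity bitmap (one bit per element, least-significant bit first) plus a little-endian data buffer. The functions build_int32_buffers() and get_int32() are hypothetical and are not part of the nanoarrow Python bindings. The sketch also skips the buffer padding that Arrow recommends.

```python
import struct
from typing import List, Optional, Tuple


def build_int32_buffers(values: List[Optional[int]]) -> Tuple[bytes, bytes]:
    """Return (validity bitmap, data buffer) for an int32 array.

    Hypothetical helper illustrating the layout only; no padding is added.
    """
    validity = bytearray((len(values) + 7) // 8)
    data = bytearray()
    for i, value in enumerate(values):
        if value is not None:
            # Set bit i, counting from the least-significant bit of each byte.
            validity[i // 8] |= 1 << (i % 8)
        # Null slots still occupy four bytes; their content is arbitrary.
        data += struct.pack("<i", 0 if value is None else value)
    return bytes(validity), bytes(data)


def get_int32(validity: bytes, data: bytes, i: int) -> Optional[int]:
    """Extract element i as a Python int, or None if the element is null."""
    if not (validity[i // 8] >> (i % 8)) & 1:
        return None
    return struct.unpack_from("<i", data, i * 4)[0]
```

A real implementation would also need to handle array offsets, arrays without a validity buffer, and every other type, which is part of why element conversion and array creation are listed as separate roadmap items.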
IPC support: The IPC reader is not currently exposed in the Python bindings.