Wrapper around JsonTableReader to read a newline-delimited JSON (ndjson) file into a data frame or Arrow Table.
read_json_arrow(
  file,
  col_select = NULL,
  as_data_frame = TRUE,
  schema = NULL,
  ...
)
file: A character file name or URI, a raw vector, an Arrow input stream, or a FileSystem with path (SubTreeFileSystem). If a file name is given, a memory-mapped Arrow InputStream will be opened and closed when finished; compression will be detected from the file extension and handled automatically. If an input stream is provided, it will be left open.
col_select: A character vector of column names to keep, as in the "select" argument to data.table::fread(), or a tidy selection specification of columns, as used in dplyr::select().
as_data_frame: Should the function return a data.frame (default) or an Arrow Table?
schema: Schema that describes the table.
...: Additional options passed to JsonTableReader$create().
A data.frame, or a Table if as_data_frame = FALSE.
When passed a path, the function detects and handles compression based on the file extension (e.g. .json.gz). Both explicit nulls (a key whose value is null) and implicit nulls (a key that is absent from a record) are accepted.
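The compression and null handling described here can be sketched as follows. This is a minimal illustration, not part of the original documentation: the file contents and the names `tf`, `con`, and `df` are assumptions, and the read is guarded with `codec_is_available("gzip")` because gzip support depends on how the Arrow library was built.

```r
library(arrow)

# Hypothetical file; the .json.gz extension is what lets the reader
# detect gzip compression
tf <- tempfile(fileext = ".json.gz")

# Write three gzip-compressed ndjson records using base R
con <- gzfile(tf, "w")
writeLines(c(
  '{ "hello": 3.5, "world": false }',  # value present
  '{ "hello": 3.25, "world": null }',  # explicit null
  '{ "hello": 0.0 }'                   # implicit null: "world" is absent
), con)
close(con)

# Only attempt the read if this Arrow build includes the gzip codec
if (codec_is_available("gzip")) {
  df <- read_json_arrow(tf)
  # Both kinds of null should surface as NA in the data frame
  print(is.na(df$world))
}

unlink(tf)
```

Passing the path, rather than an already opened input stream, is what allows the extension-based detection to apply.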
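# Write three newline-delimited JSON records to a temporary file
tf <- tempfile()
on.exit(unlink(tf))
writeLines('
{ "hello": 3.5, "world": false, "yo": "thing" }
{ "hello": 3.25, "world": null }
{ "hello": 0.0, "world": true, "yo": null }
', tf, useBytes = TRUE)
# Read the file into a data frame; missing and null values become NA
df <- read_json_arrow(tf)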
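As a further sketch, the remaining arguments might be used as shown below. The file contents mirror the example above, and the names `tab`, `df_sel`, and `df_typed` are illustrative assumptions. The explicit schema simply restates the types one would expect to be inferred for this data.

```r
library(arrow)

# Illustrative ndjson file (assumed contents, same records as above)
tf <- tempfile(fileext = ".json")
writeLines(c(
  '{ "hello": 3.5, "world": false, "yo": "thing" }',
  '{ "hello": 3.25, "world": null }',
  '{ "hello": 0.0, "world": true, "yo": null }'
), tf)

# Return an Arrow Table instead of a data.frame
tab <- read_json_arrow(tf, as_data_frame = FALSE)

# Keep only two columns, selected by name
df_sel <- read_json_arrow(tf, col_select = c("hello", "world"))

# Supply the column types explicitly rather than relying on inference
df_typed <- read_json_arrow(
  tf,
  schema = schema(hello = float64(), world = boolean(), yo = utf8())
)

unlink(tf)
```

Returning a Table can be useful when the data will be processed further with Arrow before being converted to an R data frame.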