A Table is a sequence of chunked arrays. Tables have a similar interface to record batches, but they can be composed from multiple record batches or chunked arrays.
The Table$create() function takes the following arguments:
...: arrays, chunked arrays, or R vectors, with names; alternatively,
an unnamed series of record batches may be provided,
which will be stacked as rows in the table.
schema: a Schema, or NULL (the default) to infer the schema from
the data in ...
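As a minimal sketch of the two ways of calling Table$create() described above: the column names and values are made up for illustration, and the explicit schema deliberately uses the same types that would be inferred (int32 and utf8), so it is not meant to demonstrate type conversion.

```r
library(arrow)

# Build a Table from named R vectors; the schema is inferred from the data.
tab_inferred <- Table$create(
  id = 1:3,
  label = c("a", "b", "c")
)

# Supply an explicit Schema instead of relying on inference.
# (Illustrative: these types match what inference would produce.)
tab_explicit <- Table$create(
  id = 1:3,
  label = c("a", "b", "c"),
  schema = schema(id = int32(), label = utf8())
)

# Alternatively, pass unnamed record batches; their rows are stacked.
batch_1 <- record_batch(id = 1:2, label = c("a", "b"))
batch_2 <- record_batch(id = 3:5, label = c("c", "d", "e"))
tab_stacked <- Table$create(batch_1, batch_2)

nrow(tab_stacked)
names(tab_stacked)
```

The stacked table has as many rows as the two batches combined, and the batches must share the same columns.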
Tables are data-frame-like, and many methods you expect to work on
a data.frame are implemented for Table. This includes [, [[,
$, names, dim, nrow, ncol, head, and tail. You can also pull
the data from an Arrow table into R with as.data.frame(). See the
examples.
A caveat about the $ method: because Table is an R6 object,
$ is also used to access the object's methods (see below). Methods take
precedence over the table's columns. So, tab$Slice would return the
"Slice" method function even if there were a column in the table called
"Slice".
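The caveat can be seen in a small sketch. The table below is hypothetical and has a column deliberately named "Slice"; [[ and $GetColumnByName() are used as ways to reach the column that do not go through method lookup.

```r
library(arrow)

# A table with a column whose name collides with an R6 method name.
tab <- Table$create(Slice = 1:3, other = c("a", "b", "c"))

# `$` finds the method first, so this is a function, not the column.
is.function(tab$Slice)

# `[[` with a string, or $GetColumnByName(), returns the column itself.
col_by_bracket <- tab[["Slice"]]
col_by_name <- tab$GetColumnByName("Slice")

as.vector(col_by_bracket)
```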
In addition to the more R-friendly S3 methods, a Table object has
the following R6 methods that map onto the underlying C++ methods:
$column(i): Extract a ChunkedArray by integer position from the table
$ColumnNames(): Get all column names (called by names(tab))
$GetColumnByName(name): Extract a ChunkedArray by string name
$field(i): Extract a Field from the table schema by integer position
$SelectColumns(indices): Return a new Table containing only the specified columns, given as 0-based integer positions.
$Slice(offset, length = NULL): Create a zero-copy view starting at the
indicated integer offset and going for the given length, or to the end
of the table if NULL, the default.
$Take(i): Return a Table with rows at positions given by
integers i. If i is an Arrow Array or ChunkedArray, it will be
coerced to an R vector before taking.
$Filter(i, keep_na = TRUE): Return a Table with rows at positions where logical
vector or Arrow boolean-type (Chunked)Array i is TRUE.
$serialize(output_stream, ...): Write the table to the given
OutputStream
$cast(target_schema, safe = TRUE, options = cast_options(safe)): Alter
the schema of the table.
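The following sketch exercises several of the R6 methods listed above on a small made-up table. It assumes that the positions passed to $column(), $Slice() and $Take() are 0-based, in the same way the text states for $SelectColumns(); the column names and values are illustrative only.

```r
library(arrow)

tab <- Table$create(
  id = 1:5,
  score = c(0.5, 1.5, 2.5, 3.5, 4.5),
  grp = c("a", "b", "a", "b", "a")
)

# Column names, as also returned by names(tab).
tab$ColumnNames()

# Extract a ChunkedArray by (assumed 0-based) position or by name.
first_col <- tab$column(0L)
score_col <- tab$GetColumnByName("score")

# Keep the first and third columns (0-based integer positions).
sub_cols <- tab$SelectColumns(c(0L, 2L))

# Zero-copy view: skip one row, then take two rows.
sliced <- tab$Slice(1, 2)

# Rows at (assumed 0-based) positions 0 and 4.
taken <- tab$Take(c(0L, 4L))

# Rows where the logical vector is TRUE.
filtered <- tab$Filter(c(TRUE, FALSE, TRUE, FALSE, TRUE))

names(sub_cols)
as.vector(sliced$GetColumnByName("id"))
nrow(filtered)
```

In everyday R code the S3 methods ([, head, and so on) are usually more convenient and use R's 1-based indexing; the R6 methods are closer to the underlying C++ interface.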
There are also some active bindings:
$num_columns
$num_rows
$schema
$metadata: Returns the key-value metadata of the Schema as a named list.
Modify or replace by assigning in (tab$metadata <- new_metadata).
All list elements are coerced to string.
$columns: Returns a list of ChunkedArrays
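A short sketch of the active bindings, using a made-up two-column table. The metadata keys `source` and `version` are hypothetical; the numeric value is included only to show the coercion to string described above.

```r
library(arrow)

tab <- Table$create(x = 1:3, y = c("a", "b", "c"))

# Active bindings are accessed like fields, without parentheses.
tab$num_columns
tab$num_rows
tab$schema

# Replace the schema's key-value metadata by assigning in.
tab$metadata <- list(source = "example", version = 2)

# Values come back as strings, including the one assigned as a number.
tab$metadata$source
is.character(tab$metadata$version)

# One ChunkedArray per column.
length(tab$columns)
```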
#> [1] 32 12
#> [1]  6 12
#>  [1] "name" "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"
#> [11] "gear" "carb"
tab$mpg
#> ChunkedArray
#> [
#>   [
#>     21,
#>     21,
#>     22.8,
#>     21.4,
#>     18.7,
#>     18.1,
#>     14.3,
#>     24.4,
#>     22.8,
#>     19.2,
#>     ...
#>     15.2,
#>     13.3,
#>     19.2,
#>     27.3,
#>     26,
#>     30.4,
#>     15.8,
#>     19.7,
#>     15,
#>     21.4
#>   ]
#> ]
tab[["cyl"]]
#> ChunkedArray
#> [
#>   [
#>     6,
#>     6,
#>     4,
#>     6,
#>     8,
#>     6,
#>     8,
#>     4,
#>     4,
#>     6,
#>     ...
#>     8,
#>     8,
#>     8,
#>     4,
#>     4,
#>     4,
#>     8,
#>     6,
#>     8,
#>     4
#>   ]
#> ]
#> # A tibble: 5 x 3
#>    gear    hp    wt
#>   <dbl> <dbl> <dbl>
#> 1     3   110  3.22
#> 2     3   175  3.44
#> 3     3   105  3.46
#> 4     3   245  3.57
#> 5     4    62  3.19