A ChunkedArray is a data structure managing a list of primitive Arrow Arrays logically as one large array. Chunked arrays may be grouped together in a Table.
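
As a rough illustration of the relationship to Table described above, the sketch below builds a two-chunk array and places it in a Table as a column. The column name `score` and the variable names are made up for this example, and it assumes the arrow package is installed.

```r
library(arrow)

# Two R vectors become two chunks of one logical array.
scores <- chunked_array(c(87, 88, 89), c(94, 93, 92))

# Group the chunked array into a Table under a hypothetical column name.
tab <- Table$create(score = scores)

# Pulling the column back out gives a ChunkedArray again.
col <- tab$score
inherits(col, "ChunkedArray")
tab$num_rows
```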

Usage

chunked_array(..., type = NULL)

Arguments

...

Vectors to coerce

type

currently ignored

Factory

The ChunkedArray$create() factory method instantiates the object from various Arrays or R vectors. chunked_array() is an alias for it.
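
A minimal sketch of the two entry points, assuming the arrow package is installed; the variable names are illustrative only. It builds the same data through the factory method and through the alias, then from existing Arrow Arrays.

```r
library(arrow)

# Build the same two-chunk array through the factory method and its alias.
via_factory <- ChunkedArray$create(c(1, 2, 3), c(4, 5))
via_alias <- chunked_array(c(1, 2, 3), c(4, 5))

via_factory$num_chunks
via_alias$num_chunks

# Existing Arrow Arrays can be supplied as chunks as well.
from_arrays <- chunked_array(Array$create(c(1, 2, 3)), Array$create(c(4, 5)))
from_arrays$length()
```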

Methods

  • $length(): Number of elements the array contains

  • $chunk(i): Extract an Array chunk by integer position

  • $nbytes(): Total number of bytes consumed by the elements of the array

  • $as_vector(): convert to an R vector

  • $Slice(offset, length = NULL): Construct a zero-copy slice of the array with the indicated offset and length. If length is NULL, the slice goes until the end of the array.

  • $Take(i): return a ChunkedArray with values at positions given by integers i. If i is an Arrow Array or ChunkedArray, it will be coerced to an R vector before taking.

  • $Filter(i, keep_na = TRUE): return a ChunkedArray with values at positions where logical vector or Arrow boolean-type (Chunked)Array i is TRUE.

  • $SortIndices(descending = FALSE): return an Array of integer positions that can be used to rearrange the ChunkedArray in ascending or descending order

  • $cast(target_type, safe = TRUE, options = cast_options(safe)): Alter the data in the array to change its type.

  • $null_count: The number of null entries in the array

  • $chunks: return a list of Arrays

  • $num_chunks: integer number of chunks in the ChunkedArray

  • $type: logical type of data

  • $View(type): Construct a zero-copy view of this ChunkedArray with the given type.

  • $Validate(): Perform validation checks to detect obvious inconsistencies within the array's internal data. This can be an expensive check, potentially O(length).
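
A short sketch that exercises several of the methods and fields listed above, assuming the arrow package is installed. The data and variable names are invented for illustration, and the filter mask is a plain R logical vector covering all five positions.

```r
library(arrow)

# Two chunks, with one missing value in the first.
scores <- chunked_array(c(87, NA, 89), c(94, 93))

scores$length()      # total elements across both chunks
scores$num_chunks    # number of underlying Arrays
scores$null_count    # the NA is stored as a null

# Keep the positions where the logical mask is TRUE.
kept <- scores$Filter(c(TRUE, FALSE, TRUE, TRUE, FALSE))
kept$as_vector()

# Change the element type; the values here are whole numbers,
# so a safe cast to int32 does not lose information.
as_int <- scores$cast(int32())
as_int$type$ToString()
```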

Examples

# Pass items into chunked_array as separate objects to create chunks
class_scores <- chunked_array(c(87, 88, 89), c(94, 93, 92), c(71, 72, 73))
class_scores$num_chunks
#> [1] 3

# When taking a Slice from a chunked_array, chunks are preserved
class_scores$Slice(2, length = 5)
#> ChunkedArray
#> <double>
#> [
#>   [
#>     89
#>   ],
#>   [
#>     94,
#>     93,
#>     92
#>   ],
#>   [
#>     71
#>   ]
#> ]

# You can combine Take and SortIndices to return a ChunkedArray with 1 chunk
# containing all values, ordered.
class_scores$Take(class_scores$SortIndices(descending = TRUE))
#> ChunkedArray
#> <double>
#> [
#>   [
#>     94,
#>     93,
#>     92,
#>     89,
#>     88,
#>     87,
#>     73,
#>     72,
#>     71
#>   ]
#> ]

# If you pass a list into chunked_array, you get a ChunkedArray with a single chunk
list_scores <- chunked_array(list(c(9.9, 9.6, 9.5), c(8.2, 8.3, 8.4), c(10.0, 9.9, 9.8)))
list_scores$num_chunks
#> [1] 1

# When constructing a ChunkedArray, the first chunk is used to infer type.
doubles <- chunked_array(c(1, 2, 3), c(5L, 6L, 7L))
doubles$type
#> Float64
#> double

# Concatenating chunked arrays returns a new chunked array containing all chunks
a <- chunked_array(c(1, 2), 3)
b <- chunked_array(c(4, 5), 6)
c(a, b)
#> ChunkedArray
#> <double>
#> [
#>   [
#>     1,
#>     2
#>   ],
#>   [
#>     3
#>   ],
#>   [
#>     4,
#>     5
#>   ],
#>   [
#>     6
#>   ]
#> ]