Package org.apache.arrow.vector
Class VectorSchemaRoot
java.lang.Object
org.apache.arrow.vector.VectorSchemaRoot
All Implemented Interfaces:
AutoCloseable
Holder for a set of vectors to be loaded/unloaded. A VectorSchemaRoot is a container that can
hold batches; batches flow through a VectorSchemaRoot as part of a pipeline. Note that this differs
from other implementations (for example, in C++ and Python a RecordBatch is a collection of equal-length
vector instances and is created anew for each batch).
The recommended usage is to create a single VectorSchemaRoot based on the known schema and populate data into that same VectorSchemaRoot over and over as a stream of batches, rather than creating a new VectorSchemaRoot instance each time (see Flight or ArrowFileWriter for a better understanding). Thus, at any one point a VectorSchemaRoot may or may not hold data (for example, the data may have been transferred downstream or may not yet have been populated).
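The reuse pattern described above can be sketched as follows. This is a minimal illustration, not the library's canonical usage: the single-column schema, the field name "id", and the class and method names are assumptions made for the example, and the point where a real pipeline would hand the batch to a writer is only marked by a comment.

```java
import java.util.Arrays;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;

public class ReuseRootExample {
  // Fills one root with several batches in turn and returns the total rows seen.
  static int run(int batches, int rowsPerBatch) {
    // Hypothetical schema: a single nullable 32-bit signed integer column named "id".
    Schema schema = new Schema(Arrays.asList(
        Field.nullable("id", new ArrowType.Int(32, true))));
    int total = 0;
    try (BufferAllocator allocator = new RootAllocator();
         VectorSchemaRoot root = VectorSchemaRoot.create(schema, allocator)) {
      IntVector id = (IntVector) root.getVector("id");
      for (int b = 0; b < batches; b++) {
        root.allocateNew();                      // (re)allocate buffers for this batch
        for (int i = 0; i < rowsPerBatch; i++) {
          id.setSafe(i, b * rowsPerBatch + i);   // setSafe grows the buffer if needed
        }
        root.setRowCount(rowsPerBatch);          // the batch is now ready for a consumer
        total += root.getRowCount();             // a writer would consume the batch here
        root.clear();                            // release memory; the vectors stay in the root
      }
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(run(3, 4));
  }
}
```

The same root instance and the same vector reference are used for every batch; only the buffer contents and the row count change between iterations.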
-
Constructor Summary
Constructor and Description
VectorSchemaRoot(Iterable<FieldVector> vectors)
    Constructs a new instance containing each of the vectors.
VectorSchemaRoot(List<Field> fields, List<FieldVector> fieldVectors)
    Constructs a new instance.
VectorSchemaRoot(List<Field> fields, List<FieldVector> fieldVectors, int rowCount)
    Constructs a new instance.
VectorSchemaRoot(FieldVector parent)
    Constructs a new instance containing the children of parent but not the parent itself.
VectorSchemaRoot(Schema schema, List<FieldVector> fieldVectors, int rowCount)
    Constructs a new instance.
-
Method Summary
Modifier and Type, Method, Description
VectorSchemaRoot addVector(int index, FieldVector vector)
    Add vector to the record batch, producing a new VectorSchemaRoot.
void allocateNew()
    Do an adaptive allocation of each vector for memory purposes.
boolean approxEquals(VectorSchemaRoot other)
    Determine if two VectorSchemaRoots are approximately equal using default functions to calculate difference between float/double values.
boolean approxEquals(VectorSchemaRoot other, VectorValueEqualizer<Float4Vector> floatDiffFunction, VectorValueEqualizer<Float8Vector> doubleDiffFunction)
    Determine if two VectorSchemaRoots are approximately equal using the given functions to calculate difference between float/double values.
void clear()
    Release all the memory for each vector held in this root.
void close()
String contentToTSVString()
    Returns a tab separated value of vectors (based on their java object representation).
static VectorSchemaRoot create(Schema schema, BufferAllocator allocator)
    Creates a new set of empty vectors corresponding to the given schema.
boolean equals(VectorSchemaRoot other)
    Determine if two VectorSchemaRoots are exactly equal.
List<FieldVector> getFieldVectors()
int getRowCount()
Schema getSchema()
FieldVector getVector(int index)
FieldVector getVector(String name)
    Gets a vector by name.
FieldVector getVector(Field field)
static VectorSchemaRoot of(FieldVector... vectors)
    Constructs a new instance from vectors.
VectorSchemaRoot removeVector(int index)
    Remove vector from the record batch, producing a new VectorSchemaRoot.
void setRowCount(int rowCount)
    Set the row count of all the vectors in this container.
VectorSchemaRoot slice(int index)
    Slice this root from desired index.
VectorSchemaRoot slice(int index, int length)
    Slice this root at desired index and length.
boolean syncSchema()
    Synchronizes the schema from the current vectors.
-
Constructor Details
-
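VectorSchemaRoot
public VectorSchemaRoot(Iterable<FieldVector> vectors)
Constructs a new instance containing each of the vectors.
-
VectorSchemaRoot
public VectorSchemaRoot(FieldVector parent)
Constructs a new instance containing the children of parent but not the parent itself.
-
VectorSchemaRoot
public VectorSchemaRoot(List<Field> fields, List<FieldVector> fieldVectors)
Constructs a new instance.
Parameters:
fields - The types of each vector.
fieldVectors - The data vectors (must be equal in size to fields).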
-
VectorSchemaRoot
public VectorSchemaRoot(List<Field> fields, List<FieldVector> fieldVectors, int rowCount)
Constructs a new instance.
Parameters:
fields - The types of each vector.
fieldVectors - The data vectors (must be equal in size to fields).
rowCount - The number of rows contained.
-
VectorSchemaRoot
public VectorSchemaRoot(Schema schema, List<FieldVector> fieldVectors, int rowCount)
Constructs a new instance.
Parameters:
schema - The schema for the vectors.
fieldVectors - The data vectors.
rowCount - The number of rows.
-
-
Method Details
-
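create
public static VectorSchemaRoot create(Schema schema, BufferAllocator allocator)
Creates a new set of empty vectors corresponding to the given schema.
-
of
public static VectorSchemaRoot of(FieldVector... vectors)
Constructs a new instance from vectors.
-
allocateNew
public void allocateNew()
Do an adaptive allocation of each vector for memory purposes. Sizes will be based on the previously defined initial allocation for each vector (and any subsequently learned size).
-
clear
public void clear()
Release all the memory for each vector held in this root. This DOES NOT remove vectors from the container.
-
getFieldVectors
public List<FieldVector> getFieldVectors()
-
getVector
public FieldVector getVector(String name)
Gets a vector by name. If name occurs multiple times, this returns the first inserted entry for name.
-
getVector
public FieldVector getVector(Field field)
-
getVector
public FieldVector getVector(int index)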
-
addVector
public VectorSchemaRoot addVector(int index, FieldVector vector)
Add vector to the record batch, producing a new VectorSchemaRoot.
Parameters:
index - field index
vector - vector to be added.
Returns:
the new VectorSchemaRoot with the vector added
-
removeVector
public VectorSchemaRoot removeVector(int index)
Remove vector from the record batch, producing a new VectorSchemaRoot.
Parameters:
index - field index
Returns:
the new VectorSchemaRoot with the vector removed
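Because addVector and removeVector each produce a new VectorSchemaRoot, the root they are called on keeps its original set of columns. A small sketch of that behavior follows; the column names "a" and "b", the values, and the class name are illustrative assumptions. The example closes the vectors directly rather than the roots, on the assumption that the roots here only reference those vectors.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorSchemaRoot;

public class AddRemoveVectorExample {
  // Returns the column counts of {original root, root after add, root after remove}.
  static int[] run() {
    try (BufferAllocator allocator = new RootAllocator();
         IntVector a = new IntVector("a", allocator);
         IntVector b = new IntVector("b", allocator)) {
      a.allocateNew(2);
      a.set(0, 1);
      a.set(1, 2);
      a.setValueCount(2);

      b.allocateNew(2);
      b.set(0, 10);
      b.set(1, 20);
      b.setValueCount(2);

      VectorSchemaRoot original = VectorSchemaRoot.of(a);
      // Insert "b" at field index 1; "original" itself is left with one column.
      VectorSchemaRoot widened = original.addVector(1, b);
      // Drop the column at field index 0 ("a"), leaving only "b".
      VectorSchemaRoot narrowed = widened.removeVector(0);

      return new int[] {
        original.getFieldVectors().size(),
        widened.getFieldVectors().size(),
        narrowed.getFieldVectors().size()
      };
    }
  }

  public static void main(String[] args) {
    int[] counts = run();
    System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
  }
}
```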
-
getSchema
public Schema getSchema()
-
getRowCount
public int getRowCount()
-
setRowCount
public void setRowCount(int rowCount)
Set the row count of all the vectors in this container. Also sets the value count for each root-level FieldVector it contains.
Parameters:
rowCount - Number of records.
-
close
public void close()
Specified by:
close in interface AutoCloseable
-
contentToTSVString
public String contentToTSVString()
Returns a tab separated value of vectors (based on their java object representation).
-
syncSchema
public boolean syncSchema()
Synchronizes the schema from the current vectors. In some cases, the schema and the actual vector structure may differ. This can be caused by a promoted writer (for details, see PromotableWriter). For example, writing different types of data to a ListVector may lead to such a case. When this happens, call this method to bring the schema and the vector structure into a synchronized state.
Returns:
true if the schema is updated, false otherwise.
-
slice
public VectorSchemaRoot slice(int index)
Slice this root from desired index.
Parameters:
index - start position of the slice
Returns:
the sliced root
-
slice
public VectorSchemaRoot slice(int index, int length)
Slice this root at desired index and length.
Parameters:
index - start position of the slice
length - length of the slice
Returns:
the sliced root
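As an illustration of slice(int index, int length), the sketch below builds a five-row root and takes three rows starting at position 1. The column name "id", the values, and the class name are assumptions made for the example.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorSchemaRoot;

public class SliceExample {
  // Returns {row count of the slice, first value in the slice}.
  static int[] run() {
    try (BufferAllocator allocator = new RootAllocator()) {
      IntVector id = new IntVector("id", allocator);
      id.allocateNew(5);
      for (int i = 0; i < 5; i++) {
        id.set(i, i * 10);           // values 0, 10, 20, 30, 40
      }
      id.setValueCount(5);

      // Closing a root also closes the vectors it holds, including "id".
      try (VectorSchemaRoot root = VectorSchemaRoot.of(id);
           VectorSchemaRoot sliced = root.slice(1, 3)) {
        IntVector slicedId = (IntVector) sliced.getVector(0);
        return new int[] {sliced.getRowCount(), slicedId.get(0)};
      }
    }
  }

  public static void main(String[] args) {
    int[] result = run();
    System.out.println(result[0] + " rows, first value " + result[1]);
  }
}
```

The sliced root is closed separately from the source root in this sketch, since it is returned as its own VectorSchemaRoot instance.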
-
equals
public boolean equals(VectorSchemaRoot other)
Determine if two VectorSchemaRoots are exactly equal.
-
approxEquals
public boolean approxEquals(VectorSchemaRoot other, VectorValueEqualizer<Float4Vector> floatDiffFunction, VectorValueEqualizer<Float8Vector> doubleDiffFunction)
Determine if two VectorSchemaRoots are approximately equal using the given functions to calculate difference between float/double values. Note that approximate equality applies only to floating point values; all other values are compared for exact equality.
Parameters:
floatDiffFunction - function to calculate difference between float values.
doubleDiffFunction - function to calculate difference between double values.
-
approxEquals
public boolean approxEquals(VectorSchemaRoot other)
Determine if two VectorSchemaRoots are approximately equal using default functions to calculate difference between float/double values.
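The difference between equals and the single-argument approxEquals can be sketched with two one-row roots whose double values differ slightly. The column name "x", the helper method, and the class name are hypothetical; the example also assumes that a difference of 1e-9 falls within the default tolerance, which this page does not state.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.Float8Vector;
import org.apache.arrow.vector.VectorSchemaRoot;

public class ApproxEqualsExample {
  // Hypothetical helper: builds a root with one double column "x" holding one value.
  static VectorSchemaRoot single(BufferAllocator allocator, double value) {
    Float8Vector x = new Float8Vector("x", allocator);
    x.allocateNew(1);
    x.set(0, value);
    x.setValueCount(1);
    return VectorSchemaRoot.of(x);
  }

  // Returns {result of equals, result of approxEquals}.
  static boolean[] run() {
    try (BufferAllocator allocator = new RootAllocator();
         VectorSchemaRoot left = single(allocator, 1.0);
         VectorSchemaRoot right = single(allocator, 1.0 + 1e-9)) {
      // Assumption: 1e-9 is smaller than the default double tolerance.
      return new boolean[] {left.equals(right), left.approxEquals(right)};
    }
  }

  public static void main(String[] args) {
    boolean[] result = run();
    System.out.println("equals=" + result[0] + ", approxEquals=" + result[1]);
  }
}
```

When the default tolerance is not appropriate, the three-argument overload accepts caller-supplied VectorValueEqualizer functions for float and double values.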
-