- All Superinterfaces:
AutoCloseable, Closeable, Iterable<ValueVector>
- All Known Subinterfaces:
BaseIntVector, BaseListVector, ElementAddressableVector, FieldVector, FixedWidthVector, FloatingPointVector, RepeatedValueVector, ValueIterableVector<T>, VariableWidthFieldVector, VariableWidthVector
- All Known Implementing Classes:
AbstractContainerVector, AbstractStructVector, BaseFixedWidthVector, BaseLargeRepeatedValueViewVector, BaseLargeVariableWidthVector, BaseRepeatedValueVector, BaseRepeatedValueViewVector, BaseValueVector, BaseVariableWidthVector, BaseVariableWidthViewVector, BigIntVector, BitVector, DateDayVector, DateMilliVector, Decimal256Vector, DecimalVector, DenseUnionVector, DurationVector, ExtensionTypeVector, FixedSizeBinaryVector, FixedSizeListVector, Float2Vector, Float4Vector, Float8Vector, IntervalDayVector, IntervalMonthDayNanoVector, IntervalYearVector, IntVector, LargeListVector, LargeListViewVector, LargeVarBinaryVector, LargeVarCharVector, ListVector, ListViewVector, MapVector, NonNullableStructVector, NullVector, OpaqueVector, RunEndEncodedVector, SmallIntVector, StructVector, TimeMicroVector, TimeMilliVector, TimeNanoVector, TimeSecVector, TimeStampMicroTZVector, TimeStampMicroVector, TimeStampMilliTZVector, TimeStampMilliVector, TimeStampNanoTZVector, TimeStampNanoVector, TimeStampSecTZVector, TimeStampSecVector, TimeStampVector, TinyIntVector, UInt1Vector, UInt2Vector, UInt4Vector, UInt8Vector, UnionVector, VarBinaryVector, VarCharVector, ViewVarBinaryVector, ViewVarCharVector, ZeroVector
An abstraction that is used to store a sequence of values in an individual column.
A value vector stores underlying data in-memory in a columnar fashion that is compact and efficient. The column whose data is stored is described by getField().
A vector must be allocated before any attempt to read from it or write to it.
There are a few "rules" around vectors:
- values need to be written in order (e.g. index 0, 1, 2, 5)
- null vectors start with all values as null before writing anything
- for variable width types, the offset vector should be all zeros before writing
- you must call setValueCount before a vector can be read
- you should never write to a vector once it has been read.
Please note that the current implementation does not enforce these rules, so a few places deviate from them (e.g. offset vectors in variable-length and repeated vectors).
This interface "should" strive to guarantee this order of operation:
allocate > mutate > setValueCount > access > clear (or allocate to start the process over).
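The order of operations above can be sketched with a concrete implementation. This is a minimal illustration, assuming IntVector as the vector type and a default RootAllocator; the class and method names (VectorLifecycle, sumAfterLifecycle) are hypothetical and not part of the API.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;

public class VectorLifecycle {
  /** Walks one IntVector through allocate, mutate, setValueCount, access, clear. */
  public static long sumAfterLifecycle() {
    try (BufferAllocator allocator = new RootAllocator();
         IntVector vector = new IntVector("ints", allocator)) {
      vector.allocateNew(4);    // allocate

      vector.set(0, 10);        // mutate, writing indices in increasing order
      vector.set(1, 20);
      vector.setNull(2);
      vector.set(3, 30);

      vector.setValueCount(4);  // must happen before the vector is read

      long sum = 0;
      for (int i = 0; i < vector.getValueCount(); i++) {  // access
        if (!vector.isNull(i)) {
          sum += vector.get(i);
        }
      }

      vector.clear();           // release buffers; close() would also do this
      return sum;
    }
  }
}
```

Calling clear() explicitly is redundant here because try-with-resources calls close(); it is shown only to make the last step of the sequence visible.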
-
Method Summary
Modifier and Type, Method, Description
- <OUT, IN> OUT accept(VectorVisitor<OUT, IN> visitor, IN value): Accept a generic VectorVisitor and return the result.
- void allocateNew(): Allocate new buffers.
- boolean allocateNewSafe(): Allocates new buffers.
- void clear(): Release any owned ArrowBuf and reset the ValueVector to the initial state.
- void close(): Alternative to clear().
- void copyFrom(int fromIndex, int thisIndex, ValueVector from): Copy a cell value from a particular index in source vector to a particular position in this vector.
- void copyFromSafe(int fromIndex, int thisIndex, ValueVector from): Same as copyFrom(int, int, ValueVector) except that it handles the case when the capacity of the vector needs to be expanded before copy.
- BufferAllocator getAllocator(): Get the allocator associated with the vector.
- ArrowBuf[] getBuffers(boolean clear): Return the underlying buffers associated with this vector.
- int getBufferSize(): Get the number of bytes used by this vector.
- int getBufferSizeFor(int valueCount): Returns the number of bytes that is used by this vector if it holds the given number of values.
- ArrowBuf getDataBuffer(): Gets the underlying buffer associated with data vector.
- Field getField(): Get information about how this field is materialized.
- Types.MinorType getMinorType()
- String getName(): Gets the name of the vector.
- int getNullCount(): Returns number of null elements in the vector.
- Object getObject(int index): Get friendly type object from the vector.
- ArrowBuf getOffsetBuffer(): Gets the underlying buffer associated with offset vector.
- FieldReader getReader(): Get a reader for this vector.
- TransferPair getTransferPair(String ref, BufferAllocator allocator): To transfer quota responsibility.
- TransferPair getTransferPair(String ref, BufferAllocator allocator, CallBack callBack): To transfer quota responsibility.
- TransferPair getTransferPair(BufferAllocator allocator): To transfer quota responsibility.
- TransferPair getTransferPair(Field field, BufferAllocator allocator): To transfer quota responsibility.
- TransferPair getTransferPair(Field field, BufferAllocator allocator, CallBack callBack): To transfer quota responsibility.
- ArrowBuf getValidityBuffer(): Gets the underlying buffer associated with validity vector.
- int getValueCapacity(): Returns the maximum number of values that can be stored in this vector instance.
- int getValueCount(): Gets the number of values.
- int hashCode(int index): Returns hashCode of element in index with the default hasher.
- int hashCode(int index, ArrowBufHasher hasher): Returns hashCode of element in index with the given hasher.
- boolean isNull(int index): Check whether an element in the vector is null.
- TransferPair makeTransferPair(ValueVector target): Makes a new transfer pair used to transfer underlying buffers.
- void reAlloc(): Allocate new buffer with double capacity, and copy data into the new buffer.
- void reset(): Reset the ValueVector to the initial state without releasing any owned ArrowBuf.
- void setInitialCapacity(int numRecords): Set the initial record capacity.
- void setValueCount(int valueCount): Set number of values in the vector.
- default void validate()
- default void validateFull()
Methods inherited from interface java.lang.Iterable
forEach, iterator, spliterator
-
Method Details
-
allocateNew
void allocateNew() throws OutOfMemoryException
Allocate new buffers. ValueVector implements logic to determine how much to allocate.
- Throws:
OutOfMemoryException
- Thrown if no memory can be allocated.
-
allocateNewSafe
boolean allocateNewSafe()
Allocates new buffers. ValueVector implements logic to determine how much to allocate.
- Returns:
- true if allocation was successful.
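The difference from allocateNew() is that failure is reported through the return value rather than an OutOfMemoryException. A rough sketch, assuming IntVector and a RootAllocator capped at a caller-supplied byte limit; SafeAllocation and tryAllocate are hypothetical names used only for illustration.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;

public class SafeAllocation {
  /**
   * Tries to allocate room for 1024 ints under an allocator limited to
   * limitBytes, and reports whether the allocation succeeded.
   */
  public static boolean tryAllocate(long limitBytes) {
    try (BufferAllocator allocator = new RootAllocator(limitBytes);
         IntVector vector = new IntVector("ints", allocator)) {
      vector.setInitialCapacity(1024);
      // No try/catch needed: a failed allocation is signalled by `false`.
      return vector.allocateNewSafe();
    }
  }
}
```

With a generous limit such as one megabyte the call should return true; with a limit far smaller than the requested capacity it is expected to return false instead of throwing.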
-
reAlloc
void reAlloc()
Allocate a new buffer with double the capacity and copy the data into it. The vector's buffer is then replaced with the new buffer and the old one is released.
-
getAllocator
BufferAllocator getAllocator()
Get the allocator associated with the vector. CAVEAT: Some ValueVector subclasses (e.g. NullVector) do not require an allocator for data storage and may return null.
- Returns:
- the allocator, which may be null.
-
setInitialCapacity
void setInitialCapacity(int numRecords)
Set the initial record capacity.
- Parameters:
numRecords
- the initial record capacity.
-
getValueCapacity
int getValueCapacity()
Returns the maximum number of values that can be stored in this vector instance.
- Returns:
- the maximum number of values that can be stored in this vector instance.
-
close
void close()
Alternative to clear(). Allows use as an AutoCloseable in try-with-resources.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
-
clear
void clear()
Release any owned ArrowBuf and reset the ValueVector to the initial state. If the vector has any child vectors, they will also be cleared.
-
reset
void reset()
Reset the ValueVector to the initial state without releasing any owned ArrowBuf. Buffer capacities will remain unchanged and any previous data will be zeroed out. This includes buffers for data, validity, offset, etc. If the vector has any child vectors, they will also be reset.
-
getField
Field getField()
Get information about how this field is materialized.
- Returns:
- the field corresponding to this vector
-
getMinorType
Types.MinorType getMinorType()
-
getTransferPair
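TransferPair getTransferPair(BufferAllocator allocator)
Gets a transfer pair used to transfer quota responsibility to the target allocator.
- Parameters:
allocator
- the target allocator
- Returns:
- a transfer pair, creating a new target vector of the same type.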
-
getTransferPair
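TransferPair getTransferPair(String ref, BufferAllocator allocator)
Gets a transfer pair used to transfer quota responsibility to the target allocator.
- Parameters:
ref
- the name of the vector
allocator
- the target allocator
- Returns:
- a transfer pair, creating a new target vector of the same type.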
-
getTransferPair
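TransferPair getTransferPair(Field field, BufferAllocator allocator)
Gets a transfer pair used to transfer quota responsibility to the target allocator.
- Parameters:
field
- the Field object used by the target vector
allocator
- the target allocator
- Returns:
- a transfer pair, creating a new target vector of the same type.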
-
getTransferPair
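TransferPair getTransferPair(String ref, BufferAllocator allocator, CallBack callBack)
Gets a transfer pair used to transfer quota responsibility to the target allocator.
- Parameters:
ref
- the name of the vector
allocator
- the target allocator
callBack
- a schema change callback.
- Returns:
- a transfer pair, creating a new target vector of the same type.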
-
getTransferPair
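TransferPair getTransferPair(Field field, BufferAllocator allocator, CallBack callBack)
Gets a transfer pair used to transfer quota responsibility to the target allocator.
- Parameters:
field
- the Field object used by the target vector
allocator
- the target allocator
callBack
- a schema change callback.
- Returns:
- a transfer pair, creating a new target vector of the same type.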
-
makeTransferPair
TransferPair makeTransferPair(ValueVector target)
Makes a new transfer pair used to transfer underlying buffers.
- Parameters:
target
- the target for the transfer
- Returns:
- a new transfer pair that is used to transfer underlying buffers into the target vector.
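A transfer moves the underlying buffers from one vector to another rather than copying values. The sketch below assumes two IntVector instances, with the target owned by a child allocator so that the memory accounting moves with the buffers; TransferExample and transferAndRead are hypothetical names.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.util.TransferPair;

public class TransferExample {
  /** Fills a source vector, transfers its buffers to a target, reads from the target. */
  public static int transferAndRead() {
    try (BufferAllocator root = new RootAllocator();
         BufferAllocator child = root.newChildAllocator("target", 0, Long.MAX_VALUE);
         IntVector source = new IntVector("source", root);
         IntVector target = new IntVector("target", child)) {
      source.allocateNew(2);
      source.set(0, 7);
      source.set(1, 8);
      source.setValueCount(2);

      // Pair the source with an existing target of the same type.
      TransferPair pair = source.makeTransferPair(target);
      pair.transfer();  // buffers now belong to the target vector

      return target.get(1);
    }
  }
}
```

The getTransferPair overloads follow the same pattern, except that they create the target vector for you; it is then available through the pair's getTo() method and must be closed by the caller.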
-
getReader
FieldReader getReader()
Get a reader for this vector.
- Returns:
- a field reader that supports reading values from this vector.
-
getBufferSize
int getBufferSize()
Get the number of bytes used by this vector.
- Returns:
- the number of bytes that is used by this vector instance.
-
getBufferSizeFor
int getBufferSizeFor(int valueCount)
Returns the number of bytes used by this vector if it holds the given number of values. The result is the same as if setValueCount() were called, followed by getBufferSize(), but without the closing side effects that setValueCount() implies with respect to finishing off the population of a vector. Some operations might use this to determine how much memory a vector has used so far, even though it is not finished being populated.
- Parameters:
valueCount
- the number of values to assume this vector contains- Returns:
- the buffer size if this vector is holding valueCount values
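The equivalence described above can be checked on a small vector. A minimal sketch, assuming IntVector; BufferSizeEstimate and estimateAndActual are hypothetical names.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;

public class BufferSizeEstimate {
  /** Returns {estimate, actual} byte sizes for an IntVector holding eight values. */
  public static int[] estimateAndActual() {
    try (BufferAllocator allocator = new RootAllocator();
         IntVector vector = new IntVector("ints", allocator)) {
      vector.allocateNew(8);
      for (int i = 0; i < 8; i++) {
        vector.set(i, i);
      }

      // Size at eight values, asked while the vector is still being populated.
      int estimate = vector.getBufferSizeFor(8);

      // Size reported once the value count has actually been set.
      vector.setValueCount(8);
      int actual = vector.getBufferSize();

      return new int[] {estimate, actual};
    }
  }
}
```

The two numbers are expected to match, which is what makes getBufferSizeFor usable for progress or memory reporting partway through a write.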
-
getBuffers
ArrowBuf[] getBuffers(boolean clear)
Return the underlying buffers associated with this vector. Note that this does not affect the reference counts of the buffers, so it should only be used for in-context access. Also note that these buffers change regularly, so external classes should not hold a reference to them (unless they change them).
- Parameters:
clear
- Whether to clear the vector before returning. The buffers will still be refcounted, but the returned array will be the only reference to them.
- Returns:
- the underlying buffers that are used by this vector instance.
-
getValidityBuffer
ArrowBuf getValidityBuffer()
Gets the underlying buffer associated with validity vector.
- Returns:
- buffer
-
getDataBuffer
ArrowBuf getDataBuffer()
Gets the underlying buffer associated with data vector.
- Returns:
- buffer
-
getOffsetBuffer
ArrowBuf getOffsetBuffer()
Gets the underlying buffer associated with offset vector.
- Returns:
- buffer
-
getValueCount
int getValueCount()
Gets the number of values.
- Returns:
- number of values in the vector
-
setValueCount
void setValueCount(int valueCount)
Set the number of values in the vector.
-
getObject
Object getObject(int index)
Get friendly type object from the vector.
- Parameters:
index
- index of object to get- Returns:
- friendly type object
-
getNullCount
int getNullCount()
Returns the number of null elements in the vector.
- Returns:
- number of null elements
-
isNull
boolean isNull(int index)
Check whether an element in the vector is null.
- Parameters:
index
- index to check for null- Returns:
- true if element is null
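Because isNull, getValueCount and getNullCount are all declared on the interface, null handling can be written once against ValueVector. A small sketch, using IntVector only as the sample data; NullScan, countNulls and manualCountMatches are hypothetical names.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.ValueVector;

public class NullScan {
  /** Counts nulls by probing each index; works for any ValueVector. */
  public static int countNulls(ValueVector vector) {
    int nulls = 0;
    for (int i = 0; i < vector.getValueCount(); i++) {
      if (vector.isNull(i)) {
        nulls++;
      }
    }
    return nulls;
  }

  /** Builds a three-element vector with one null and compares both counts. */
  public static boolean manualCountMatches() {
    try (BufferAllocator allocator = new RootAllocator();
         IntVector vector = new IntVector("ints", allocator)) {
      vector.allocateNew(3);
      vector.set(0, 1);
      vector.setNull(1);
      vector.set(2, 3);
      vector.setValueCount(3);
      return countNulls(vector) == 1 && vector.getNullCount() == 1;
    }
  }
}
```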
-
hashCode
int hashCode(int index)
Returns the hash code of the element at the given index, using the default hasher.
-
hashCode
int hashCode(int index, ArrowBufHasher hasher)
Returns the hash code of the element at the given index, using the given hasher.
-
copyFrom
void copyFrom(int fromIndex, int thisIndex, ValueVector from)
Copy a cell value from a particular index in the source vector to a particular position in this vector.
- Parameters:
fromIndex
- position to copy from in source vector
thisIndex
- position to copy to in this vector
from
- source vector
-
copyFromSafe
void copyFromSafe(int fromIndex, int thisIndex, ValueVector from)
Same as copyFrom(int, int, ValueVector) except that it handles the case when the capacity of the vector needs to be expanded before copy.
- Parameters:
fromIndex
- position to copy from in source vector
thisIndex
- position to copy to in this vector
from
- source vector
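The capacity handling can be seen by copying to an index that is deliberately past the target's current capacity. A minimal sketch, assuming IntVector on both sides; SafeCopy and copyBeyondCapacity are hypothetical names.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;

public class SafeCopy {
  /** Copies one value to an index beyond the target's capacity and reads it back. */
  public static int copyBeyondCapacity() {
    try (BufferAllocator allocator = new RootAllocator();
         IntVector source = new IntVector("source", allocator);
         IntVector target = new IntVector("target", allocator)) {
      source.allocateNew(1);
      source.set(0, 42);
      source.setValueCount(1);

      target.allocateNew(1);
      // Pick an index the target cannot hold yet.
      int index = target.getValueCapacity() + 5;

      // copyFromSafe grows the target as needed before copying the cell.
      target.copyFromSafe(0, index, source);
      target.setValueCount(index + 1);

      return target.get(index);
    }
  }
}
```

Using copyFrom with the same index would write outside the allocated buffers, so the plain variant is only appropriate when the caller has already ensured sufficient capacity.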
-
accept
<OUT, IN> OUT accept(VectorVisitor<OUT, IN> visitor, IN value)
Accept a generic VectorVisitor and return the result.
- Type Parameters:
OUT
- the output result type.
IN
- the input data together with visitor.
-
getName
String getName()
Gets the name of the vector.
- Returns:
- the name of the vector.
-
validate
default void validate()
-
validateFull
default void validateFull()
-