Package org.apache.arrow.vector.complex
Class ListViewVector
java.lang.Object
org.apache.arrow.vector.BaseValueVector
org.apache.arrow.vector.complex.BaseRepeatedValueViewVector
org.apache.arrow.vector.complex.ListViewVector
- All Implemented Interfaces:
Closeable, AutoCloseable, Iterable<ValueVector>, BaseListVector, PromotableVector, RepeatedValueVector, DensityAwareVector, FieldVector, ValueIterableVector<List<?>>, ValueVector
public class ListViewVector
extends BaseRepeatedValueViewVector
implements PromotableVector, ValueIterableVector<List<?>>
A list view vector contains lists of a specific element type. Its structure has four components:
- A validity buffer that marks which lists are non-null.
- An offset buffer that records where each list starts in the child data vector.
- A size buffer that records the length of each list, so a list ends at its offset plus its size.
- A child data vector that contains the elements of the lists.
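To make the four-part layout concrete, here is a minimal plain-Java sketch that models the buffers with arrays instead of Arrow's `ArrowBuf`. The class name `ListViewLayoutSketch`, the `decode` method, and the sample data are illustrative assumptions and are not part of the Arrow API. Each non-null list is read as the child slice from `offset` up to, but not including, `offset + size`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the list view layout; it does not use the Arrow library. */
final class ListViewLayoutSketch {

  /** Decodes every list; a null entry stands for a null list. */
  static List<List<Integer>> decode(
      boolean[] validity, int[] offsets, int[] sizes, int[] child) {
    List<List<Integer>> result = new ArrayList<>();
    for (int i = 0; i < validity.length; i++) {
      if (!validity[i]) {
        result.add(null); // the validity entry says this position holds no list
        continue;
      }
      List<Integer> list = new ArrayList<>();
      // A list occupies child[offset] .. child[offset + size - 1].
      for (int j = offsets[i]; j < offsets[i] + sizes[i]; j++) {
        list.add(child[j]);
      }
      result.add(list);
    }
    return result;
  }

  public static void main(String[] args) {
    int[] child = {10, 20, 30, 40, 50};
    boolean[] validity = {true, false, true, true};
    // The views here are deliberately not in child order.
    int[] offsets = {3, 0, 0, 1};
    int[] sizes = {2, 0, 3, 0};
    // prints [[40, 50], null, [10, 20, 30], []]
    System.out.println(decode(validity, offsets, sizes, child));
  }
}
```

Because every list carries its own offset and size, the sketch never needs the offset of the next list to find where the current one ends.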
-
Field Summary
Fields
Modifier and Type, Field
protected Field field
protected UnionListReader reader
protected int validityAllocationSizeInBytes
protected ArrowBuf validityBuffer
Fields inherited from class org.apache.arrow.vector.complex.BaseRepeatedValueViewVector
DATA_VECTOR_NAME, DEFAULT_DATA_VECTOR, defaultDataVectorName, OFFSET_WIDTH, offsetAllocationSizeInBytes, offsetBuffer, repeatedCallBack, SIZE_WIDTH, sizeAllocationSizeInBytes, sizeBuffer, valueCount, vector
Fields inherited from class org.apache.arrow.vector.BaseValueVector
allocator, fieldReader, INITIAL_VALUE_ALLOCATION, MAX_ALLOCATION_SIZE, MAX_ALLOCATION_SIZE_PROPERTY
Fields inherited from interface org.apache.arrow.vector.complex.RepeatedValueVector
DEFAULT_REPEAT_PER_RECORD
-
Constructor Summary
Constructors
Constructor: Description
ListViewVector(String name, BufferAllocator allocator, FieldType fieldType, CallBack callBack): Constructs a new instance.
ListViewVector(Field field, BufferAllocator allocator, CallBack callBack): Constructs a new instance.
-
Method Summary
Modifier and Type, Method: Description
<OUT, IN> OUT accept(VectorVisitor<OUT, IN> visitor, IN value): Accept a generic VectorVisitor and return the result.
<T extends ValueVector> AddOrGetResult<T> addOrGetVector(FieldType fieldType): Initialize the data vector (and execute the callback) if that has not already been done, then return the data vector.
void allocateNew(): Allocate new buffers.
boolean allocateNewSafe(): Allocates new buffers.
protected void allocateValidityBuffer(long size)
void clear(): Clear the vector data.
void copyFrom(int inIndex, int outIndex, ValueVector from): Copy a cell value from a particular index in the source vector to a particular position in this vector.
void copyFromSafe(int inIndex, int outIndex, ValueVector from): Same as ValueVector.copyFrom(int, int, ValueVector), except that it handles the case when the capacity of the vector needs to be expanded before the copy.
static ListViewVector empty(String name, BufferAllocator allocator)
void endValue(int index, int size): End the current value.
void exportCDataBuffers(List<ArrowBuf> buffers, ArrowBuf buffersPtr, long nullValue): Export the buffers of the fields for the C Data Interface.
ArrowBuf[] getBuffers(boolean clear): Return the underlying buffers associated with this vector.
int getBufferSize(): Get the size (number of bytes) of the underlying buffers used by this vector.
int getBufferSizeFor(int valueCount): Get the size (number of bytes) of the underlying buffers this vector would use for the given value count.
getChildrenFromFields(): The returned list is the same size as the list passed to initializeChildrenFromFields.
getDataBuffer(): Gets the underlying buffer associated with the data vector.
long getDataBufferAddress(): Gets the starting address of the underlying buffer associated with the data vector.
getDataVector(): Get the data vector.
double getDensity(): Get the density of this ListViewVector.
int getElementEndIndex(int index): Get the data vector end index for the given list index.
int getElementStartIndex(int index): Get the data vector start index for the given list index.
getField(): Get the field associated with the list view vector.
getFieldBuffers(): Get the buffers of the fields (same size as getFieldVectors(), since it is their content).
getFieldInnerVectors(): Deprecated.
getMinorType(): Get the minor type for the vector.
int getNullCount(): Get the number of elements that are null in the vector.
List<?> getObject(int index): Get the element in the list view vector at a particular index.
getOffsetBuffer(): Gets the underlying buffer associated with the offset vector.
long getOffsetBufferAddress(): Gets the starting address of the underlying buffer associated with the offset vector.
getReader(): Default implementation to create a reader for the vector.
protected FieldReader getReaderImpl(): Each vector has a different reader that implements the FieldReader interface.
getSizeBuffer()
long getSizeBufferAddress()
getTransferPair(String ref, BufferAllocator allocator): To transfer quota responsibility.
getTransferPair(String ref, BufferAllocator allocator, CallBack callBack): To transfer quota responsibility.
getTransferPair(Field field, BufferAllocator allocator): To transfer quota responsibility.
getTransferPair(Field field, BufferAllocator allocator, CallBack callBack): To transfer quota responsibility.
getValidityBuffer(): Gets the underlying buffer associated with the validity vector.
long getValidityBufferAddress(): Gets the starting address of the underlying buffer associated with the validity vector.
int getValueCapacity(): Get the value capacity by considering validity and offset capacity.
int getValueCount(): Gets the number of values.
getWriter()
int hashCode(int index): Get the hash code for the element at the given index.
int hashCode(int index, ArrowBufHasher hasher): Get the hash code for the element at the given index.
void initializeChildrenFromFields(List<Field> children): Initializes the child vectors to be later loaded with loadBuffers.
boolean isEmpty(int index): Check if the element at the given index is an empty list.
boolean isNull(int index): Check if the element at the given index is null.
int isSet(int index): Same as isNull(int), with the result inverted and returned as an int.
void loadFieldBuffers(ArrowFieldNode fieldNode, List<ArrowBuf> ownBuffers): Load the buffers associated with this Field.
makeTransferPair(ValueVector target): Makes a new transfer pair used to transfer underlying buffers.
promoteToUnion()
void reAlloc(): Allocate a new buffer with double the capacity, and copy the data into the new buffer.
protected void reallocValidityAndSizeAndOffsetBuffers()
void reset(): Release the buffers associated with this vector.
void setInitialCapacity(int numRecords): Set the initial record capacity.
void setInitialCapacity(int numRecords, double density): Specialized version of setInitialCapacity() for ListViewVector.
void setInitialTotalCapacity(int numRecords, int totalNumberOfElements): Specialized version of setInitialTotalCapacity() for ListViewVector.
void setNull(int index): Set the element at the given index to null.
void setOffset(int index, int value): Set the offset at the given index.
void setSize(int index, int value): Set the size at the given index.
void setValidity(int index, int value): Set the validity at the given index.
void setValueCount(int valueCount): Set the number of values in the vector.
int startNewValue(int index): Start a new value in the ListViewVector.
void validate(): Validates this ListViewVector against the specification guidelines.
Methods inherited from class org.apache.arrow.vector.complex.BaseRepeatedValueViewVector
getLengthOfChildVector, getLengthOfChildVectorByIndex, getName, getOffsetBufferValueCapacity, getOffsetVector, getSizeBufferValueCapacity, iterator, reallocateBuffers, replaceDataVector
Methods inherited from class org.apache.arrow.vector.BaseValueVector
checkBufRefs, close, getAllocator, getTransferPair, getValidityBufferSizeFromCount, releaseBuffer, toString, transferBuffer
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.arrow.vector.FieldVector
exportBuffer, getExportedCDataBufferCount
Methods inherited from interface java.lang.Iterable
forEach, iterator, spliterator
Methods inherited from interface org.apache.arrow.vector.ValueIterableVector
getValueIterable, getValueIterator
Methods inherited from interface org.apache.arrow.vector.ValueVector
close, getAllocator, getName, getTransferPair, validateFull
-
Field Details
-
validityBuffer
-
reader
-
field
-
validityAllocationSizeInBytes
protected int validityAllocationSizeInBytes
-
-
Constructor Details
-
ListViewVector
public ListViewVector(String name, BufferAllocator allocator, FieldType fieldType, CallBack callBack)
Constructs a new instance.
- Parameters:
name - The name of the instance.
allocator - The allocator to use for allocating/reallocating buffers.
fieldType - The type of this list.
callBack - A schema change callback.
-
ListViewVector
Constructs a new instance.
- Parameters:
field - The field materialized by this vector.
allocator - The allocator to use for allocating/reallocating buffers.
callBack - A schema change callback.
-
-
Method Details
-
empty
-
initializeChildrenFromFields
Description copied from interface: FieldVector
Initializes the child vectors to be later loaded with loadBuffers.
- Specified by:
initializeChildrenFromFields in interface FieldVector
- Parameters:
children - the schema
-
setInitialCapacity
public void setInitialCapacity(int numRecords)
Description copied from interface: ValueVector
Set the initial record capacity.
- Specified by:
setInitialCapacity in interface ValueVector
- Overrides:
setInitialCapacity in class BaseRepeatedValueViewVector
- Parameters:
numRecords - the initial record capacity.
-
setInitialCapacity
public void setInitialCapacity(int numRecords, double density)
Specialized version of setInitialCapacity() for ListViewVector. Some callers use this when they want to explicitly control, and be conservative about, the memory allocated for the inner data vector. This is useful when working under memory constraints for a query with a fixed amount of memory reserved for the record batch. In such cases, reserving memory for a record batch with value count x and calling setInitialCapacity(x), so that each vector allocates only what is necessary rather than the default amount, can still lead to OOM or related problems, because the multiplier forces the memory requirement beyond what was needed.
- Specified by:
setInitialCapacity in interface DensityAwareVector
- Overrides:
setInitialCapacity in class BaseRepeatedValueViewVector
- Parameters:
numRecords - value count
density - density of the ListViewVector. Density is the average size of a list per position in the ListViewVector. For example, a density of 10 implies that each position in the list view vector has a list of 10 values. A density of 0.1 implies that, out of 10 positions in the list view vector, 1 position has a list of size 1 and the remaining positions are null (no lists) or empty lists. This helps in tightly controlling the memory provisioned for the inner data vector.
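The density arithmetic described above can be sketched in plain Java. The class and method names are hypothetical, and both the overflow check and the floor of one element are assumptions of this sketch rather than documented guarantees of `setInitialCapacity`.

```java
/** Hypothetical helper; it illustrates the density arithmetic only. */
final class DensitySketch {

  /** Estimated number of inner elements for numRecords positions at the given density. */
  static int innerCapacity(int numRecords, double density) {
    double estimate = numRecords * density;
    if (estimate >= Integer.MAX_VALUE) {
      throw new IllegalArgumentException("requested capacity is too large");
    }
    // Assumption of this sketch: always keep room for at least one element.
    return Math.max((int) estimate, 1);
  }

  public static void main(String[] args) {
    System.out.println(innerCapacity(1000, 10.0)); // 10 elements per position
    System.out.println(innerCapacity(1000, 0.1));  // 1 element per 10 positions
  }
}
```

A caller with a fixed memory budget could use an estimate like this to check the inner data vector's footprint before allocating.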
-
setInitialTotalCapacity
public void setInitialTotalCapacity(int numRecords, int totalNumberOfElements)
Specialized version of setInitialTotalCapacity() for ListViewVector. Some callers use this when they want to explicitly control, and be conservative about, the memory allocated for the inner data vector. This is useful when working under memory constraints for a query with a fixed amount of memory reserved for the record batch. In such cases, reserving memory for a record batch with value count x and calling setInitialCapacity(x), so that each vector allocates only what is necessary rather than the default amount, can still lead to OOM or related problems, because the multiplier forces the memory requirement beyond what was needed.
- Overrides:
setInitialTotalCapacity in class BaseRepeatedValueViewVector
- Parameters:
numRecords - value count
totalNumberOfElements - the total number of elements to allow for in this vector across all records.
-
getChildrenFromFields
Description copied from interface: FieldVector
The returned list is the same size as the list passed to initializeChildrenFromFields.
- Specified by:
getChildrenFromFields in interface FieldVector
- Returns:
the children according to schema (empty for primitive types)
-
loadFieldBuffers
Load the buffers associated with this Field.
- Specified by:
loadFieldBuffers in interface FieldVector
- Parameters:
fieldNode - the fieldNode
ownBuffers - the buffers for this Field (own buffers only, children not included)
-
getFieldBuffers
Description copied from interface: FieldVector
Get the buffers of the fields (same size as getFieldVectors(), since it is their content).
- Specified by:
getFieldBuffers in interface FieldVector
- Returns:
the buffers containing the data for this vector (ready for reading)
-
exportCDataBuffers
Export the buffers of the fields for the C Data Interface. This method traverses the buffers and exports each buffer and its memory address into a list of buffers and a pointer to that list.
- Specified by:
exportCDataBuffers in interface FieldVector
-
allocateNew
Description copied from interface: ValueVector
Allocate new buffers. ValueVector implements logic to determine how much to allocate.
- Specified by:
allocateNew in interface ValueVector
- Throws:
OutOfMemoryException - Thrown if no memory can be allocated.
-
allocateNewSafe
public boolean allocateNewSafe()
Description copied from interface: ValueVector
Allocates new buffers. ValueVector implements logic to determine how much to allocate.
- Specified by:
allocateNewSafe in interface ValueVector
- Overrides:
allocateNewSafe in class BaseRepeatedValueViewVector
- Returns:
Returns true if allocation was successful.
-
allocateValidityBuffer
protected void allocateValidityBuffer(long size)
-
reAlloc
public void reAlloc()
Description copied from interface: ValueVector
Allocate a new buffer with double the capacity, and copy the data into the new buffer. Replace the vector's buffer with the new buffer, and release the old one.
- Specified by:
reAlloc in interface ValueVector
- Overrides:
reAlloc in class BaseRepeatedValueViewVector
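The allocate, copy, and replace sequence described for reAlloc can be illustrated with a plain-Java sketch. It models a single buffer with an `int[]`. The class `DoublingReallocSketch` and its `setSafe` helper are hypothetical, and the initial capacity of 4 is an arbitrary assumption.

```java
/** Hypothetical model of a doubling reallocation; it does not use Arrow buffers. */
final class DoublingReallocSketch {
  private int[] buffer = new int[4]; // arbitrary initial capacity for the sketch

  int capacity() {
    return buffer.length;
  }

  /** Allocates a buffer with double the capacity and copies the old data into it. */
  void reAlloc() {
    int[] grown = new int[buffer.length * 2];
    System.arraycopy(buffer, 0, grown, 0, buffer.length);
    // The old array is left to the garbage collector here; the description
    // above says the vector releases its old buffer at this point.
    buffer = grown;
  }

  /** Grows the buffer as often as needed, then writes the value. */
  void setSafe(int index, int value) {
    while (index >= buffer.length) {
      reAlloc();
    }
    buffer[index] = value;
  }

  int get(int index) {
    return buffer[index];
  }
}
```

Writing index 9 into a capacity-4 buffer triggers two doublings (4 to 8, then 8 to 16), and earlier values survive each copy.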
-
reallocValidityAndSizeAndOffsetBuffers
protected void reallocValidityAndSizeAndOffsetBuffers()
-
copyFromSafe
Description copied from interface: ValueVector
Same as ValueVector.copyFrom(int, int, ValueVector), except that it handles the case when the capacity of the vector needs to be expanded before the copy.
- Specified by:
copyFromSafe in interface ValueVector
- Overrides:
copyFromSafe in class BaseValueVector
- Parameters:
inIndex - position to copy from in the source vector
outIndex - position to copy to in this vector
from - source vector
-
copyFrom
Description copied from interface: ValueVector
Copy a cell value from a particular index in the source vector to a particular position in this vector.
- Specified by:
copyFrom in interface ValueVector
- Overrides:
copyFrom in class BaseValueVector
- Parameters:
inIndex - position to copy from in the source vector
outIndex - position to copy to in this vector
from - source vector
-
getDataVector
Description copied from interface: RepeatedValueVector
Get the data vector.
- Specified by:
getDataVector in interface RepeatedValueVector
- Overrides:
getDataVector in class BaseRepeatedValueViewVector
- Returns:
the underlying data vector or null if none exists.
-
getTransferPair
Description copied from interface: ValueVector
To transfer quota responsibility.
- Specified by:
getTransferPair in interface ValueVector
- Parameters:
ref - the name of the vector
allocator - the target allocator
- Returns:
a transfer pair, creating a new target vector of the same type.
-
getTransferPair
Description copied from interface: ValueVector
To transfer quota responsibility.
- Specified by:
getTransferPair in interface ValueVector
- Parameters:
field - the Field object used by the target vector
allocator - the target allocator
- Returns:
a transfer pair, creating a new target vector of the same type.
-
getTransferPair
Description copied from interface: ValueVector
To transfer quota responsibility.
- Specified by:
getTransferPair in interface ValueVector
- Parameters:
ref - the name of the vector
allocator - the target allocator
callBack - A schema change callback.
- Returns:
a transfer pair, creating a new target vector of the same type.
-
getTransferPair
Description copied from interface: ValueVector
To transfer quota responsibility.
- Specified by:
getTransferPair in interface ValueVector
- Parameters:
field - the Field object used by the target vector
allocator - the target allocator
callBack - A schema change callback.
- Returns:
a transfer pair, creating a new target vector of the same type.
-
makeTransferPair
Description copied from interface: ValueVector
Makes a new transfer pair used to transfer underlying buffers.
- Specified by:
makeTransferPair in interface ValueVector
- Parameters:
target - the target for the transfer
- Returns:
a new transfer pair that is used to transfer underlying buffers into the target vector.
-
getValidityBufferAddress
public long getValidityBufferAddress()
Description copied from interface: FieldVector
Gets the starting address of the underlying buffer associated with the validity vector.
- Specified by:
getValidityBufferAddress in interface FieldVector
- Returns:
buffer address
-
getDataBufferAddress
public long getDataBufferAddress()
Description copied from interface: FieldVector
Gets the starting address of the underlying buffer associated with the data vector.
- Specified by:
getDataBufferAddress in interface FieldVector
- Returns:
buffer address
-
getOffsetBufferAddress
public long getOffsetBufferAddress()
Description copied from interface: FieldVector
Gets the starting address of the underlying buffer associated with the offset vector.
- Specified by:
getOffsetBufferAddress in interface FieldVector
- Returns:
buffer address
-
getValidityBuffer
Description copied from interface: ValueVector
Gets the underlying buffer associated with the validity vector.
- Specified by:
getValidityBuffer in interface ValueVector
- Returns:
buffer
-
getDataBuffer
Description copied from interface: ValueVector
Gets the underlying buffer associated with the data vector.
- Specified by:
getDataBuffer in interface ValueVector
- Returns:
buffer
-
getOffsetBuffer
Description copied from interface: ValueVector
Gets the underlying buffer associated with the offset vector.
- Specified by:
getOffsetBuffer in interface ValueVector
- Returns:
buffer
-
getSizeBuffer
-
getSizeBufferAddress
public long getSizeBufferAddress()
-
hashCode
public int hashCode(int index)
Get the hash code for the element at the given index.
- Specified by:
hashCode in interface ValueVector
- Parameters:
index - position of the element
- Returns:
hash code for the element at the given index
-
hashCode
Get the hash code for the element at the given index.
- Specified by:
hashCode in interface ValueVector
- Parameters:
index - position of the element
hasher - hasher to use
- Returns:
hash code for the element at the given index
-
accept
Description copied from interface: ValueVector
Accept a generic VectorVisitor and return the result.
- Specified by:
accept in interface ValueVector
- Type Parameters:
OUT - the output result type.
IN - the input data together with visitor.
-
getReaderImpl
Description copied from class: BaseValueVector
Each vector has a different reader that implements the FieldReader interface. Overridden methods must make sure to return the correct concrete reader implementation.
- Specified by:
getReaderImpl in class BaseValueVector
- Returns:
Returns a lambda that initializes a reader when called.
-
getReader
Description copied from class: BaseValueVector
Default implementation to create a reader for the vector. Depends on the individual vector class' implementation of BaseValueVector.getReaderImpl() to initialize the reader appropriately.
- Specified by:
getReader in interface ValueVector
- Overrides:
getReader in class BaseValueVector
- Returns:
Concrete instance of FieldReader by using double-checked locking.
-
getBufferSize
public int getBufferSize()
Get the size (number of bytes) of the underlying buffers used by this vector.
- Specified by:
getBufferSize in interface ValueVector
- Overrides:
getBufferSize in class BaseRepeatedValueViewVector
- Returns:
size of underlying buffers.
-
getBufferSizeFor
public int getBufferSizeFor(int valueCount)
Get the size (number of bytes) of the underlying buffers this vector would use for the given value count.
- Specified by:
getBufferSizeFor in interface ValueVector
- Overrides:
getBufferSizeFor in class BaseRepeatedValueViewVector
- Parameters:
valueCount - the number of values to assume this vector contains
- Returns:
size of underlying buffers.
-
getField
Get the field associated with the list view vector.
- Specified by:
getField in interface ValueVector
- Returns:
the field
-
getMinorType
Get the minor type for the vector.
- Specified by:
getMinorType in interface ValueVector
- Returns:
the minor type
-
clear
public void clear()
Clear the vector data.
- Specified by:
clear in interface ValueVector
- Overrides:
clear in class BaseRepeatedValueViewVector
-
reset
public void reset()
Release the buffers associated with this vector.
- Specified by:
reset in interface ValueVector
- Overrides:
reset in class BaseRepeatedValueViewVector
-
getBuffers
Return the underlying buffers associated with this vector. Note that this doesn't affect the reference counts of the buffers, so it should only be used for in-context access. Also note that these buffers change regularly, so external classes shouldn't hold a reference to them (unless they change them).
- Specified by:
getBuffers in interface ValueVector
- Overrides:
getBuffers in class BaseRepeatedValueViewVector
- Parameters:
clear - Whether to clear the vector before returning. The buffers will still be refcounted, but the returned array will be the only reference to them.
- Returns:
The underlying buffers that are used by this vector instance.
-
getObject
Get the element in the list view vector at a particular index.
- Specified by:
getObject in interface ValueVector
- Parameters:
index - position of the element
- Returns:
Object at given position
-
isNull
public boolean isNull(int index)
Check if the element at the given index is null.
- Specified by:
isNull in interface ValueVector
- Parameters:
index - position of the element
- Returns:
true if the element at the given index is null, false otherwise
-
isEmpty
public boolean isEmpty(int index)
Check if the element at the given index is an empty list.
- Specified by:
isEmpty in class BaseRepeatedValueViewVector
- Parameters:
index - position of the element
- Returns:
true if the element at the given index is an empty list or null, false otherwise
-
isSet
public int isSet(int index)
Same as isNull(int), with the result inverted and returned as an int.
- Parameters:
index - position of the element
- Returns:
1 if the element at the given index is not null, 0 otherwise
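The relationship between isSet, isNull, and setValidity can be sketched over a plain `byte[]` bitmap. The sketch assumes the Arrow columnar convention of one validity bit per position with least-significant-bit numbering inside each byte. The class and its static methods are hypothetical stand-ins, and this page does not say that ListViewVector is implemented this way.

```java
/** Hypothetical validity bitmap model; one bit per position, LSB first within a byte. */
final class ValidityBitmapSketch {

  /** Returns 1 if the position is set (non-null), 0 otherwise. */
  static int isSet(byte[] validity, int index) {
    int byteIndex = index >> 3; // eight positions per byte
    int bitIndex = index & 7;   // bit within that byte
    return (validity[byteIndex] >> bitIndex) & 1;
  }

  /** A position is null exactly when its validity bit is 0. */
  static boolean isNull(byte[] validity, int index) {
    return isSet(validity, index) == 0;
  }

  /** Sets the validity bit to 1 for any non-zero value, and clears it for 0. */
  static void setValidity(byte[] validity, int index, int value) {
    int byteIndex = index >> 3;
    int mask = 1 << (index & 7);
    if (value != 0) {
      validity[byteIndex] |= mask;
    } else {
      validity[byteIndex] &= ~mask;
    }
  }
}
```

In this model, `isNull(v, i)` is always the negation of `isSet(v, i) == 1`, which matches the return values documented above.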
-
getNullCount
public int getNullCount()
Get the number of elements that are null in the vector.
- Specified by:
getNullCount in interface ValueVector
- Returns:
the number of null elements.
-
getValueCapacity
public int getValueCapacity()
Get the value capacity by considering validity and offset capacity. Note that the size buffer capacity is not considered here since it has the same capacity as the offset buffer.
- Specified by:
getValueCapacity in interface ValueVector
- Overrides:
getValueCapacity in class BaseRepeatedValueViewVector
- Returns:
the value capacity
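As a rough illustration of how a value capacity can follow from the validity and offset buffers, the sketch below takes the smaller of the two per-buffer capacities. The helper is hypothetical. It assumes one validity bit per position and 4-byte offsets, and the value it gives `OFFSET_WIDTH` is an assumption, since this page lists that constant without stating its value.

```java
/** Hypothetical capacity arithmetic; the buffer sizes are passed in as byte counts. */
final class ValueCapacitySketch {
  static final int OFFSET_WIDTH = 4; // assumption: 4-byte offsets (and sizes)

  /** Number of positions that fit in both the validity and the offset buffer. */
  static int valueCapacity(long validityBytes, long offsetBytes) {
    long validityCapacity = validityBytes * 8;        // one bit per position
    long offsetCapacity = offsetBytes / OFFSET_WIDTH; // one offset per position
    long capacity = Math.min(validityCapacity, offsetCapacity);
    return (int) Math.min(capacity, Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    System.out.println(valueCapacity(8, 128));  // limited by the offset buffer
    System.out.println(valueCapacity(2, 1024)); // limited by the validity buffer
  }
}
```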
-
setNull
public void setNull(int index)
Set the element at the given index to null.
- Specified by:
setNull in interface FieldVector
- Parameters:
index - position of the element to set to null
-
startNewValue
public int startNewValue(int index)
Start a new value in the ListViewVector.
- Overrides:
startNewValue in class BaseRepeatedValueViewVector
- Parameters:
index - index of the value to start
- Returns:
offset of the new value
-
setOffset
public void setOffset(int index, int value)
Set the offset at the given index. Make sure to call this function after updating the `field` vector and calling `setValidity`.
- Parameters:
index - index of the value to set
value - value to set
-
setSize
public void setSize(int index, int value)
Set the size at the given index. Make sure to call this function after calling `setOffset`.
- Parameters:
index - index of the value to set
value - value to set
-
setValidity
public void setValidity(int index, int value)
Set the validity at the given index.
- Parameters:
index - index of the value to set
value - value to set (0 for unset and 1 for set)
-
setValueCount
public void setValueCount(int valueCount)
Description copied from interface: ValueVector
Set the number of values in the vector.
- Specified by:
setValueCount in interface ValueVector
- Overrides:
setValueCount in class BaseRepeatedValueViewVector
-
getElementStartIndex
public int getElementStartIndex(int index)
Description copied from interface: BaseListVector
Get the data vector start index for the given list index.
- Specified by:
getElementStartIndex in interface BaseListVector
-
getElementEndIndex
public int getElementEndIndex(int index)
Description copied from interface: BaseListVector
Get the data vector end index for the given list index.
- Specified by:
getElementEndIndex in interface BaseListVector
-
addOrGetVector
Description copied from class: BaseRepeatedValueViewVector
Initialize the data vector (and execute the callback) if that has not already been done, then return the data vector.
- Specified by:
addOrGetVector in interface PromotableVector
- Overrides:
addOrGetVector in class BaseRepeatedValueViewVector
-
promoteToUnion
- Specified by:
promoteToUnion in interface PromotableVector
-
getFieldInnerVectors
Deprecated.
Description copied from interface: FieldVector
Get the inner vectors.
- Specified by:
getFieldInnerVectors in interface FieldVector
- Returns:
the inner vectors for this field as defined by the TypeLayout
-
getWriter
-
getValueCount
public int getValueCount()
Description copied from interface: ValueVector
Gets the number of values.
- Specified by:
getValueCount in interface ValueVector
- Overrides:
getValueCount in class BaseRepeatedValueViewVector
- Returns:
number of values in the vector
-
getDensity
public double getDensity()
Get the density of this ListViewVector.
- Returns:
density
-
validate
public void validate()
Validates this ListViewVector against the specification guidelines.
- Specified by:
validate in interface ValueVector
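This page does not list the individual checks that validate() performs. As a hedged illustration only, the plain-Java sketch below applies the bounds that the Arrow columnar format places on list views: offsets and sizes are non-negative, and each view ends within the child data. The class name, the array-based signature, and the exception types are assumptions of the sketch.

```java
/** Hypothetical bounds check for list views; it does not use the Arrow library. */
final class ListViewValidationSketch {

  /** Throws if any view has a negative offset or size, or reaches past the child data. */
  static void validate(int[] offsets, int[] sizes, int childLength) {
    if (offsets.length != sizes.length) {
      throw new IllegalArgumentException("offsets and sizes must have the same length");
    }
    for (int i = 0; i < offsets.length; i++) {
      long offset = offsets[i];
      long size = sizes[i];
      if (offset < 0 || size < 0) {
        throw new IllegalStateException("negative offset or size at index " + i);
      }
      // long arithmetic avoids int overflow when offset + size is large
      if (offset + size > childLength) {
        throw new IllegalStateException("view at index " + i + " exceeds the child length");
      }
    }
  }
}
```

For a child of length 3, the views (offset 0, size 2) and (offset 2, size 1) pass, while (offset 2, size 2) is rejected because it would read index 3.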
-
endValue
public void endValue(int index, int size)
End the current value.
- Parameters:
index - index of the value to end
size - number of elements in the list that was written
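The pairing of startNewValue and endValue can be modelled with a small plain-Java writer. This is a simplified sketch, not the Arrow implementation. The class `ListViewWriterSketch`, its public arrays, and the rule that a new list begins at the current end of the child data are assumptions made for illustration. The real class also manages buffer reallocation, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical writer model; arrays and a List stand in for the Arrow buffers. */
final class ListViewWriterSketch {
  final List<Integer> child = new ArrayList<>();
  final boolean[] validity;
  final int[] offsets;
  final int[] sizes;

  ListViewWriterSketch(int capacity) {
    validity = new boolean[capacity];
    offsets = new int[capacity];
    sizes = new int[capacity];
  }

  /** Marks the position as non-null and records where its list starts. */
  int startNewValue(int index) {
    validity[index] = true;
    offsets[index] = child.size(); // assumption: start at the current end of child data
    return offsets[index];
  }

  /** Records how many elements were written for the list at this position. */
  void endValue(int index, int size) {
    sizes[index] = size;
  }

  public static void main(String[] args) {
    ListViewWriterSketch writer = new ListViewWriterSketch(2);
    writer.startNewValue(0);
    writer.child.add(1);
    writer.child.add(2);
    writer.endValue(0, 2);
    writer.startNewValue(1);
    writer.child.add(3);
    writer.endValue(1, 1);
    System.out.println(writer.offsets[1] + " " + writer.sizes[1]); // prints 2 1
  }
}
```

The value returned by `startNewValue` is the child index at which the caller begins writing elements, and the `size` passed to `endValue` is the count of elements written since that call.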
-