public abstract class BaseLargeVariableWidthVector extends BaseValueVector implements VariableWidthVector, FieldVector, VectorDefinitionSetter
Modifier and Type | Field and Description |
---|---|
protected static byte[] | emptyByteArray |
protected Field | field |
protected int | lastSet |
static int | OFFSET_WIDTH |
protected ArrowBuf | offsetBuffer |
protected ArrowBuf | validityBuffer |
protected ArrowBuf | valueBuffer |
protected int | valueCount |
allocator, fieldReader, INITIAL_VALUE_ALLOCATION, MAX_ALLOCATION_SIZE, MAX_ALLOCATION_SIZE_PROPERTY
Constructor and Description |
---|
BaseLargeVariableWidthVector(Field field, BufferAllocator allocator): Constructs a new instance. |
Modifier and Type | Method and Description |
---|---|
<OUT,IN> OUT | accept(VectorVisitor<OUT,IN> visitor, IN value): Accept a generic VectorVisitor and return the result. |
void | allocateNew(): Same as allocateNewSafe(). |
void | allocateNew(int valueCount): Allocate a new memory space for this vector. |
void | allocateNew(long totalBytes, int valueCount): Allocate memory for the vector to support storing at least the provided number of elements in the vector. |
boolean | allocateNewSafe(): Allocate memory for the vector. |
void | clear(): Same as close(). |
void | close(): Close the vector and release the associated buffers. |
void | copyFrom(int fromIndex, int thisIndex, ValueVector from): Copy a cell value from a particular index in source vector to a particular position in this vector. |
void | copyFromSafe(int fromIndex, int thisIndex, ValueVector from): Same as copyFrom(int, int, ValueVector) except that it handles the case when the capacity of the vector needs to be expanded before copy. |
void | fillEmpties(int index): Create holes in the vector up to the given index (exclusive). |
protected void | fillHoles(int index) |
static byte[] | get(ArrowBuf data, ArrowBuf offset, int index): Method used by JSON writer to read a variable width element from the variable width vector and write to JSON. |
ArrowBuf[] | getBuffers(boolean clear): Return the underlying buffers associated with this vector. |
int | getBufferSize(): Get the size (number of bytes) of underlying buffers used by this vector. |
int | getBufferSizeFor(int valueCount): Get the potential buffer size for a particular number of records. |
int | getByteCapacity(): Get the size (number of bytes) of underlying data buffer. |
List<FieldVector> | getChildrenFromFields(): Get the inner child vectors. |
ArrowBuf | getDataBuffer(): Get the buffer that stores the data for elements in the vector. |
long | getDataBufferAddress(): Get the memory address of buffer that stores the data for elements in the vector. |
ArrowBufPointer | getDataPointer(int index): Gets the pointer for the data at the given index. |
ArrowBufPointer | getDataPointer(int index, ArrowBufPointer reuse): Gets the pointer for the data at the given index. |
double | getDensity(): Get the density of this vector. |
Field | getField(): Get information about how this field is materialized. |
List<ArrowBuf> | getFieldBuffers(): Get the buffers belonging to this vector. |
List<BufferBacked> | getFieldInnerVectors(): Deprecated. This API will be removed as the current implementations no longer support inner vectors. |
int | getLastSet(): Get the index of last non-null element in the vector. |
String | getName(): Gets the name of the vector. |
int | getNullCount(): Get the number of elements that are null in the vector. |
ArrowBuf | getOffsetBuffer(): Get the buffer that stores the offsets for elements in the vector. |
long | getOffsetBufferAddress(): Get the memory address of buffer that stores the offsets for elements in the vector. |
protected long | getStartOffset(int index): Gets the starting offset of a record, given its index. |
TransferPair | getTransferPair(BufferAllocator allocator): Construct a transfer pair of this vector and another vector of same type. |
abstract TransferPair | getTransferPair(String ref, BufferAllocator allocator): Construct a transfer pair of this vector and another vector of same type. |
TransferPair | getTransferPair(String ref, BufferAllocator allocator, CallBack callBack): Construct a transfer pair of this vector and another vector of same type. |
ArrowBuf | getValidityBuffer(): Get buffer that manages the validity (NULL or NON-NULL nature) of elements in the vector. |
long | getValidityBufferAddress(): Get the memory address of buffer that manages the validity (NULL or NON-NULL nature) of elements in the vector. |
int | getValueCapacity(): Get the current capacity which does not exceed either validity buffer or offset buffer. |
int | getValueCount(): Get the value count of vector. |
int | getValueLength(int index): Get the length of the variable width element at the specified index. |
protected void | handleSafe(int index, int dataLength) |
int | hashCode(int index): Returns hashCode of element in index with the default hasher. |
int | hashCode(int index, ArrowBufHasher hasher): Returns hashCode of element in index with the given hasher. |
void | initializeChildrenFromFields(List<Field> children): Initialize the children in schema for this Field. |
boolean | isNull(int index): Check if element at given index is null. |
boolean | isSafe(int index): Check if the given index is within the current value capacity of the vector. |
int | isSet(int index): Counterpart of isNull(int); returns 1 if the element is non-null and 0 if it is null. |
void | loadFieldBuffers(ArrowFieldNode fieldNode, List<ArrowBuf> ownBuffers): Load the buffers of this vector with provided source buffers. |
void | reAlloc(): Resize the vector to increase the capacity. |
void | reallocDataBuffer(): Reallocate the data buffer. |
void | reallocValidityAndOffsetBuffers(): Reallocate the validity and offset buffers for this vector. |
void | reset(): Reset the vector to initial state. |
static ArrowBuf | set(ArrowBuf buffer, BufferAllocator allocator, int valueCount, int index, long value): Method used by JSON reader to explicitly set the offsets of the variable width vector data. |
void | set(int index, byte[] value): Set the variable length element at the specified index to the supplied byte array. |
void | set(int index, byte[] value, int start, int length): Set the variable length element at the specified index to the supplied byte array. |
void | set(int index, ByteBuffer value, int start, int length): Set the variable length element at the specified index to the content in supplied ByteBuffer. |
void | set(int index, int isSet, long start, long end, ArrowBuf buffer): Store the given value at a particular position in the vector. |
void | set(int index, long start, int length, ArrowBuf buffer): Store the given value at a particular position in the vector. |
protected void | setBytes(int index, byte[] value, int start, int length) |
void | setIndexDefined(int index): Mark the particular position in the vector as non-null. |
void | setInitialCapacity(int valueCount): Sets the desired value capacity for the vector. |
void | setInitialCapacity(int valueCount, double density): Sets the desired value capacity for the vector. |
void | setLastSet(int value): Set the index of last non-null element in the vector. |
void | setNull(int index): Set the element at the given index to null. |
void | setSafe(int index, byte[] value): Same as set(int, byte[]) except that it handles the case where index and length of new element are beyond the existing capacity of the vector. |
void | setSafe(int index, byte[] value, int start, int length): Same as set(int, byte[], int, int) except that it handles the case where index and length of new element are beyond the existing capacity of the vector. |
void | setSafe(int index, ByteBuffer value, int start, int length): Same as set(int, ByteBuffer, int, int) except that it handles the case where index and length of new element are beyond the existing capacity of the vector. |
void | setSafe(int index, int isSet, long start, long end, ArrowBuf buffer): Same as set(int, int, long, long, ArrowBuf) except that it handles the case when index is greater than or equal to current value capacity of the vector. |
void | setSafe(int index, long start, int length, ArrowBuf buffer): Same as set(int, long, int, ArrowBuf) except that it handles the case when index is greater than or equal to current value capacity of the vector. |
void | setValueCount(int valueCount): Sets the value count for the vector. |
void | setValueLengthSafe(int index, int length): Sets the value length for an element. |
int | sizeOfValueBuffer(): Provide the number of bytes contained in the valueBuffer. |
void | splitAndTransferTo(int startIndex, int length, BaseLargeVariableWidthVector target): Slice this vector at desired index and length and transfer the corresponding data to the target vector. |
void | transferTo(BaseLargeVariableWidthVector target): Transfer this vector's data to another vector. |
void | zeroVector(): Zero out the vector and the data in associated buffers. |
checkBufRefs, getAllocator, getReader, getReaderImpl, getValidityBufferSizeFromCount, iterator, releaseBuffer, toString, transferBuffer
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
getAllocator, getMinorType, getObject, getReader, makeTransferPair
forEach, iterator, spliterator
public static final int OFFSET_WIDTH
protected static final byte[] emptyByteArray
protected ArrowBuf validityBuffer
protected ArrowBuf valueBuffer
protected ArrowBuf offsetBuffer
protected int valueCount
protected int lastSet
protected final Field field
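How the validityBuffer, offsetBuffer and valueBuffer fields relate to each other can be pictured with a simplified, heap-based model. The sketch below is not Arrow code: the class LargeVarWidthLayoutSketch and all of its members are hypothetical, plain arrays stand in for ArrowBuf, elements must be written in index order, and OFFSET_WIDTH is assumed to be 8 bytes, which is consistent with the long offsets returned by getStartOffset(int).

```java
import java.util.Arrays;

/** Simplified model of a large variable-width vector layout (hypothetical, not Arrow code). */
final class LargeVarWidthLayoutSketch {
    // Assumption: one offset entry is a 64-bit long, i.e. 8 bytes wide.
    static final int OFFSET_WIDTH = 8;

    final long[] offsets;        // stands in for offsetBuffer: valueCount + 1 entries
    final boolean[] validity;    // stands in for validityBuffer: one flag per element
    byte[] data = new byte[0];   // stands in for valueBuffer: all element bytes, concatenated

    LargeVarWidthLayoutSketch(int valueCount) {
        offsets = new long[valueCount + 1];
        validity = new boolean[valueCount];
    }

    /** Writes value at index. This sketch only supports writing elements in index order. */
    void set(int index, byte[] value) {
        long start = offsets[index];
        data = Arrays.copyOf(data, (int) start + value.length);
        System.arraycopy(value, 0, data, (int) start, value.length);
        offsets[index + 1] = start + value.length;  // end of this element, start of the next
        validity[index] = true;
    }

    /** Marks index as null; in this sketch a null element occupies zero data bytes. */
    void setNull(int index) {
        offsets[index + 1] = offsets[index];
        validity[index] = false;
    }

    /** Returns a copy of the element bytes, or null when the element is not set. */
    byte[] get(int index) {
        if (!validity[index]) {
            return null;
        }
        return Arrays.copyOfRange(data, (int) offsets[index], (int) offsets[index + 1]);
    }

    /** The length of an element is the difference between two adjacent offsets. */
    int getValueLength(int index) {
        return (int) (offsets[index + 1] - offsets[index]);
    }
}
```

In this model an element is never stored with an explicit length: reading it only needs offsets[index] and offsets[index + 1], which is why the offsets array holds one more entry than there are elements.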
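public BaseLargeVariableWidthVector(Field field, BufferAllocator allocator)
Constructs a new instance.
Parameters:
field - The field materialized by this vector.
allocator - The allocator to use for creating/resizing buffers.

public String getName()
Description copied from interface: ValueVector
Gets the name of the vector.
getName in interface ValueVector
getName in class BaseValueVector

public ArrowBuf getValidityBuffer()
Get buffer that manages the validity (NULL or NON-NULL nature) of elements in the vector.
getValidityBuffer in interface ValueVector

public ArrowBuf getDataBuffer()
Get the buffer that stores the data for elements in the vector.
getDataBuffer in interface ValueVector

public ArrowBuf getOffsetBuffer()
Get the buffer that stores the offsets for elements in the vector.
getOffsetBuffer in interface ValueVector

public long getOffsetBufferAddress()
Get the memory address of buffer that stores the offsets for elements in the vector.
getOffsetBufferAddress in interface FieldVector

public long getValidityBufferAddress()
Get the memory address of buffer that manages the validity (NULL or NON-NULL nature) of elements in the vector.
getValidityBufferAddress in interface FieldVector

public long getDataBufferAddress()
Get the memory address of buffer that stores the data for elements in the vector.
getDataBufferAddress in interface FieldVector

public void setInitialCapacity(int valueCount)
Sets the desired value capacity for the vector.
setInitialCapacity in interface ValueVector
Parameters:
valueCount - desired number of elements in the vector

public void setInitialCapacity(int valueCount, double density)
Sets the desired value capacity for the vector.
setInitialCapacity in interface DensityAwareVector
Parameters:
valueCount - desired number of elements in the vector
density - average number of bytes per variable width element

public double getDensity()
Get the density of this vector.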
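Density, described above as the average number of bytes per variable width element, ties the element count to the size of the data buffer. The following sketch only illustrates that arithmetic and makes no claim about Arrow's actual sizing or rounding rules. DensitySketch and its methods are hypothetical names, and the 8-byte offset width is an assumption.

```java
/** Illustrates the arithmetic behind "density" (hypothetical helper, not Arrow code). */
final class DensitySketch {
    // Assumption: one offset entry is 8 bytes wide.
    static final int OFFSET_WIDTH = 8;

    /** Data bytes needed for valueCount elements that average `density` bytes each. */
    static long dataBytesFor(int valueCount, double density) {
        return (long) Math.ceil(valueCount * density);
    }

    /** Offset bytes needed: one entry per element plus one closing entry. */
    static long offsetBytesFor(int valueCount) {
        return (long) (valueCount + 1) * OFFSET_WIDTH;
    }

    /** Average bytes per element, derived from the first and last used offsets. */
    static double density(long[] offsets, int valueCount) {
        if (valueCount == 0) {
            return 0.0;
        }
        return (double) (offsets[valueCount] - offsets[0]) / valueCount;
    }
}
```

For example, under these assumptions 1024 elements at a density of 8.0 imply 8192 data bytes and 8200 offset bytes.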
public int getValueCapacity()
Get the current capacity which does not exceed either validity buffer or offset buffer.
getValueCapacity in interface ValueVector

public void zeroVector()
Zero out the vector and the data in associated buffers.

public void reset()
Reset the vector to initial state. Same as zeroVector(). Note that this method doesn't release any memory.
reset in interface ValueVector

public void close()
Close the vector and release the associated buffers.
close in interface Closeable
close in interface AutoCloseable
close in interface ValueVector
close in class BaseValueVector

public void clear()
Same as close().
clear in interface ValueVector
clear in class BaseValueVector

@Deprecated public List<BufferBacked> getFieldInnerVectors()
Deprecated. This API will be removed as the current implementations no longer support inner vectors.
getFieldInnerVectors in interface FieldVector

public void initializeChildrenFromFields(List<Field> children)
Initialize the children in schema for this Field.
initializeChildrenFromFields in interface FieldVector
Parameters:
children - the schema
Throws:
IllegalArgumentException - if children is a non-empty list for scalar types.

public List<FieldVector> getChildrenFromFields()
Get the inner child vectors.
getChildrenFromFields in interface FieldVector

public void loadFieldBuffers(ArrowFieldNode fieldNode, List<ArrowBuf> ownBuffers)
Load the buffers of this vector with provided source buffers.
loadFieldBuffers in interface FieldVector
Parameters:
fieldNode - the fieldNode indicating the value count
ownBuffers - the buffers for this Field (own buffers only, children not included)

public List<ArrowBuf> getFieldBuffers()
Get the buffers belonging to this vector.
getFieldBuffers in interface FieldVector

public void allocateNew()
Same as allocateNewSafe().
allocateNew in interface ValueVector

public boolean allocateNewSafe()
Allocate memory for the vector. See allocateNew(long, int) for allocating memory for a specific number of elements in the vector.
allocateNewSafe in interface ValueVector

public void allocateNew(long totalBytes, int valueCount)
Allocate memory for the vector to support storing at least the provided number of elements in the vector.
allocateNew in interface VariableWidthVector
Parameters:
totalBytes - desired total memory capacity
valueCount - the desired number of elements in the vector
Throws:
OutOfMemoryException - if memory allocation fails

public void allocateNew(int valueCount)
Description copied from interface: VariableWidthVector
Allocate a new memory space for this vector.
allocateNew in interface VariableWidthVector
Parameters:
valueCount - Number of values in the vector.

public void reAlloc()
Resize the vector to increase the capacity.
reAlloc in interface ValueVector

public void reallocDataBuffer()
Reallocate the data buffer.
Throws:
OversizedAllocationException - if the desired new size is more than max allowed
OutOfMemoryException - if the internal memory allocation fails

public void reallocValidityAndOffsetBuffers()
Reallocate the validity and offset buffers for this vector.
Note that the data buffer for variable length vectors moves independently of the companion validity and offset buffers. This is in contrast to what we have for fixed width vectors.
So even though we may have set up an initial capacity of 1024 elements in the vector, it is quite possible that we need to reAlloc() the data buffer when we are setting the 5th element in the vector, simply because previous variable length elements have exhausted the buffer capacity. However, we really don't need to reAlloc() the validity and offset buffers until we try to set the 1025th element. This is why the safe methods do a separate check to determine which buffer needs reallocation.
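The two independent capacity checks described above can be sketched with plain arrays. This is only an illustrative model under stated assumptions, not the Arrow implementation. SafeSetSketch, its fields and its doubling growth policy are all hypothetical, and the model starts with room for 4 elements and 8 data bytes so that both kinds of reallocation are easy to trigger.

```java
import java.util.Arrays;

/** Models separate growth of the data buffer and the offset/validity buffers (not Arrow code). */
final class SafeSetSketch {
    long[] offsets = new long[4 + 1];      // room for 4 elements (one extra closing offset)
    boolean[] validity = new boolean[4];   // room for 4 elements
    byte[] data = new byte[8];             // room for 8 data bytes
    int dataReallocs = 0;                  // how often the data buffer had to grow
    int offsetReallocs = 0;                // how often validity and offsets had to grow

    /** Writes value at index, growing only the buffer that is actually too small. */
    void setSafe(int index, byte[] value) {
        // Check 1: is there a slot for this index? Grows validity and offsets together.
        while (index >= validity.length) {
            validity = Arrays.copyOf(validity, validity.length * 2);
            offsets = Arrays.copyOf(offsets, validity.length + 1);
            offsetReallocs++;
        }
        // Check 2: is there room for the bytes? Grows the data buffer only.
        long start = offsets[index];
        while (data.length < start + value.length) {
            data = Arrays.copyOf(data, Math.max(1, data.length * 2));
            dataReallocs++;
        }
        System.arraycopy(value, 0, data, (int) start, value.length);
        offsets[index + 1] = start + value.length;
        validity[index] = true;
    }
}
```

Writing two 6-byte elements exhausts the 8 data bytes while plenty of element slots remain, so only the data buffer grows. Writing a fifth element later exhausts the slots while data bytes remain, so only the validity and offset arrays grow.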
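Throws:
OversizedAllocationException - if the desired new size is more than max allowed
OutOfMemoryException - if the internal memory allocation fails

public int getByteCapacity()
Get the size (number of bytes) of underlying data buffer.
getByteCapacity in interface VariableWidthVector

public int sizeOfValueBuffer()
Description copied from interface: VariableWidthVector
Provide the number of bytes contained in the valueBuffer.
sizeOfValueBuffer in interface VariableWidthVector

public int getBufferSize()
Get the size (number of bytes) of underlying buffers used by this vector.
getBufferSize in interface ValueVector

public int getBufferSizeFor(int valueCount)
Get the potential buffer size for a particular number of records.
getBufferSizeFor in interface ValueVector
Parameters:
valueCount - desired number of elements in the vector

public Field getField()
Get information about how this field is materialized.
getField in interface ValueVector

public ArrowBuf[] getBuffers(boolean clear)
Return the underlying buffers associated with this vector.
getBuffers in interface ValueVector
Parameters:
clear - Whether to clear vector before returning; the buffers will still be refcounted but the returned array will be the only reference to them
Returns:
the buffers used by this vector instance.

public TransferPair getTransferPair(String ref, BufferAllocator allocator, CallBack callBack)
Construct a transfer pair of this vector and another vector of same type.
getTransferPair in interface ValueVector
Parameters:
ref - name of the target vector
allocator - allocator for the target vector
callBack - not used

public TransferPair getTransferPair(BufferAllocator allocator)
Construct a transfer pair of this vector and another vector of same type.
getTransferPair in interface ValueVector
getTransferPair in class BaseValueVector
Parameters:
allocator - allocator for the target vector

public abstract TransferPair getTransferPair(String ref, BufferAllocator allocator)
Construct a transfer pair of this vector and another vector of same type.
getTransferPair in interface ValueVector
Parameters:
ref - name of the target vector
allocator - allocator for the target vector

public void transferTo(BaseLargeVariableWidthVector target)
Transfer this vector's data to another vector.
Parameters:
target - destination vector for transfer

public void splitAndTransferTo(int startIndex, int length, BaseLargeVariableWidthVector target)
Slice this vector at desired index and length and transfer the corresponding data to the target vector.
Parameters:
startIndex - start position of the split in source vector.
length - length of the split.
target - destination vector

public int getNullCount()
Get the number of elements that are null in the vector.
getNullCount in interface ValueVector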
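
public boolean isSafe(int index)
Check if the given index is within the current value capacity of the vector.
Parameters:
index - position to check

public boolean isNull(int index)
Check if element at given index is null.
isNull in interface ValueVector
Parameters:
index - position of element

public int isSet(int index)
Counterpart of isNull(int); returns 1 if the element is non-null and 0 if it is null.
Parameters:
index - position of element

public int getValueCount()
Get the value count of vector.
getValueCount in interface ValueVector

public void setValueCount(int valueCount)
Sets the value count for the vector.
setValueCount in interface ValueVector
Parameters:
valueCount - value count

public void fillEmpties(int index)
Create holes in the vector up to the given index (exclusive).
Parameters:
index - target index

public void setLastSet(int value)
Set the index of last non-null element in the vector. See setValueCount(int).
Parameters:
value - desired index of last non-null element.

public int getLastSet()
Get the index of last non-null element in the vector.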
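The interplay between fillEmpties(int), setLastSet(int) and getLastSet() can be pictured with a small offsets-only model. It is a sketch under assumptions rather than Arrow's code. FillHolesSketch and setLength are hypothetical names, elements are assumed to be written in increasing index order, and a "hole" is modelled as an element whose start and end offsets are equal.

```java
/** Models how skipped elements get zero-length offsets (hypothetical, not Arrow code). */
final class FillHolesSketch {
    final long[] offsets;   // valueCount + 1 entries, as in an offset buffer
    int lastSet = -1;       // index of the last element that has valid offsets

    FillHolesSketch(int valueCount) {
        offsets = new long[valueCount + 1];
    }

    /** Gives every element after lastSet and before index (exclusive) a length of zero. */
    void fillEmpties(int index) {
        for (int i = lastSet + 1; i < index; i++) {
            offsets[i + 1] = offsets[i];
        }
        lastSet = index - 1;
    }

    /** Records an element of the given length at index, filling skipped slots first. */
    void setLength(int index, int length) {
        fillEmpties(index);
        offsets[index + 1] = offsets[index] + length;
        lastSet = index;
    }
}
```

Without the fill step, skipping from element 0 to element 3 would leave the offsets of elements 1 and 2 at their stale initial values, and the start offset of element 3 would be wrong.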
public void setIndexDefined(int index)
Mark the particular position in the vector as non-null.
setIndexDefined in interface VectorDefinitionSetter
Parameters:
index - position of the element.

public void setValueLengthSafe(int index, int length)
Sets the value length for an element.
Parameters:
index - position of the element to set
length - length of the element

public int getValueLength(int index)
Get the length of the variable width element at the specified index.
Parameters:
index - position of element to get

public void set(int index, byte[] value)
Set the variable length element at the specified index to the supplied byte array. This is the same as using set(int, byte[], int, int) with start as 0 and length as value.length.
Parameters:
index - position of the element to set
value - array of bytes to write

public void setSafe(int index, byte[] value)
Same as set(int, byte[]) except that it handles the case where index and length of new element are beyond the existing capacity of the vector.
Parameters:
index - position of the element to set
value - array of bytes to write

public void set(int index, byte[] value, int start, int length)
Set the variable length element at the specified index to the supplied byte array.
Parameters:
index - position of the element to set
value - array of bytes to write
start - start index in array of bytes
length - length of data in array of bytes

public void setSafe(int index, byte[] value, int start, int length)
Same as set(int, byte[], int, int) except that it handles the case where index and length of new element are beyond the existing capacity of the vector.
Parameters:
index - position of the element to set
value - array of bytes to write
start - start index in array of bytes
length - length of data in array of bytes

public void set(int index, ByteBuffer value, int start, int length)
Set the variable length element at the specified index to the content in supplied ByteBuffer.
Parameters:
index - position of the element to set
value - ByteBuffer with data
start - start index in ByteBuffer
length - length of data in ByteBuffer

public void setSafe(int index, ByteBuffer value, int start, int length)
Same as set(int, ByteBuffer, int, int) except that it handles the case where index and length of new element are beyond the existing capacity of the vector.
Parameters:
index - position of the element to set
value - ByteBuffer with data
start - start index in ByteBuffer
length - length of data in ByteBuffer

public void setNull(int index)
Set the element at the given index to null.
setNull in interface FieldVector
Parameters:
index - position of element

public void set(int index, int isSet, long start, long end, ArrowBuf buffer)
Store the given value at a particular position in the vector.
Parameters:
index - position of the new value
isSet - 0 for NULL value, 1 otherwise
start - start position of data in buffer
end - end position of data in buffer
buffer - data buffer containing the variable width element to be stored in the vector

public void setSafe(int index, int isSet, long start, long end, ArrowBuf buffer)
Same as set(int, int, long, long, ArrowBuf) except that it handles the case when index is greater than or equal to current value capacity of the vector.
Parameters:
index - position of the new value
isSet - 0 for NULL value, 1 otherwise
start - start position of data in buffer
end - end position of data in buffer
buffer - data buffer containing the variable width element to be stored in the vector

public void set(int index, long start, int length, ArrowBuf buffer)
Store the given value at a particular position in the vector.
Parameters:
index - position of the new value
start - start position of data in buffer
length - length of data in buffer
buffer - data buffer containing the variable width element to be stored in the vector

public void setSafe(int index, long start, int length, ArrowBuf buffer)
Same as set(int, long, int, ArrowBuf) except that it handles the case when index is greater than or equal to current value capacity of the vector.
Parameters:
index - position of the new value
start - start position of data in buffer
length - length of data in buffer
buffer - data buffer containing the variable width element to be stored in the vector

protected final void fillHoles(int index)

protected final void setBytes(int index, byte[] value, int start, int length)

protected final long getStartOffset(int index)
Gets the starting offset of a record, given its index.
Parameters:
index - index of the record.

protected final void handleSafe(int index, int dataLength)

public static byte[] get(ArrowBuf data, ArrowBuf offset, int index)
Method used by JSON writer to read a variable width element from the variable width vector and write to JSON.
This method should not be used externally.
Parameters:
data - buffer storing the variable width vector elements
offset - buffer storing the offsets of variable width vector elements
index - position of the element in the vector

public static ArrowBuf set(ArrowBuf buffer, BufferAllocator allocator, int valueCount, int index, long value)
Method used by JSON reader to explicitly set the offsets of the variable width vector data.
This method should not be used externally.
Parameters:
buffer - ArrowBuf to store offsets for variable width elements
allocator - memory allocator
valueCount - number of elements
index - position of the element
value - offset of the element

public void copyFrom(int fromIndex, int thisIndex, ValueVector from)
Copy a cell value from a particular index in source vector to a particular position in this vector.
copyFrom in interface ValueVector
copyFrom in class BaseValueVector
Parameters:
fromIndex - position to copy from in source vector
thisIndex - position to copy to in this vector
from - source vector

public void copyFromSafe(int fromIndex, int thisIndex, ValueVector from)
Same as copyFrom(int, int, ValueVector) except that it handles the case when the capacity of the vector needs to be expanded before copy.
copyFromSafe in interface ValueVector
copyFromSafe in class BaseValueVector
Parameters:
fromIndex - position to copy from in source vector
thisIndex - position to copy to in this vector
from - source vector

public ArrowBufPointer getDataPointer(int index)
Description copied from interface: ElementAddressableVector
Gets the pointer for the data at the given index.
getDataPointer in interface ElementAddressableVector
Parameters:
index - the index for the data.

public ArrowBufPointer getDataPointer(int index, ArrowBufPointer reuse)
Description copied from interface: ElementAddressableVector
Gets the pointer for the data at the given index.
getDataPointer in interface ElementAddressableVector
Parameters:
index - the index for the data.
reuse - the data pointer to fill, this avoids creating a new pointer object.

public int hashCode(int index)
Description copied from interface: ValueVector
Returns hashCode of element in index with the default hasher.
hashCode in interface ValueVector

public int hashCode(int index, ArrowBufHasher hasher)
Description copied from interface: ValueVector
Returns hashCode of element in index with the given hasher.
hashCode in interface ValueVector

public <OUT,IN> OUT accept(VectorVisitor<OUT,IN> visitor, IN value)
Description copied from interface: ValueVector
Accept a generic VectorVisitor and return the result.
accept in interface ValueVector
Type Parameters:
OUT - the output result type.
IN - the input data together with visitor.

Copyright © 2023 The Apache Software Foundation. All rights reserved.