Class VectorRunDeduplicator<V extends ValueVector>

java.lang.Object
org.apache.arrow.algorithm.deduplicate.VectorRunDeduplicator<V>
Type Parameters:
V - vector type.
All Implemented Interfaces:
AutoCloseable

public class VectorRunDeduplicator<V extends ValueVector> extends Object implements AutoCloseable
Removes adjacent equal elements from a vector, keeping one value per run. If the vector is sorted, this removes all duplicate values in the vector.
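The run semantics described above can be illustrated with a minimal plain-Java sketch over an int array. This is not the library's implementation and does not use Arrow vectors. The class RunDedupSketch and its methods are hypothetical names chosen for illustration, and the sketch assumes the first element always starts a run.

```java
import java.util.Arrays;

// Hypothetical illustration of run deduplication on a plain int array.
// Assumption: the first element always counts as the start of a run.
public class RunDedupSketch {

    // True if position i starts a new run of adjacent equal values.
    private static boolean isRunStart(int[] values, int i) {
        return i == 0 || values[i] != values[i - 1];
    }

    // Number of runs of adjacent equal values.
    public static int runCount(int[] values) {
        int count = 0;
        for (int i = 0; i < values.length; i++) {
            if (isRunStart(values, i)) {
                count++;
            }
        }
        return count;
    }

    // One value per run, in original order.
    public static int[] dedupValues(int[] values) {
        int[] out = new int[runCount(values)];
        int dst = 0;
        for (int i = 0; i < values.length; i++) {
            if (isRunStart(values, i)) {
                out[dst++] = values[i];
            }
        }
        return out;
    }

    // Length of each run, in original order.
    public static int[] runLengths(int[] values) {
        int[] lengths = new int[runCount(values)];
        int run = -1;
        for (int i = 0; i < values.length; i++) {
            if (isRunStart(values, i)) {
                run++;
            }
            lengths[run]++;
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Unsorted input: the trailing 1 is kept because it is not adjacent to the first run of 1s.
        int[] input = {1, 1, 2, 2, 2, 1};
        System.out.println(runCount(input));                      // 3
        System.out.println(Arrays.toString(dedupValues(input)));  // [1, 2, 1]
        System.out.println(Arrays.toString(runLengths(input)));   // [2, 3, 1]
    }
}
```

In this sketch, sorting the input first (giving 1, 1, 1, 2, 2, 2) would produce two runs, which is why deduplicating a sorted vector removes every duplicate value.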
  • Constructor Details

    • VectorRunDeduplicator

      public VectorRunDeduplicator(V vector, BufferAllocator allocator)
      Constructs a vector run deduplicator for a given vector.
      Parameters:
      vector - the vector to deduplicate. Ownership is NOT taken.
      allocator - the allocator used to allocate buffers for run start indices.
  • Method Details

    • getRunCount

      public int getRunCount()
      Gets the number of runs of adjacent equal values, that is, the number of values that differ from their predecessor.
      Returns:
      the run count.
    • populateDeduplicatedValues

      public void populateDeduplicatedValues(V outVector)
      Populates the output vector with the values of the input vector after adjacent duplicates are removed.
      Parameters:
      outVector - the output vector.
    • populateRunLengths

      public void populateRunLengths(IntVector lengthVector)
      Populates the given vector with the length of each run of adjacent equal values.
      Parameters:
      lengthVector - the vector for holding length values.
    • close

      public void close()
      Releases the buffers allocated for the run start indices.
      Specified by:
      close in interface AutoCloseable