Struct GenericByteDictionaryBuilder
pub struct GenericByteDictionaryBuilder<K, T>
where
    K: ArrowDictionaryKeyType,
    T: ByteArrayType,
{
    state: RandomState,
    dedup: HashTable<usize>,
    keys_builder: PrimitiveBuilder<K>,
    values_builder: GenericByteBuilder<T>,
}
Builder for a DictionaryArray of GenericByteArray values, for example to map a set of integer keys to String values.
Note that the hash table used here for deduplication will not scale to very large arrays, and the resulting dictionary is not ordered.
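To make the idea concrete, here is a minimal std-only sketch of dictionary encoding. It is not the builder's implementation: `dictionary_encode` is a hypothetical helper, and it uses `std::collections::HashMap` and `usize` keys purely for illustration. It shows that each distinct value is stored once and that values land in the dictionary in first-seen order rather than sorted order.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of arrow): dictionary-encode a column of
/// optional strings. Returns `(keys, values)`, where every non-null key is
/// an index into `values`.
fn dictionary_encode(column: &[Option<&str>]) -> (Vec<Option<usize>>, Vec<String>) {
    let mut lookup: HashMap<&str, usize> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut keys = Vec::with_capacity(column.len());

    for &slot in column {
        match slot {
            // Nulls get a null key and never touch the dictionary.
            None => keys.push(None),
            Some(s) => {
                let next = values.len();
                // Reuse the key if the value was seen before, else store it.
                let key = *lookup.entry(s).or_insert_with(|| {
                    values.push(s.to_string());
                    next
                });
                keys.push(Some(key));
            }
        }
    }
    (keys, values)
}
```

Encoding `["pear", "apple", null, "pear"]` with this sketch stores "pear" before "apple", which is why the resulting dictionary cannot be assumed to be ordered.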
Fields
state: RandomState
dedup: HashTable<usize>
keys_builder: PrimitiveBuilder<K>
values_builder: GenericByteBuilder<T>
Implementations
impl<K, T> GenericByteDictionaryBuilder<K, T>
where
    K: ArrowDictionaryKeyType,
    T: ByteArrayType,
pub fn new() -> GenericByteDictionaryBuilder<K, T>
Creates a new GenericByteDictionaryBuilder.
pub fn with_capacity(
    keys_capacity: usize,
    value_capacity: usize,
    data_capacity: usize,
) -> GenericByteDictionaryBuilder<K, T>
Creates a new GenericByteDictionaryBuilder with the provided capacities:
keys_capacity: the number of keys, i.e. the length of the array to build
value_capacity: the number of distinct dictionary values, i.e. the size of the dictionary
data_capacity: the total number of bytes across all distinct values in the dictionary
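When the input is known up front, the three capacities can be derived from it. The following is a sketch under that assumption; `estimate_capacities` is a hypothetical helper, not an arrow function, and it works on `&str` values only.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of arrow): derive
/// `(keys_capacity, value_capacity, data_capacity)` from a known column.
fn estimate_capacities(column: &[&str]) -> (usize, usize, usize) {
    let distinct: HashSet<&str> = column.iter().copied().collect();

    // One key per slot of the array to build.
    let keys_capacity = column.len();
    // One dictionary entry per distinct value.
    let value_capacity = distinct.len();
    // Bytes needed to store every distinct value exactly once.
    let data_capacity: usize = distinct.iter().map(|s| s.len()).sum();

    (keys_capacity, value_capacity, data_capacity)
}
```

For the column `["ab", "cde", "ab", "ab"]` this yields 4 keys, 2 distinct values, and 5 bytes of value data.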
pub fn new_with_dictionary(
    keys_capacity: usize,
    dictionary_values: &GenericByteArray<T>,
) -> Result<GenericByteDictionaryBuilder<K, T>, ArrowError>
Creates a new GenericByteDictionaryBuilder from a keys capacity and a dictionary that is initialized with the given values.
The indices of those dictionary values are used as keys.
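Example
// Import paths assume the arrow_array crate; adjust them if you use the arrow umbrella crate.
use arrow_array::builder::StringDictionaryBuilder;
use arrow_array::{Int16Array, StringArray};

// The null at index 0 still occupies a dictionary slot, so "abc" maps to key 1 and "def" to key 2.
let dictionary_values = StringArray::from(vec![None, Some("abc"), Some("def")]);
let mut builder = StringDictionaryBuilder::new_with_dictionary(3, &dictionary_values).unwrap();
builder.append("def").unwrap();
builder.append_null();
builder.append("abc").unwrap();
let dictionary_array = builder.finish();
// The key type (Int16) is inferred from the comparison below.
let keys = dictionary_array.keys();
assert_eq!(keys, &Int16Array::from(vec![Some(2), None, Some(1)]));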
impl<K, T> GenericByteDictionaryBuilder<K, T>
where
    K: ArrowDictionaryKeyType,
    T: ByteArrayType,
pub fn append(
    &mut self,
    value: impl AsRef<<T as ByteArrayType>::Native>,
) -> Result<<K as ArrowPrimitiveType>::Native, ArrowError>
Appends a value to the array. Returns the existing index if the value is already present in the values array, or a new index if the value is appended to the values array.
Returns an error if the new index would overflow the key type.
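The return value and the overflow error can be illustrated with a small std-only model. This is a sketch, not the real builder: `SketchBuilder` is hypothetical, fixes the key type to `i8`, uses a `String` error instead of `ArrowError`, and keeps a `HashMap` in place of the builder's internal hash table.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the builder, with `i8` keys.
#[derive(Default)]
struct SketchBuilder {
    lookup: HashMap<Vec<u8>, i8>,
    values: Vec<Vec<u8>>,
    keys: Vec<Option<i8>>,
}

impl SketchBuilder {
    /// Returns the existing key for `value`, or a freshly assigned one.
    /// Fails if a new key would not fit in `i8`.
    fn append(&mut self, value: impl AsRef<[u8]>) -> Result<i8, String> {
        let bytes = value.as_ref();
        let key = match self.lookup.get(bytes) {
            Some(&existing) => existing,
            None => {
                // The new key is the current dictionary length.
                let new_key = i8::try_from(self.values.len())
                    .map_err(|_| "dictionary key overflow".to_string())?;
                self.values.push(bytes.to_vec());
                self.lookup.insert(bytes.to_vec(), new_key);
                new_key
            }
        };
        self.keys.push(Some(key));
        Ok(key)
    }
}
```

In this model a dictionary with `i8` keys holds at most 128 distinct values (keys 0 through 127). Appending a value that is already present keeps succeeding after the dictionary is full, because no new key is needed.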
pub fn append_n(
    &mut self,
    value: impl AsRef<<T as ByteArrayType>::Native>,
    count: usize,
) -> Result<<K as ArrowPrimitiveType>::Native, ArrowError>
Appends a value to the array count times.
This is the same as append, but appends the same value multiple times without repeating the lookup.
Returns an error if the new index would overflow the key type.
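The saving comes from resolving the key once and then repeating it. The sketch below models this with a hypothetical `CountingBuilder` that counts hash lookups; it uses `usize` keys, ignores nulls and overflow, and is only meant to show the difference between a loop of `append` calls and a single `append_n`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in that counts how many hash lookups were performed.
#[derive(Default)]
struct CountingBuilder {
    lookup: HashMap<String, usize>,
    keys: Vec<usize>,
    lookups: usize,
}

impl CountingBuilder {
    /// Resolve `value` to its dictionary key, inserting it if it is new.
    fn key_for(&mut self, value: &str) -> usize {
        self.lookups += 1;
        let next = self.lookup.len();
        *self.lookup.entry(value.to_string()).or_insert(next)
    }

    /// One lookup per appended slot.
    fn append(&mut self, value: &str) -> usize {
        let key = self.key_for(value);
        self.keys.push(key);
        key
    }

    /// One lookup in total, then the key is repeated `count` times.
    fn append_n(&mut self, value: &str, count: usize) -> usize {
        let key = self.key_for(value);
        self.keys.extend(std::iter::repeat(key).take(count));
        key
    }
}
```

Both paths produce the same keys; only the number of lookups differs.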
pub fn append_value(&mut self, value: impl AsRef<<T as ByteArrayType>::Native>)
Infallibly appends a value to this builder.
Panics
Panics if the resulting length of the dictionary values array would overflow the key type K.
pub fn append_values(
    &mut self,
    value: impl AsRef<<T as ByteArrayType>::Native>,
    count: usize,
)
Infallibly appends a value to this builder count times.
This is the same as append_value, but appends the same value multiple times without repeating the lookup.
Panics
Panics if the resulting length of the dictionary values array would overflow the key type K.
pub fn append_null(&mut self)
Appends a null slot into the builder.
pub fn append_nulls(&mut self, n: usize)
Infallibly appends n null slots into the builder.
pub fn append_option(
    &mut self,
    value: Option<impl AsRef<<T as ByteArrayType>::Native>>,
)
Appends an Option value into the builder.
Panics
Panics if the resulting length of the dictionary values array would overflow the key type K.
pub fn append_options(
    &mut self,
    value: Option<impl AsRef<<T as ByteArrayType>::Native>>,
    count: usize,
)
Appends an Option value into the builder count times.
This is the same as append_option, but appends the same value multiple times without repeating the lookup.
Panics
Panics if the resulting length of the dictionary values array would overflow the key type K.
pub fn extend_dictionary(
    &mut self,
    dictionary: &TypedDictionaryArray<'_, K, GenericByteArray<T>>,
) -> Result<(), ArrowError>
Extends the builder with an existing dictionary array.
This is the same as Self::extend, but faster: it translates the dictionary values once rather than doing a lookup for each item in the iterator.
When a dictionary value (the actual mapped value) is null, the corresponding keys are appended as null.
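The "translate once, then remap" idea can be sketched with plain vectors. This is a simplified model, not the arrow implementation: `extend_from_dictionary` is a hypothetical function, keys are `usize`, values are strings, and key overflow is ignored.

```rust
use std::collections::HashMap;

/// Hypothetical model (not part of arrow): append a source dictionary,
/// given as `src_values` plus `src_keys` indexing into it, to a destination
/// dictionary made of `dst_values` and `dst_keys`.
fn extend_from_dictionary(
    dst_values: &mut Vec<String>,
    dst_keys: &mut Vec<Option<usize>>,
    src_values: &[Option<&str>],
    src_keys: &[Option<usize>],
) {
    // Index the destination values once.
    let mut lookup: HashMap<String, usize> = dst_values
        .iter()
        .enumerate()
        .map(|(i, v)| (v.clone(), i))
        .collect();

    // Translate each source dictionary value exactly once.
    let mut translated: Vec<Option<usize>> = Vec::with_capacity(src_values.len());
    for &value in src_values {
        translated.push(value.map(|v| {
            let next = dst_values.len();
            *lookup.entry(v.to_string()).or_insert_with(|| {
                dst_values.push(v.to_string());
                next
            })
        }));
    }

    // Remap keys by index only. A null key, or a key that points at a
    // null dictionary value, becomes a null key in the destination.
    dst_keys.extend(src_keys.iter().copied().map(|k| k.and_then(|k| translated[k])));
}
```

The hash lookups scale with the number of dictionary values rather than the number of keys, which is where the speedup over a per-item extend comes from when the source has many repeated keys.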
pub fn finish(&mut self) -> DictionaryArray<K>
Builds the DictionaryArray and resets this builder.
pub fn finish_cloned(&self) -> DictionaryArray<K>
Builds the DictionaryArray without resetting the builder.
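The difference between the two methods shows up in their receivers: finish takes `&mut self`, finish_cloned takes `&self`. A minimal std-only sketch, assuming a hypothetical `SnapshotBuilder` whose "array" is just a pair of vectors:

```rust
/// Hypothetical stand-in: the built "array" is the `(keys, values)` pair.
#[derive(Default)]
struct SnapshotBuilder {
    keys: Vec<Option<usize>>,
    values: Vec<String>,
}

impl SnapshotBuilder {
    /// Moves the contents out, leaving the builder empty and reusable.
    fn finish(&mut self) -> (Vec<Option<usize>>, Vec<String>) {
        (
            std::mem::take(&mut self.keys),
            std::mem::take(&mut self.values),
        )
    }

    /// Copies the contents out, leaving the builder untouched.
    fn finish_cloned(&self) -> (Vec<Option<usize>>, Vec<String>) {
        (self.keys.clone(), self.values.clone())
    }
}
```

A cloned snapshot is useful for inspecting intermediate results while continuing to append; the price in this model is a copy of the buffers.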
pub fn validity_slice(&self) -> Option<&[u8]>
Returns the current null buffer as a slice.
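To read such a slice, recall that Arrow validity bitmaps hold one bit per slot, least-significant bit first, with 1 meaning valid. The sketch below models that layout with a hypothetical `ValiditySketch`. It also assumes the bitmap is allocated lazily, so the slice is `None` until the first null is appended; treat that as an assumption of this sketch rather than a documented guarantee of the builder.

```rust
/// Hypothetical model of a validity bitmap: one bit per slot,
/// least-significant bit first, 1 = valid, 0 = null.
struct ValiditySketch {
    // Assumption: stays `None` until the first null arrives.
    bitmap: Option<Vec<u8>>,
    len: usize,
}

impl ValiditySketch {
    fn new() -> Self {
        Self { bitmap: None, len: 0 }
    }

    fn append(&mut self, valid: bool) {
        if !valid && self.bitmap.is_none() {
            // First null: materialize the bitmap with all earlier slots valid.
            let mut bytes = vec![0u8; (self.len + 7) / 8];
            for i in 0..self.len {
                bytes[i / 8] |= 1 << (i % 8);
            }
            self.bitmap = Some(bytes);
        }
        let i = self.len;
        self.len += 1;
        if let Some(bytes) = self.bitmap.as_mut() {
            // Grow by one byte whenever the new slot does not fit.
            if bytes.len() * 8 < self.len {
                bytes.push(0);
            }
            if valid {
                bytes[i / 8] |= 1 << (i % 8);
            }
        }
    }

    fn validity_slice(&self) -> Option<&[u8]> {
        self.bitmap.as_deref()
    }
}
```

Appending valid, valid, null, valid in this model yields the single byte `0b0000_1011`: bits 0, 1 and 3 are set, and bit 2 marks the null.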
Trait Implementations
impl<K, T> ArrayBuilder for GenericByteDictionaryBuilder<K, T>
where
    K: ArrowDictionaryKeyType,
    T: ByteArrayType,
fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
Returns the builder as a mutable Any reference.
fn into_box_any(self: Box<GenericByteDictionaryBuilder<K, T>>) -> Box<dyn Any>
Returns the boxed builder as a box of Any.
fn finish_cloned(&self) -> Arc<dyn Array>
Builds the array without resetting the builder.
impl<K, T> Debug for GenericByteDictionaryBuilder<K, T>
impl<K, T> Default for GenericByteDictionaryBuilder<K, T>
where
    K: ArrowDictionaryKeyType,
    T: ByteArrayType,
fn default() -> GenericByteDictionaryBuilder<K, T>
impl<K, T, V> Extend<Option<V>> for GenericByteDictionaryBuilder<K, T>
fn extend<I>(&mut self, iter: I)
where
    I: IntoIterator<Item = Option<V>>,
fn extend_one(&mut self, item: A)
This is a nightly-only experimental API (extend_one).
fn extend_reserve(&mut self, additional: usize)
This is a nightly-only experimental API (extend_one).