Struct DictionaryTracker

pub struct DictionaryTracker {
    written: HashMap<i64, ArrayData>,
    dict_ids: Vec<i64>,
    error_on_replacement: bool,
}

Keeps track of dictionaries that have been written, to avoid emitting the same dictionary multiple times.

Can optionally error if an update to an existing dictionary is attempted, which isn’t allowed in the FileWriter.

Fields

written: HashMap<i64, ArrayData>
dict_ids: Vec<i64>
error_on_replacement: bool

Implementations

impl DictionaryTracker

pub fn new(error_on_replacement: bool) -> Self

Create a new DictionaryTracker.

If error_on_replacement is true, an error will be generated if an update to an existing dictionary is attempted.

pub fn next_dict_id(&mut self) -> i64

Record and return the next dictionary ID.

pub fn dict_id(&mut self) -> &[i64]

Return the sequence of dictionary IDs in the order they should be observed while traversing the schema.
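
The bookkeeping behind `next_dict_id` and `dict_id` can be pictured with a small stand-in type. This is only a sketch: `IdLog` and `example_ids` are hypothetical names, and the assumption that IDs start at 0 and grow by one per call is not stated in the text above.

```rust
/// Hypothetical stand-in for the ID bookkeeping; not the arrow-ipc type.
#[derive(Default)]
struct IdLog {
    dict_ids: Vec<i64>,
}

impl IdLog {
    /// Record and return the next ID.
    /// Assumption: the first ID is 0 and each later ID is the previous one plus 1.
    fn next_dict_id(&mut self) -> i64 {
        let next = self.dict_ids.last().map_or(0, |id| id + 1);
        self.dict_ids.push(next);
        next
    }

    /// Return the IDs in the order they were recorded.
    fn dict_id(&self) -> &[i64] {
        &self.dict_ids
    }
}

/// Allocate three IDs and return the recorded sequence.
fn example_ids() -> Vec<i64> {
    let mut log = IdLog::default();
    log.next_dict_id();
    log.next_dict_id();
    log.next_dict_id();
    log.dict_id().to_vec()
}
```

Under these assumptions, the recorded sequence matches the order in which IDs were requested, which is what lets a reader walk the schema in the same order and pair each dictionary field with its ID.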

pub fn insert(&mut self, dict_id: i64, column: &ArrayRef) -> Result<bool, ArrowError>

Deprecated since 56.1.0: use insert_column instead.

Keep track of the dictionary with the given ID and values. Behavior:

If this ID has been written already and has the same data, return Ok(false) to indicate that the dictionary was not actually inserted (because it’s already been seen).
If this ID has been written already but with different data, and this tracker is configured to return an error, return an error.
If the tracker has not been configured to error on replacement or this dictionary has never been seen before, return Ok(true) to indicate that the dictionary was just inserted.
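
The three cases above can be modelled with a toy tracker that uses only the standard library. `ToyTracker` is a hypothetical illustration, not the arrow-ipc implementation: it stores dictionary values as strings rather than `ArrayData` and reports errors as `String` rather than `ArrowError`.

```rust
use std::collections::HashMap;

/// Hypothetical toy tracker; the real type stores `ArrayData` per dictionary ID.
struct ToyTracker {
    written: HashMap<i64, Vec<String>>,
    error_on_replacement: bool,
}

impl ToyTracker {
    fn new(error_on_replacement: bool) -> Self {
        Self {
            written: HashMap::new(),
            error_on_replacement,
        }
    }

    /// Mirrors the three documented cases of `insert`.
    fn insert(&mut self, dict_id: i64, values: &[&str]) -> Result<bool, String> {
        let values: Vec<String> = values.iter().map(|s| s.to_string()).collect();
        if let Some(old) = self.written.get(&dict_id) {
            // Case 1: same ID, same data. Nothing is inserted.
            if *old == values {
                return Ok(false);
            }
            // Case 2: same ID, different data, tracker configured to error.
            if self.error_on_replacement {
                return Err(format!("dictionary {} would be replaced", dict_id));
            }
        }
        // Case 3: unseen ID, or replacement is permitted.
        self.written.insert(dict_id, values);
        Ok(true)
    }
}
```

With `ToyTracker::new(true)`, inserting `["a", "b"]` under ID 0 twice gives `Ok(true)` and then `Ok(false)`, while a later `["a", "c"]` under the same ID gives an error. With `ToyTracker::new(false)`, that last call returns `Ok(true)` instead.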

pub fn insert_column(&mut self, dict_id: i64, column: &ArrayRef, dict_handling: DictionaryHandling) -> Result<DictionaryUpdate, ArrowError>

Keep track of the dictionary with the given ID and values. The return value indicates what, if any, update to the internal map took place and how it should be interpreted based on the dict_handling parameter.

Returns

Ok(DictionaryUpdate::New) - If the dictionary was not previously written
Ok(DictionaryUpdate::Replaced) - If the dictionary was previously written with completely different data, or if the data is a delta of the existing data but dict_handling is set to DictionaryHandling::Resend
Ok(DictionaryUpdate::Delta) - If the dictionary was previously written, the new data is a delta of the old, and dict_handling is set to DictionaryHandling::Delta
Err(e) - If the dictionary was previously written with different data, and error_on_replacement is set to true.
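
The New, Replaced, and Delta outcomes listed above, plus the error case, can be sketched as a small decision procedure. Everything here is a hypothetical model: `ToyColumnTracker`, `ToyHandling`, and `ToyUpdate` are illustrative names, values are plain `i32`s, and "delta" is assumed to mean that the old values form a strict prefix of the new ones. The `Unchanged` variant for identical data and the order of the checks (delta before the replacement error) are also assumptions, since the list above does not spell them out.

```rust
use std::collections::HashMap;

/// Hypothetical counterpart of the `dict_handling` parameter.
#[derive(Clone, Copy)]
enum ToyHandling {
    Resend,
    Delta,
}

/// Hypothetical counterpart of the returned update kind.
#[derive(Debug, PartialEq)]
enum ToyUpdate {
    Unchanged,       // assumption: identical data triggers no update
    New,
    Replaced,
    Delta(Vec<i32>), // the values appended to the old dictionary
}

struct ToyColumnTracker {
    written: HashMap<i64, Vec<i32>>,
    error_on_replacement: bool,
}

impl ToyColumnTracker {
    fn new(error_on_replacement: bool) -> Self {
        Self { written: HashMap::new(), error_on_replacement }
    }

    fn insert_column(
        &mut self,
        dict_id: i64,
        values: &[i32],
        handling: ToyHandling,
    ) -> Result<ToyUpdate, String> {
        let update = match self.written.get(&dict_id) {
            None => ToyUpdate::New,
            Some(old) if old.as_slice() == values => return Ok(ToyUpdate::Unchanged),
            Some(old) => {
                // Assumption: a delta extends the old values without changing them.
                let is_delta = values.len() > old.len() && values.starts_with(old);
                match handling {
                    ToyHandling::Delta if is_delta => {
                        ToyUpdate::Delta(values[old.len()..].to_vec())
                    }
                    _ if self.error_on_replacement => {
                        return Err(format!("dictionary {} would be replaced", dict_id));
                    }
                    _ => ToyUpdate::Replaced,
                }
            }
        };
        // Remember the full, latest values for this ID.
        self.written.insert(dict_id, values.to_vec());
        Ok(update)
    }
}
```

In this model, a tracker created with `error_on_replacement` set to false classifies `[1, 2]` followed by `[1, 2, 3]` under `ToyHandling::Delta` as `New` and then `Delta(vec![3])`. The same second call under `ToyHandling::Resend` is classified as `Replaced`.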