fn override_selector_strategy_if_needed(
plan_builder: ReadPlanBuilder,
projection_mask: &ProjectionMask,
offset_index: Option<&[OffsetIndexMetaData]>,
) -> ReadPlanBuilder
Override the selection strategy if needed.
Some pages can be skipped during row-group construction if they are not read
by the selections. This means that the data pages for those rows are never
loaded and definition/repetition levels are never read. When using
RowSelections, this works because skip_records() handles this
case and skips the page accordingly.
However, with the current mask design, all values must be read and decoded and then a mask filter is applied. Thus if any pages are skipped during row-group construction, the data pages are missing and cannot be decoded.
A simple example:
- the page size is 2, the mask is 100001, row selection should be read(1) skip(4) read(1)
- the ColumnChunkData would be page1(10), page2(skipped), page3(01)
When the row selection is used to skip(4), page2 is never read, so in this case we cannot decode all the rows and then apply a mask. Applying the bit mask correctly requires all 6 values to be read, but page2 is not in memory.
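
The example above can be reproduced with a small standalone sketch. It does not use the crate's types: `pages_touched`, `choose_strategy`, and the `Strategy` enum are hypothetical names introduced for illustration, and the fixed number of rows per page is a simplifying assumption (real page boundaries come from the offset index). The sketch also assumes that the override falls back to selector-based reading whenever some page contains no selected row.

```rust
/// Hypothetical stand-in for the two ways of applying a selection.
#[derive(Debug, PartialEq)]
enum Strategy {
    /// Decode every value, then filter with a bit mask.
    Mask,
    /// Use read/skip selectors, which tolerate pages that were never loaded.
    Selectors,
}

/// For each page, report whether the mask selects at least one of its rows.
/// Assumes every page holds exactly `page_size` rows (illustration only).
fn pages_touched(mask: &[bool], page_size: usize) -> Vec<bool> {
    mask.chunks(page_size)
        .map(|page| page.iter().any(|&selected| selected))
        .collect()
}

/// Pick the mask strategy only if no page would be skipped; otherwise the
/// skipped pages are missing and the mask cannot be applied to them.
fn choose_strategy(mask: &[bool], page_size: usize) -> Strategy {
    if pages_touched(mask, page_size).iter().all(|&touched| touched) {
        Strategy::Mask
    } else {
        Strategy::Selectors
    }
}
```

For the mask 100001 with two rows per page, `pages_touched` reports that page2 contains no selected row, so `choose_strategy` returns `Strategy::Selectors` in this sketch.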