parquet::arrow::arrow_reader::selection

Struct RowSelection

pub struct RowSelection {
    selectors: Vec<RowSelector>,
}

A RowSelection selects or skips a given number of rows when scanning a Parquet file.

It is applied before column data is read, so it can avoid the IO that would otherwise fetch the skipped data into memory.

A typical use case is using the PageIndex to filter out rows that don't satisfy a predicate.

§Example

use parquet::arrow::arrow_reader::{RowSelection, RowSelector};

let selectors = vec![
    RowSelector::skip(5),
    RowSelector::select(5),
    RowSelector::select(5),
    RowSelector::skip(5),
];

// Creating a selection will combine adjacent selectors
let selection: RowSelection = selectors.into();

let expected = vec![
    RowSelector::skip(5),
    RowSelector::select(10),
    RowSelector::skip(5),
];

let actual: Vec<RowSelector> = selection.into();
assert_eq!(actual, expected);

// you can also create a selection from consecutive ranges
let ranges = vec![5..10, 10..15];
let selection =
  RowSelection::from_consecutive_ranges(ranges.into_iter(), 20);
let actual: Vec<RowSelector> = selection.into();
assert_eq!(actual, expected);
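
The combining of adjacent selectors shown in the example can be modelled without the crate. The following is a minimal std-only sketch, not the crate's implementation: `Run` and `normalize` are hypothetical names, and dropping zero-length runs is an assumption of the sketch.

```rust
/// Hypothetical stand-in for `RowSelector`: a run length plus a skip flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Run {
    row_count: usize,
    skip: bool,
}

/// Drop empty runs (an assumption of this sketch) and merge neighbouring
/// runs of the same kind, so that skips and selects alternate.
fn normalize(runs: &[Run]) -> Vec<Run> {
    let mut out: Vec<Run> = Vec::new();
    for run in runs.iter().copied().filter(|r| r.row_count > 0) {
        match out.last_mut() {
            // Same kind as the previous run: extend it.
            Some(last) if last.skip == run.skip => last.row_count += run.row_count,
            // Different kind, or first run: start a new one.
            _ => out.push(run),
        }
    }
    out
}
```

Feeding it skip 5, select 5, select 5, skip 5 yields skip 5, select 10, skip 5, the same shape as `expected` in the example.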

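A RowSelection maintains the following invariants:

- It contains no RowSelector of 0 rows
- Consecutive RowSelectors alternate skipping or selecting rows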

Fields§

§selectors: Vec<RowSelector>

Implementations§

impl RowSelection

pub fn from_filters(filters: &[BooleanArray]) -> Self

Creates a RowSelection from a slice of BooleanArray.

§Panic

Panics if any of the BooleanArray contain nulls.
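
To illustrate the idea of turning boolean filters into runs, here is a std-only sketch. It assumes the filters are treated as consecutive batches of one long mask; `runs_from_masks` is a hypothetical helper that works on `Vec<bool>` rather than BooleanArray, and each output pair is `(selected, row_count)`.

```rust
/// Run-length encode the concatenation of several boolean masks.
/// `true` means the row is selected, `false` means it is skipped.
/// Hypothetical helper: plain `Vec<bool>` stands in for `BooleanArray`.
fn runs_from_masks(masks: &[Vec<bool>]) -> Vec<(bool, usize)> {
    let mut runs: Vec<(bool, usize)> = Vec::new();
    for &keep in masks.iter().flatten() {
        match runs.last_mut() {
            // Same value as the current run: grow it by one row.
            Some((last_keep, count)) if *last_keep == keep => *count += 1,
            // Value changed, or first row: open a new run.
            _ => runs.push((keep, 1)),
        }
    }
    runs
}
```

A run that crosses a batch boundary stays a single run, because the encoder never resets between masks.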

pub fn from_consecutive_ranges<I: Iterator<Item = Range<usize>>>(ranges: I, total_rows: usize) -> Self

Creates a RowSelection from an iterator of consecutive row ranges to keep. Rows outside the ranges, up to total_rows, are skipped.
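
The mapping from ranges to selectors can be sketched with std types only. This is an illustrative model, not the crate's code: `runs_from_ranges` is a hypothetical name, output pairs are `(selected, row_count)`, and the sketch assumes the ranges are sorted, non-overlapping, and within `total_rows`.

```rust
use std::ops::Range;

/// Convert sorted, non-overlapping row ranges into (selected, row_count) runs.
/// Gaps between ranges and the tail up to `total_rows` become skipped runs.
fn runs_from_ranges(ranges: &[Range<usize>], total_rows: usize) -> Vec<(bool, usize)> {
    let mut runs = Vec::new();
    let mut cursor = 0; // first row not yet covered by a run
    for r in ranges.iter().filter(|r| !r.is_empty()) {
        assert!(r.start >= cursor && r.end <= total_rows, "ranges must be sorted and in bounds");
        if r.start > cursor {
            runs.push((false, r.start - cursor));
        }
        match runs.last_mut() {
            // Directly follows another selected range: extend that run.
            Some((true, n)) => *n += r.len(),
            _ => runs.push((true, r.len())),
        }
        cursor = r.end;
    }
    if cursor < total_rows {
        runs.push((false, total_rows - cursor));
    }
    runs
}
```

With the ranges `5..10` and `10..15` and 20 total rows, the sketch produces skip 5, select 10, skip 5.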

pub fn scan_ranges(&self, page_locations: &[PageLocation]) -> Vec<Range<usize>>

Given an offset index, returns the byte ranges of all data pages selected by self.

This is useful for determining which byte ranges to fetch from the underlying storage.

Note: this method makes no effort to combine consecutive ranges or to coalesce ranges that are close together. That is instead delegated to the IO subsystem to optimise, e.g. ObjectStore::get_ranges.
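
A rough model of the page lookup, using std types only. `Page` is a hypothetical stand-in for PageLocation with invented field names, the selection is given as `(selected, row_count)` pairs, and the last page is assumed to extend to the end of the row group. It is a sketch of the idea, not the crate's algorithm.

```rust
use std::ops::Range;

/// Hypothetical stand-in for one entry of an offset index.
struct Page {
    first_row: usize, // index of the first row stored in this page
    offset: usize,    // byte offset of the page within the file
    size: usize,      // compressed size of the page in bytes
}

/// Byte range of every page that holds at least one selected row.
/// `pages` must be sorted by `first_row`.
fn pages_to_fetch(runs: &[(bool, usize)], pages: &[Page]) -> Vec<Range<usize>> {
    // Expand the runs into the row ranges that are selected.
    let mut selected: Vec<Range<usize>> = Vec::new();
    let mut row = 0;
    for &(keep, count) in runs {
        if keep && count > 0 {
            selected.push(row..row + count);
        }
        row += count;
    }

    pages
        .iter()
        .enumerate()
        .filter(|&(i, page)| {
            // A page ends where the next one starts; the last page is open-ended.
            let end = pages.get(i + 1).map_or(usize::MAX, |next| next.first_row);
            selected.iter().any(|s| s.start < end && page.first_row < s.end)
        })
        .map(|(_, page)| page.offset..page.offset + page.size)
        .collect()
}
```

Like the method described above, the sketch returns one range per page and leaves adjacent ranges unmerged.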

pub fn split_off(&mut self, row_count: usize) -> Self

Splits off the first row_count rows from this RowSelection.
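
A std-only sketch of the splitting logic, under the assumption that row_count counts skipped and selected rows alike. The free function `split_off` and the `(selected, row_count)` pairs are illustrative, not the crate's types.

```rust
/// Remove the first `row_count` rows from `runs` and return them.
/// A run that straddles the boundary is cut in two.
fn split_off(runs: &mut Vec<(bool, usize)>, row_count: usize) -> Vec<(bool, usize)> {
    let mut head = Vec::new();
    let mut remaining = row_count;
    let mut consumed = 0; // number of runs moved entirely into `head`
    for run in runs.iter_mut() {
        if remaining == 0 {
            break;
        }
        if run.1 <= remaining {
            // The whole run belongs to the head.
            remaining -= run.1;
            head.push(*run);
            consumed += 1;
        } else {
            // Only part of the run belongs to the head; shrink what stays.
            head.push((run.0, remaining));
            run.1 -= remaining;
            remaining = 0;
        }
    }
    runs.drain(..consumed);
    head
}
```

Splitting 8 rows off skip 5, select 10, skip 5 returns skip 5, select 3 and leaves select 7, skip 5 behind.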

pub fn and_then(&self, other: &Self) -> Self

Returns a RowSelection representing rows that are selected in both input RowSelections.

This is equivalent to the logical AND / conjunction of the two selections.

§Example

If N means the row is not selected, and Y means it is selected:

self:     NNNNNNNNNNNNYYYYYYYYYYYYYYYYYYYYYYNNNYYYYY
other:                YYYYYNNNNYYYYYYYYYYYYY   YYNNN

returned: NNNNNNNNNNNNYYYYYNNNNYYYYYYYYYYYYYNNNYYNNN

§Panics

Panics if other does not have a length equal to the number of rows selected by this RowSelection.
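
The Y/N diagram can be reproduced with a small string model. This sketch is illustrative only: the free function `and_then` works on Y/N strings instead of selectors, with one character of `second` for every Y in `first`.

```rust
/// Apply `second` to the rows that `first` selects.
/// Both arguments are strings of 'Y' (selected) and 'N' (not selected).
fn and_then(first: &str, second: &str) -> String {
    let selected = first.chars().filter(|&c| c == 'Y').count();
    // Mirrors the documented panic: `second` must cover exactly the selected rows.
    assert_eq!(selected, second.chars().count(), "length mismatch");
    let mut inner = second.chars();
    first
        .chars()
        // Rows skipped by `first` stay skipped; selected rows take the next
        // decision from `second`.
        .map(|c| if c == 'Y' { inner.next().unwrap() } else { 'N' })
        .collect()
}
```

For instance, `and_then("NNYYYNY", "YNYN")` yields `"NNYNYNN"`: only the four rows selected by the first mask consume a character of the second.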

pub fn intersection(&self, other: &Self) -> Self

Computes the intersection of two RowSelections. For example:

self:     NNYYYYNNYYNYN
other:    NYNNNNNNY

returned: NNNNNNNNYYNYN

pub fn union(&self, other: &Self) -> Self

Computes the union of two RowSelections. For example:

self:     NNYYYYNNYYNYN
other:    NYNNNNNNN

returned: NYYYYYNNYYNYN
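
Both the intersection and the union examples can be checked with a position-by-position model on Y/N strings. The helper names are hypothetical, and the handling of unequal lengths (the tail of the longer mask is kept unchanged) is an assumption read off the two examples, not a documented guarantee.

```rust
/// Combine two Y/N masks position by position with `op`.
/// Assumption: past the end of the shorter mask, the longer mask's
/// characters are copied through unchanged.
fn combine(a: &str, b: &str, op: fn(bool, bool) -> bool) -> String {
    let (a, b): (Vec<char>, Vec<char>) = (a.chars().collect(), b.chars().collect());
    (0..a.len().max(b.len()))
        .map(|i| match (a.get(i), b.get(i)) {
            (Some(&x), Some(&y)) => if op(x == 'Y', y == 'Y') { 'Y' } else { 'N' },
            (Some(&x), None) | (None, Some(&x)) => x,
            (None, None) => unreachable!("index is below the longer length"),
        })
        .collect()
}

/// Rows selected by both masks.
fn intersection(a: &str, b: &str) -> String {
    combine(a, b, |x, y| x && y)
}

/// Rows selected by either mask.
fn union(a: &str, b: &str) -> String {
    combine(a, b, |x, y| x || y)
}
```

Unlike and_then, both operations line the two masks up by absolute row position.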

pub fn selects_any(&self) -> bool

Returns true if this RowSelection selects any rows.

pub(crate) fn trim(self) -> Self

Trims this RowSelection, removing any trailing skips.

pub(crate) fn offset(self, offset: usize) -> Self

Applies an offset to this RowSelection, skipping the first offset selected rows.

pub(crate) fn limit(self, limit: usize) -> Self

Limits this RowSelection to select at most limit rows.
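
The effect of offset and limit on which rows end up selected can be modelled on Y/N strings. `apply_offset` and `apply_limit` are hypothetical helpers; the sketch marks surplus rows as N and makes no claim about how the crate represents the result (for example, whether trailing selectors are dropped).

```rust
/// Turn the first `n` selected rows into skipped rows.
fn apply_offset(mask: &str, n: usize) -> String {
    let mut seen = 0; // selected rows encountered so far
    mask.chars()
        .map(|c| {
            if c == 'Y' {
                seen += 1;
                if seen <= n {
                    return 'N';
                }
            }
            c
        })
        .collect()
}

/// Keep only the first `n` selected rows; later ones become skipped.
fn apply_limit(mask: &str, n: usize) -> String {
    let mut kept = 0; // selected rows kept so far
    mask.chars()
        .map(|c| {
            if c == 'Y' && kept < n {
                kept += 1;
                'Y'
            } else {
                'N'
            }
        })
        .collect()
}
```

Both helpers count selected rows only, so skipped rows never use up the offset or the limit.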

pub fn iter(&self) -> impl Iterator<Item = &RowSelector>

Returns an iterator over the RowSelectors for this RowSelection.

pub fn row_count(&self) -> usize

Returns the number of selected rows.

pub fn skipped_row_count(&self) -> usize

Returns the number of de-selected rows.
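
The counting methods reduce to sums over the selectors. A std-only sketch on `(selected, row_count)` pairs follows; the free functions mirror the method names for illustration and are not the crate's code.

```rust
/// Total number of rows in selected runs.
fn row_count(runs: &[(bool, usize)]) -> usize {
    runs.iter().filter(|r| r.0).map(|r| r.1).sum()
}

/// Total number of rows in skipped runs.
fn skipped_row_count(runs: &[(bool, usize)]) -> usize {
    runs.iter().filter(|r| !r.0).map(|r| r.1).sum()
}

/// True if at least one row is selected.
fn selects_any(runs: &[(bool, usize)]) -> bool {
    runs.iter().any(|r| r.0 && r.1 > 0)
}
```

For skip 5, select 10, skip 5 the sketch gives 10 selected rows and 10 skipped rows, and the two counts together cover all 20 rows described by the runs.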

Trait Implementations§

impl Clone for RowSelection

fn clone(&self) -> RowSelection

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for RowSelection

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for RowSelection

fn default() -> RowSelection

Returns the “default value” for a type.

impl From<RowSelection> for Vec<RowSelector>

fn from(r: RowSelection) -> Self

Converts to this type from the input type.

impl From<RowSelection> for VecDeque<RowSelector>

fn from(r: RowSelection) -> Self

Converts to this type from the input type.

impl From<Vec<RowSelector>> for RowSelection

fn from(selectors: Vec<RowSelector>) -> Self

Converts to this type from the input type.

impl FromIterator<RowSelector> for RowSelection

fn from_iter<T: IntoIterator<Item = RowSelector>>(iter: T) -> Self

Creates a value from an iterator.

impl PartialEq for RowSelection

fn eq(&self, other: &RowSelection) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl Eq for RowSelection

impl StructuralPartialEq for RowSelection

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

fn equivalent(&self, key: &K) -> bool

Checks if this value is equivalent to the given key.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> MaybeSendSync for T