parquet::arrow::async_reader

Struct ParquetObjectReader

Source
pub struct ParquetObjectReader {
    store: Arc<dyn ObjectStore>,
    meta: ObjectMeta,
    metadata_size_hint: Option<usize>,
    preload_column_index: bool,
    preload_offset_index: bool,
    runtime: Option<Handle>,
}

Reads Parquet files in object storage using [ObjectStore].

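use std::io::stdout;
use std::sync::Arc;
use object_store::azure::MicrosoftAzureBuilder;
use object_store::path::Path;
use object_store::ObjectStore;
use parquet::arrow::async_reader::ParquetObjectReader;
use parquet::arrow::ParquetRecordBatchStreamBuilder;
use parquet::schema::printer::print_parquet_metadata;

// The statements below use `.await` and must run inside an async context.
// Populate configuration from environment
let storage_container = Arc::new(MicrosoftAzureBuilder::from_env().build().unwrap());
let location = Path::from("path/to/blob.parquet");
let meta = storage_container.head(&location).await.unwrap();
println!("Found Blob with {}B at {}", meta.size, meta.location);

// Show Parquet metadata
let reader = ParquetObjectReader::new(storage_container, meta);
let builder = ParquetRecordBatchStreamBuilder::new(reader).await.unwrap();
print_parquet_metadata(&mut stdout(), builder.metadata());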

Fields§

§store: Arc<dyn ObjectStore>
§meta: ObjectMeta
§metadata_size_hint: Option<usize>
§preload_column_index: bool
§preload_offset_index: bool
§runtime: Option<Handle>

Implementations§

Source§

impl ParquetObjectReader

Source

pub fn new(store: Arc<dyn ObjectStore>, meta: ObjectMeta) -> Self

Creates a new ParquetObjectReader for the provided [ObjectStore] and [ObjectMeta].

[ObjectMeta] can be obtained using [ObjectStore::list] or [ObjectStore::head].

Source

pub fn with_footer_size_hint(self, hint: usize) -> Self

Provide a hint as to the size of the Parquet file’s footer; see fetch_parquet_metadata.
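To illustrate why a footer size hint can help, here is a minimal sketch of the fetch planning it enables. It relies on the Parquet file layout, in which the last 8 bytes are a 4-byte little-endian metadata length followed by the magic bytes PAR1. The function names (initial_fetch, metadata_len, second_fetch) are hypothetical and are not part of the parquet crate; the crate's own logic may differ, and encrypted footers (which use a different magic) are not handled here.

```rust
use std::ops::Range;

/// Fixed-size tail of a Parquet file: 4-byte little-endian metadata length + magic.
const TAIL_LEN: usize = 8;
const MAGIC: &[u8] = b"PAR1";

/// Range to request first: the last `hint` bytes (never fewer than the 8-byte tail),
/// clamped to the start of the file.
fn initial_fetch(file_size: usize, hint: Option<usize>) -> Range<usize> {
    let want = hint.unwrap_or(TAIL_LEN).max(TAIL_LEN);
    file_size.saturating_sub(want)..file_size
}

/// Decode the metadata length from the final 8 bytes of `suffix`.
fn metadata_len(suffix: &[u8]) -> Result<usize, String> {
    if suffix.len() < TAIL_LEN {
        return Err(format!("need at least {} bytes, got {}", TAIL_LEN, suffix.len()));
    }
    let tail = &suffix[suffix.len() - TAIL_LEN..];
    if &tail[4..] != MAGIC {
        return Err("missing PAR1 magic".to_string());
    }
    Ok(u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]) as usize)
}

/// Extra range still needed once the metadata length is known, or `None`
/// if the first fetch already covered the whole metadata block.
fn second_fetch(
    file_size: usize,
    first: &Range<usize>,
    metadata_len: usize,
) -> Result<Option<Range<usize>>, String> {
    let start = file_size
        .checked_sub(metadata_len + TAIL_LEN)
        .ok_or_else(|| "metadata length exceeds file size".to_string())?;
    Ok(if start < first.start { Some(start..first.start) } else { None })
}
```

With no hint, the first request covers only the 8-byte tail, so a second request is always needed. With a hint at least as large as the metadata plus the tail, a single request is enough.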

Source

pub fn with_preload_column_index(self, preload_column_index: bool) -> Self

Load the Column Index as part of Self::get_metadata

Source

pub fn with_preload_offset_index(self, preload_offset_index: bool) -> Self

Load the Offset Index as part of Self::get_metadata
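The with_* methods take self by value and return Self, so they can be chained. The sketch below shows that consuming-builder pattern on a hypothetical stand-in type, ReaderOptions, which mirrors only the two preload flags; it is not the crate's implementation.

```rust
/// Hypothetical stand-in mirroring the two preload flags of the reader.
#[derive(Debug, Clone, Default, PartialEq)]
struct ReaderOptions {
    preload_column_index: bool,
    preload_offset_index: bool,
}

impl ReaderOptions {
    /// Consume `self` and return a copy with the column index flag replaced.
    fn with_preload_column_index(self, preload_column_index: bool) -> Self {
        Self { preload_column_index, ..self }
    }

    /// Consume `self` and return a copy with the offset index flag replaced.
    fn with_preload_offset_index(self, preload_offset_index: bool) -> Self {
        Self { preload_offset_index, ..self }
    }
}

/// Example of chaining: request both page indexes.
fn both_indexes() -> ReaderOptions {
    ReaderOptions::default()
        .with_preload_column_index(true)
        .with_preload_offset_index(true)
}
```

Because each call returns the updated value, the order of the calls does not matter and any flag that is not set keeps its default.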

Source

pub fn with_runtime(self, handle: Handle) -> Self

Perform IO on the provided tokio runtime

Tokio is a cooperative scheduler and relies on tasks yielding in a timely manner to service IO. Running IO and CPU-bound tasks, such as Parquet decoding, on the same tokio runtime can therefore lead to degraded throughput, dropped connections, and other issues.
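The idea of keeping IO away from CPU-bound work can be illustrated without tokio. The sketch below is an analogy only: a dedicated std thread stands in for the IO runtime, an in-memory Vec<u8> stands in for the object store, and the names RangeRequest, spawn_io_worker, and get_bytes are hypothetical. The caller hands each byte-range request to the IO worker and waits for the reply, which leaves its own thread free of IO work.

```rust
use std::ops::Range;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// A byte-range request plus the channel on which the reply is sent.
struct RangeRequest {
    range: Range<usize>,
    reply: Sender<Option<Vec<u8>>>,
}

/// Start a dedicated IO thread that serves ranges of `data`.
/// The thread exits once every `Sender<RangeRequest>` has been dropped.
fn spawn_io_worker(data: Vec<u8>) -> Sender<RangeRequest> {
    let (requests, inbox) = channel::<RangeRequest>();
    thread::spawn(move || {
        for request in inbox {
            // `None` signals an out-of-bounds range.
            let bytes = data.get(request.range).map(|slice| slice.to_vec());
            // The requester may have gone away; ignore a failed send.
            let _ = request.reply.send(bytes);
        }
    });
    requests
}

/// Called from the CPU-bound side: forward the request and wait for the bytes.
fn get_bytes(io: &Sender<RangeRequest>, range: Range<usize>) -> Option<Vec<u8>> {
    let (reply, response) = channel();
    io.send(RangeRequest { range, reply }).ok()?;
    response.recv().ok().flatten()
}
```

In the real reader the hand-off is asynchronous rather than blocking, but the division of labour is the same: one executor services IO while another decodes.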

Source

fn spawn<F, O, E>(&self, f: F) -> BoxFuture<'_, Result<O>>
where F: for<'a> FnOnce(&'a Arc<dyn ObjectStore>, &'a Path) -> BoxFuture<'a, Result<O, E>> + Send + 'static, O: Send + 'static, E: Into<ParquetError> + Send + 'static,

Trait Implementations§

Source§

impl AsyncFileReader for ParquetObjectReader

Source§

fn get_bytes(&mut self, range: Range<usize>) -> BoxFuture<'_, Result<Bytes>>

Retrieve the bytes in range
Source§

fn get_byte_ranges( &mut self, ranges: Vec<Range<usize>>, ) -> BoxFuture<'_, Result<Vec<Bytes>>>
where Self: Send,

Retrieve multiple byte ranges. The default implementation will call get_bytes sequentially
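A default method that falls back to sequential get_bytes calls can be sketched as follows. This is a synchronous, simplified stand-in: the RangeReader trait and the InMemory type are hypothetical, and the real trait returns boxed futures rather than plain values.

```rust
use std::ops::Range;

/// Simplified, synchronous stand-in for a range-reading trait.
trait RangeReader {
    /// Fetch a single byte range.
    fn get_bytes(&mut self, range: Range<usize>) -> Result<Vec<u8>, String>;

    /// Default: fetch each range in turn, stopping at the first error.
    fn get_byte_ranges(&mut self, ranges: Vec<Range<usize>>) -> Result<Vec<Vec<u8>>, String> {
        let mut out = Vec::with_capacity(ranges.len());
        for range in ranges {
            out.push(self.get_bytes(range)?);
        }
        Ok(out)
    }
}

/// In-memory reader that counts how many single-range requests it served.
struct InMemory {
    data: Vec<u8>,
    requests: usize,
}

impl RangeReader for InMemory {
    fn get_bytes(&mut self, range: Range<usize>) -> Result<Vec<u8>, String> {
        self.requests += 1;
        self.data
            .get(range.clone())
            .map(|slice| slice.to_vec())
            .ok_or_else(|| format!("range {:?} out of bounds", range))
    }
}
```

With the default, N ranges cost N requests. An implementor backed by remote storage may prefer to override the method, for example to batch or coalesce nearby ranges.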
Source§

fn get_metadata(&mut self) -> BoxFuture<'_, Result<Arc<ParquetMetaData>>>

Provides asynchronous access to the ParquetMetaData of a parquet file, allowing fine-grained control over how metadata is sourced, in particular allowing for caching, pre-fetching, catalog metadata, etc…
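As one example of the control this allows, metadata can be cached so that it is loaded only once. The sketch below is synchronous and uses hypothetical names (FileMetadata, CachingMetadataSource); it shows only the caching idea, not the crate's types.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for parsed file metadata.
#[derive(Debug, PartialEq)]
struct FileMetadata {
    num_rows: u64,
}

/// Wraps a loader closure and remembers its first successful result.
struct CachingMetadataSource<F> {
    load: F,
    cached: Option<Arc<FileMetadata>>,
}

impl<F> CachingMetadataSource<F>
where
    F: FnMut() -> Result<FileMetadata, String>,
{
    fn new(load: F) -> Self {
        Self { load, cached: None }
    }

    /// Return the cached metadata, invoking the loader only on the first call
    /// (or again after a failed load, since errors are not cached).
    fn get_metadata(&mut self) -> Result<Arc<FileMetadata>, String> {
        if let Some(meta) = &self.cached {
            return Ok(Arc::clone(meta));
        }
        let meta = Arc::new((self.load)()?);
        self.cached = Some(Arc::clone(&meta));
        Ok(meta)
    }
}
```

Returning an Arc keeps repeated calls cheap: each caller gets a new handle to the same parsed metadata rather than a copy.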
Source§

impl Clone for ParquetObjectReader

Source§

fn clone(&self) -> ParquetObjectReader

Returns a copy of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for ParquetObjectReader

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<T> ErasedDestructor for T
where T: 'static,

§

impl<T> MaybeSendSync for T