pub struct ParquetPushDecoder {
state: ParquetDecoderState,
}Expand description
A push based Parquet Decoder
See ParquetPushDecoderBuilder for an example of how to build and use the decoder.
ParquetPushDecoder is a low level API for decoding Parquet data without an
underlying reader for performing IO, and thus offers fine grained control
over how data is fetched and decoded.
When more data is needed to make progress, instead of reading data directly
from a reader, the decoder returns DecodeResult indicating what ranges
are needed. Once the caller provides the requested ranges via
Self::push_ranges, they try to decode again by calling
Self::try_decode.
The decoder’s internal state tracks what has been already decoded and what is needed next.
Fields§
§state: ParquetDecoderStateThe inner state.
This state is consumed on every transition and a new state is produced so the Rust compiler can ensure that the state is always valid and transitions are not missed.
Implementations§
Source§impl ParquetPushDecoder
impl ParquetPushDecoder
Sourcepub fn try_decode(&mut self) -> Result<DecodeResult<RecordBatch>, ParquetError>
pub fn try_decode(&mut self) -> Result<DecodeResult<RecordBatch>, ParquetError>
Attempt to decode the next batch of data, or return what data is needed
The the decoder communicates the next state with a DecodeResult
See full example in ParquetPushDecoderBuilder
use parquet::DecodeResult;
let mut decoder = get_decoder();
loop {
match decoder.try_decode().unwrap() {
DecodeResult::NeedsData(ranges) => {
// The decoder needs more data. Fetch the data for the given ranges
// call decoder.push_ranges(ranges, data) and call again
push_data(&mut decoder, ranges);
}
DecodeResult::Data(batch) => {
// Successfully decoded the next batch of data
println!("Got batch with {} rows", batch.num_rows());
}
DecodeResult::Finished => {
// The decoder has finished decoding all data
break;
}
}
}Sourcepub fn push_range(
&mut self,
range: Range<u64>,
data: Bytes,
) -> Result<(), ParquetError>
pub fn push_range( &mut self, range: Range<u64>, data: Bytes, ) -> Result<(), ParquetError>
Push data into the decoder for processing
This is a convenience wrapper around Self::push_ranges for pushing a
single range of data.
Note this can be the entire file or just a part of it. If it is part of the file, the ranges should correspond to the data ranges requested by the decoder.
See example in ParquetPushDecoderBuilder
Sourcepub fn push_ranges(
&mut self,
ranges: Vec<Range<u64>>,
data: Vec<Bytes>,
) -> Result<(), ParquetError>
pub fn push_ranges( &mut self, ranges: Vec<Range<u64>>, data: Vec<Bytes>, ) -> Result<(), ParquetError>
Push data into the decoder for processing
This should correspond to the data ranges requested by the decoder
Sourcepub fn buffered_bytes(&self) -> u64
pub fn buffered_bytes(&self) -> u64
Returns the total number of buffered bytes in the decoder
This is the sum of the size of all [Bytes] that has been pushed to the
decoder but not yet consumed.
Note that this does not include any overhead of the internal data
structures and that since [Bytes] are ref counted memory, this may not
reflect additional memory usage.
This can be used to monitor memory usage of the decoder.
Trait Implementations§
Auto Trait Implementations§
impl Freeze for ParquetPushDecoder
impl !RefUnwindSafe for ParquetPushDecoder
impl Send for ParquetPushDecoder
impl !Sync for ParquetPushDecoder
impl Unpin for ParquetPushDecoder
impl !UnwindSafe for ParquetPushDecoder
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read more