enum ParquetDecoderState {
ReadingRowGroup {
remaining_row_groups: Box<RemainingRowGroups>,
},
DecodingRowGroup {
record_batch_reader: Box<ParquetRecordBatchReader>,
remaining_row_groups: Box<RemainingRowGroups>,
},
Finished,
}
Internal state machine for the ParquetPushDecoder
Variants
ReadingRowGroup
Waiting for data needed to decode the next RowGroup
Fields
remaining_row_groups: Box<RemainingRowGroups>
DecodingRowGroup
The decoder is actively decoding a RowGroup
Fields
record_batch_reader: Box<ParquetRecordBatchReader>
Current active reader
remaining_row_groups: Box<RemainingRowGroups>
Finished
The decoder has finished processing all data
Implementations
impl ParquetDecoderState
fn try_next_reader(self) -> Result<(Self, DecodeResult<ParquetRecordBatchReader>), ParquetError>
If actively reading a RowGroup, return the currently active ParquetRecordBatchReader and advance to the next group.
fn try_next_batch(self) -> Result<(Self, DecodeResult<RecordBatch>), ParquetError>
Current state -> next state + output.
This function is called to get the next RecordBatch.
This structure is used to reduce the indentation level of the main loop in try_build.
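The "current state -> next state + output" shape can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the crate's implementation: `State` and `Output` are invented for illustration, counters replace the real `ParquetRecordBatchReader` and `RemainingRowGroups` fields, the error type is a plain `String`, and the batch size of 1024 rows is arbitrary.

```rust
/// Hypothetical stand-in for `DecodeResult<RecordBatch>`.
#[derive(Debug, PartialEq)]
enum Output {
    /// More input must be pushed before a batch can be produced.
    NeedsData,
    /// A decoded batch, represented here only by its row count.
    Batch(usize),
    /// Nothing is left to decode.
    Finished,
}

/// Simplified mirror of the three decoder states (assumed fields).
#[derive(Debug, PartialEq)]
enum State {
    ReadingRowGroup { groups_left: usize },
    DecodingRowGroup { batches_left: usize, groups_left: usize },
    Finished,
}

impl State {
    /// Consume the current state and return the next state plus an output.
    /// This sketch never fails; the `Result` only mirrors the documented shape.
    fn try_next_batch(self) -> Result<(Self, Output), String> {
        let step = match self {
            // Still waiting for input: stay put and ask for data.
            State::ReadingRowGroup { groups_left } => {
                (State::ReadingRowGroup { groups_left }, Output::NeedsData)
            }
            // Current group exhausted and no groups remain.
            State::DecodingRowGroup { batches_left: 0, groups_left: 0 } => {
                (State::Finished, Output::Finished)
            }
            // Current group exhausted: go back to waiting for the next one.
            State::DecodingRowGroup { batches_left: 0, groups_left } => {
                (State::ReadingRowGroup { groups_left }, Output::NeedsData)
            }
            // Emit one batch and remain in the decoding state.
            State::DecodingRowGroup { batches_left, groups_left } => (
                State::DecodingRowGroup { batches_left: batches_left - 1, groups_left },
                Output::Batch(1024),
            ),
            State::Finished => (State::Finished, Output::Finished),
        };
        Ok(step)
    }
}
```

Because the method takes `self` by value, a caller can rebind the state on every call (`let (state, out) = state.try_next_batch()?;`) without nesting a match on the state inside its own loop.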
fn transition(self) -> Result<(Self, DecodeResult<()>), ParquetError>
Transition to the next state that has a reader (that is, a state in which data can be produced), unless the end of the stream has been reached.
This function is called in a loop until the decoder is ready to return data (has the required pages buffered) or is finished.
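The looping behavior can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: `Phase`, `Progress`, `PendingGroup`, and `drive` are hypothetical names, and the reason for looping (skipping row groups with no selected rows) is an assumed example of why a single transition might not be enough.

```rust
use std::collections::VecDeque;

/// Hypothetical description of one pending row group.
#[derive(Debug, PartialEq)]
struct PendingGroup {
    selected_rows: usize,
    buffered: bool,
}

/// Simplified decoder states (assumed fields).
#[derive(Debug, PartialEq)]
enum Phase {
    Reading(VecDeque<PendingGroup>),
    Decoding { rows: usize, rest: VecDeque<PendingGroup> },
    Finished,
}

/// Hypothetical outcome of a single transition attempt.
#[derive(Debug, PartialEq)]
enum Progress {
    Skipped,
    Ready,
    NeedsData,
    Finished,
}

impl Phase {
    /// Perform one transition step, consuming the current state.
    fn transition(self) -> (Self, Progress) {
        match self {
            Phase::Reading(mut groups) => match groups.pop_front() {
                None => (Phase::Finished, Progress::Finished),
                // Assumed rule: a group with nothing selected is skipped.
                Some(g) if g.selected_rows == 0 => (Phase::Reading(groups), Progress::Skipped),
                // Required bytes are missing: put the group back and wait.
                Some(g) if !g.buffered => {
                    groups.push_front(g);
                    (Phase::Reading(groups), Progress::NeedsData)
                }
                Some(g) => (
                    Phase::Decoding { rows: g.selected_rows, rest: groups },
                    Progress::Ready,
                ),
            },
            other @ Phase::Decoding { .. } => (other, Progress::Ready),
            Phase::Finished => (Phase::Finished, Progress::Finished),
        }
    }
}

/// Call `transition` until data is ready, more input is needed, or the stream ends.
fn drive(mut phase: Phase) -> (Phase, Progress) {
    loop {
        let (next, progress) = phase.transition();
        phase = next;
        if progress != Progress::Skipped {
            return (phase, progress);
        }
    }
}
```

In this sketch the loop lives in `drive`, so each call to `transition` stays a single, flat match on the current state.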
pub fn push_data(self, ranges: Vec<Range<u64>>, data: Vec<Bytes>) -> Result<Self, ParquetError>
Push data and transition the state if needed.
The pushed data should correspond to the data ranges requested by the decoder.
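A minimal sketch of how pushed ranges and buffers might be paired up is shown here. It is not the crate's implementation: `PushBuffer` and `is_ready` are hypothetical, `Vec<u8>` stands in for `Bytes`, `String` stands in for `ParquetError`, and the length validation is an assumption about what "should correspond" could mean in practice.

```rust
use std::ops::Range;

/// Hypothetical buffer that tracks outstanding requests and received bytes.
#[derive(Debug, Default)]
struct PushBuffer {
    /// Ranges that were requested and have not yet been received.
    requested: Vec<Range<u64>>,
    /// Received data, keyed by the byte range it covers.
    chunks: Vec<(Range<u64>, Vec<u8>)>,
}

impl PushBuffer {
    /// Store each (range, data) pair and mark the matching request as satisfied.
    fn push_data(mut self, ranges: Vec<Range<u64>>, data: Vec<Vec<u8>>) -> Result<Self, String> {
        if ranges.len() != data.len() {
            return Err(format!("{} ranges but {} buffers", ranges.len(), data.len()));
        }
        for (range, bytes) in ranges.into_iter().zip(data) {
            // Assumed check: each buffer must be exactly as long as its range.
            let expected = range.end.saturating_sub(range.start);
            if expected != bytes.len() as u64 {
                return Err(format!(
                    "range {:?} expects {} bytes, got {}",
                    range,
                    expected,
                    bytes.len()
                ));
            }
            self.requested.retain(|r| *r != range);
            self.chunks.push((range, bytes));
        }
        Ok(self)
    }

    /// True once every requested range has been supplied.
    fn is_ready(&self) -> bool {
        self.requested.is_empty()
    }
}
```

Taking `self` by value and returning `Result<Self, _>` matches the documented signature shape: the caller gets the updated state back on success.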
fn buffered_bytes(&self) -> u64
How many bytes are currently buffered in the decoder?
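One plausible way to answer this question is to delegate to whatever holds the buffers in each state. The sketch below assumes that approach; `Remaining` and `DecoderState` are hypothetical stand-ins, and the source does not say how the real method computes its result.

```rust
/// Hypothetical stand-in for `RemainingRowGroups`, holding raw buffers.
struct Remaining {
    buffers: Vec<Vec<u8>>,
}

impl Remaining {
    /// Sum the lengths of all buffers currently held.
    fn buffered_bytes(&self) -> u64 {
        self.buffers.iter().map(|b| b.len() as u64).sum()
    }
}

/// Simplified mirror of the three decoder states (assumed fields).
enum DecoderState {
    ReadingRowGroup { remaining: Remaining },
    DecodingRowGroup { remaining: Remaining },
    Finished,
}

impl DecoderState {
    /// Both active states carry remaining row groups; `Finished` holds nothing.
    fn buffered_bytes(&self) -> u64 {
        match self {
            DecoderState::ReadingRowGroup { remaining }
            | DecoderState::DecodingRowGroup { remaining } => remaining.buffered_bytes(),
            DecoderState::Finished => 0,
        }
    }
}
```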