pub struct SerializedRowGroupWriter<'a, W: Write> {
descr: SchemaDescPtr,
props: WriterPropertiesPtr,
buf: &'a mut TrackedWrite<W>,
total_rows_written: Option<u64>,
total_bytes_written: u64,
total_uncompressed_bytes: i64,
column_index: usize,
row_group_metadata: Option<RowGroupMetaDataPtr>,
column_chunks: Vec<ColumnChunkMetaData>,
bloom_filters: Vec<Option<Sbbf>>,
column_indexes: Vec<Option<ColumnIndex>>,
offset_indexes: Vec<Option<OffsetIndex>>,
row_group_index: i16,
file_offset: i64,
on_close: Option<OnCloseRowGroup<'a, W>>,
}
Parquet row group writer API. Provides methods to access column writers in an iterator-like fashion; the order is guaranteed to match the order of schema leaves (column descriptors).
All columns should be written sequentially; the main workflow is:
- Request the next column using the next_column method. This returns None if no more columns are available to write.
- Once done writing a column, close the column writer with close.
- Once all columns have been written, close the row group writer with the close method. The close method returns row group metadata and is a no-op on an already closed row group.
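The call order described above can be sketched with a small std-only model. Every type here (ToyRowGroupWriter, ToyColumnWriter) is a hypothetical stand-in invented for illustration, not the parquet crate's API, and the rule that a column writer must be closed before the next one is requested is an assumption based on the workflow above.

```rust
// Hypothetical stand-in types that model the call order only.
// This is NOT the parquet crate's API and writes no Parquet data.

/// Toy column writer; it mutably borrows the row group writer while open.
struct ToyColumnWriter<'a> {
    name: &'a str,
    rows: usize,
    closed: &'a mut Vec<(String, usize)>,
}

impl<'a> ToyColumnWriter<'a> {
    /// Pretend to write a batch by counting its values.
    fn write_batch(&mut self, values: &[i32]) {
        self.rows += values.len();
    }

    /// Closing consumes the writer and records (column name, row count).
    fn close(self) {
        self.closed.push((self.name.to_string(), self.rows));
    }
}

/// Toy row group writer that hands out one column writer at a time.
struct ToyRowGroupWriter {
    columns: Vec<String>,
    column_index: usize,
    closed: Vec<(String, usize)>,
}

impl ToyRowGroupWriter {
    fn new(columns: &[&str]) -> Self {
        Self {
            columns: columns.iter().map(|c| c.to_string()).collect(),
            column_index: 0,
            closed: Vec::new(),
        }
    }

    /// Assumed rule: every writer handed out so far must have been closed.
    fn assert_previous_writer_closed(&self) -> Result<(), String> {
        if self.column_index != self.closed.len() {
            return Err("previous column writer was not closed".to_string());
        }
        Ok(())
    }

    /// Returns the next column writer in schema order, or None when done.
    fn next_column(&mut self) -> Result<Option<ToyColumnWriter<'_>>, String> {
        self.assert_previous_writer_closed()?;
        if self.column_index >= self.columns.len() {
            return Ok(None);
        }
        let index = self.column_index;
        self.column_index += 1;
        Ok(Some(ToyColumnWriter {
            name: self.columns[index].as_str(),
            rows: 0,
            closed: &mut self.closed,
        }))
    }

    /// Consumes the row group writer and returns its toy "metadata".
    fn close(self) -> Result<Vec<(String, usize)>, String> {
        self.assert_previous_writer_closed()?;
        Ok(self.closed)
    }
}

/// Drives the workflow: next column, write, close column, close row group.
fn write_toy_row_group() -> Result<Vec<(String, usize)>, String> {
    let mut row_group = ToyRowGroupWriter::new(&["id", "score"]);
    while let Some(mut column) = row_group.next_column()? {
        column.write_batch(&[1, 2, 3]);
        column.close();
    }
    row_group.close()
}
```

Because each toy column writer holds a mutable borrow of the row group writer, the compiler already prevents two column writers from being open at once; the runtime check only catches a writer that was dropped without being closed.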
Fields§
§descr: SchemaDescPtr
§props: WriterPropertiesPtr
§buf: &'a mut TrackedWrite<W>
§total_rows_written: Option<u64>
§total_bytes_written: u64
§total_uncompressed_bytes: i64
§column_index: usize
§row_group_metadata: Option<RowGroupMetaDataPtr>
§column_chunks: Vec<ColumnChunkMetaData>
§bloom_filters: Vec<Option<Sbbf>>
§column_indexes: Vec<Option<ColumnIndex>>
§offset_indexes: Vec<Option<OffsetIndex>>
§row_group_index: i16
§file_offset: i64
§on_close: Option<OnCloseRowGroup<'a, W>>
Implementations§
impl<'a, W: Write + Send> SerializedRowGroupWriter<'a, W>
pub fn new(
    schema_descr: SchemaDescPtr,
    properties: WriterPropertiesPtr,
    buf: &'a mut TrackedWrite<W>,
    row_group_index: i16,
    on_close: Option<OnCloseRowGroup<'a, W>>,
) -> Self
Creates a new SerializedRowGroupWriter with:
- schema_descr - the schema to write
- properties - writer properties
- buf - the buffer to write data to
- row_group_index - row group index in this parquet file.
- file_offset - file offset of this row group in this parquet file.
- on_close - an optional callback that will be invoked on Self::close
fn next_column_desc(&mut self) -> Option<ColumnDescPtr>
Advances self.column_index, returning the next ColumnDescPtr, if any.
fn get_on_close(&mut self) -> (&mut TrackedWrite<W>, OnCloseColumnChunk<'_>)
Returns OnCloseColumnChunk for the next writer.
pub(crate) fn next_column_with_factory<'b, F, C>(
    &'b mut self,
    factory: F,
) -> Result<Option<C>>
where
    F: FnOnce(ColumnDescPtr, WriterPropertiesPtr, Box<dyn PageWriter + 'b>, OnCloseColumnChunk<'b>) -> Result<C>,
Returns the next column writer, if available, using the factory function; otherwise returns None.
pub fn next_column(&mut self) -> Result<Option<SerializedColumnWriter<'_>>>
Returns the next column writer, if available; otherwise returns None.
Returns Err in case of any IO error or Thrift error, or if the row group writer has already been closed.
pub fn append_column<R: ChunkReader>(
    &mut self,
    reader: &R,
    close: ColumnCloseResult,
) -> Result<()>
Append an encoded column chunk from another source without decoding it.
This can be used for efficiently concatenating or projecting parquet data, or encoding parquet data to temporary in-memory buffers.
See Self::next_column for writing data that isn’t already encoded.
pub fn close(self) -> Result<RowGroupMetaDataPtr>
Closes this row group writer and returns row group metadata.
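The combination of a close method that takes self by value and an optional on_close callback can be sketched as follows. This is a minimal std-only model: ToyWriter, ToyOnClose, and the callback's u64 argument are hypothetical and chosen for illustration; they do not reflect the actual OnCloseRowGroup signature.

```rust
// Hypothetical stand-in types; the callback signature here is an assumption
// chosen for illustration and is NOT the parquet crate's OnCloseRowGroup.

/// Callback that receives the final row count when the writer is closed.
type ToyOnClose<'a> = Box<dyn FnOnce(u64) -> Result<(), String> + 'a>;

struct ToyWriter<'a> {
    rows_written: u64,
    on_close: Option<ToyOnClose<'a>>,
}

impl<'a> ToyWriter<'a> {
    /// Taking self by value means a closed writer cannot be used again.
    fn close(mut self) -> Result<u64, String> {
        // Run the callback at most once, if one was supplied.
        if let Some(callback) = self.on_close.take() {
            callback(self.rows_written)?;
        }
        Ok(self.rows_written)
    }
}

/// Closes a writer and returns the row counts its callback observed.
fn close_and_record(rows_written: u64) -> Result<Vec<u64>, String> {
    let mut observed = Vec::new();
    let writer = ToyWriter {
        rows_written,
        on_close: Some(Box::new(|rows: u64| -> Result<(), String> {
            observed.push(rows);
            Ok(())
        })),
    };
    writer.close()?;
    Ok(observed)
}
```

In this sketch the callback lets the code that created the writer learn about the close without the writer holding a reference back to its owner beyond the closure's captures.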
fn assert_previous_writer_closed(&self) -> Result<()>
Auto Trait Implementations§
impl<'a, W> Freeze for SerializedRowGroupWriter<'a, W>
impl<'a, W> !RefUnwindSafe for SerializedRowGroupWriter<'a, W>
impl<'a, W> Send for SerializedRowGroupWriter<'a, W>
where
    W: Send,
impl<'a, W> !Sync for SerializedRowGroupWriter<'a, W>
impl<'a, W> Unpin for SerializedRowGroupWriter<'a, W>
impl<'a, W> !UnwindSafe for SerializedRowGroupWriter<'a, W>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.