pub struct SerializedRowGroupWriter<'a, W: Write> {
    descr: SchemaDescPtr,
    props: WriterPropertiesPtr,
    buf: &'a mut TrackedWrite<W>,
    total_rows_written: Option<u64>,
    total_bytes_written: u64,
    total_uncompressed_bytes: i64,
    column_index: usize,
    row_group_metadata: Option<RowGroupMetaDataPtr>,
    column_chunks: Vec<ColumnChunkMetaData>,
    bloom_filters: Vec<Option<Sbbf>>,
    column_indexes: Vec<Option<ColumnIndexMetaData>>,
    offset_indexes: Vec<Option<OffsetIndexMetaData>>,
    row_group_index: i16,
    file_offset: i64,
    on_close: Option<OnCloseRowGroup<'a, W>>,
    file_encryptor: Option<Arc<FileEncryptor>>,
}
Parquet row group writer API.
Provides methods to access column writers in an iterator-like fashion; the order is guaranteed to match the order of schema leaves (column descriptors).
All columns should be written sequentially; the main workflow is:
- Request the next column using the next_column method - this will return None if no more columns are available to write.
- Once done writing a column, close the column writer with close.
- Once all columns have been written, close the row group writer with the close method. The close method will return row group metadata and is a no-op on an already closed row group.
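The workflow above can be sketched with a small std-only model. Everything here (`ToyRowGroupWriter`, `ToyColumnWriter`, `write_toy_row_group`) is hypothetical and only mirrors the shape of the protocol; it is not the parquet crate's implementation. Because each column writer mutably borrows the row group writer, the borrow checker already prevents two columns from being open at once.

```rust
/// Toy stand-in for a row group writer; NOT the parquet crate's type.
pub struct ToyRowGroupWriter {
    columns: Vec<&'static str>,          // schema leaves, in order
    next: usize,                         // index of the next column to hand out
    closed_chunks: Vec<(String, usize)>, // (column name, values written)
}

/// Toy stand-in for a column writer that borrows its parent row group.
pub struct ToyColumnWriter<'a> {
    name: &'static str,
    values: usize,
    sink: &'a mut Vec<(String, usize)>,
}

impl ToyRowGroupWriter {
    pub fn new(columns: Vec<&'static str>) -> Self {
        Self { columns, next: 0, closed_chunks: Vec::new() }
    }

    /// Hands out columns in schema order; returns None when none are left.
    pub fn next_column(&mut self) -> Option<ToyColumnWriter<'_>> {
        let name = *self.columns.get(self.next)?;
        self.next += 1;
        Some(ToyColumnWriter { name, values: 0, sink: &mut self.closed_chunks })
    }

    /// Consumes the writer and returns the collected per-column "metadata".
    pub fn close(self) -> Vec<(String, usize)> {
        self.closed_chunks
    }
}

impl ToyColumnWriter<'_> {
    /// Pretends to write `n` values.
    pub fn write(&mut self, n: usize) {
        self.values += n;
    }

    /// Closing the column records its chunk in the parent row group.
    pub fn close(self) {
        let ToyColumnWriter { name, values, sink } = self;
        sink.push((name.to_string(), values));
    }
}

/// Request a column, write, close it, repeat, then close the row group.
pub fn write_toy_row_group() -> Vec<(String, usize)> {
    let mut rg = ToyRowGroupWriter::new(vec!["id", "name"]);
    while let Some(mut col) = rg.next_column() {
        col.write(3);
        col.close();
    }
    rg.close()
}
```

In this model, `close` takes `self` by value on both writers, so a closed writer cannot be used again.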
Fields§
§descr: SchemaDescPtr
§props: WriterPropertiesPtr
§buf: &'a mut TrackedWrite<W>
§total_rows_written: Option<u64>
§total_bytes_written: u64
§total_uncompressed_bytes: i64
§column_index: usize
§row_group_metadata: Option<RowGroupMetaDataPtr>
§column_chunks: Vec<ColumnChunkMetaData>
§bloom_filters: Vec<Option<Sbbf>>
§column_indexes: Vec<Option<ColumnIndexMetaData>>
§offset_indexes: Vec<Option<OffsetIndexMetaData>>
§row_group_index: i16
§file_offset: i64
§on_close: Option<OnCloseRowGroup<'a, W>>
§file_encryptor: Option<Arc<FileEncryptor>>
Implementations§
impl<'a, W: Write + Send> SerializedRowGroupWriter<'a, W>
pub fn new(
    schema_descr: SchemaDescPtr,
    properties: WriterPropertiesPtr,
    buf: &'a mut TrackedWrite<W>,
    row_group_index: i16,
    on_close: Option<OnCloseRowGroup<'a, W>>,
) -> Self
Creates a new SerializedRowGroupWriter with:
- schema_descr - the schema to write
- properties - writer properties
- buf - the buffer to write data to
- row_group_index - row group index in this parquet file
- on_close - an optional callback that will be invoked on Self::close
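The optional `on_close` callback can be illustrated with a simplified std-only model. The `OnClose` alias, `ToyWriter`, and `demo_on_close` below are hypothetical names, and the callback signature is deliberately reduced (a byte buffer and a row count); the crate's actual `OnCloseRowGroup` type is not reproduced here.

```rust
/// Hypothetical callback type: invoked once, with the sink and the row count.
pub type OnClose<'a> = Box<dyn FnOnce(&'a mut Vec<u8>, u64) -> Result<(), String> + 'a>;

/// Toy writer that borrows its output buffer and may carry a close callback.
pub struct ToyWriter<'a> {
    buf: &'a mut Vec<u8>,
    rows: u64,
    on_close: Option<OnClose<'a>>,
}

impl<'a> ToyWriter<'a> {
    pub fn new(buf: &'a mut Vec<u8>, on_close: Option<OnClose<'a>>) -> Self {
        Self { buf, rows: 0, on_close }
    }

    /// Pretends each row is a single byte.
    pub fn write_row(&mut self, byte: u8) {
        self.buf.push(byte);
        self.rows += 1;
    }

    /// Consumes the writer; runs the callback (if any) exactly once.
    pub fn close(self) -> Result<u64, String> {
        let rows = self.rows;
        if let Some(callback) = self.on_close {
            callback(self.buf, rows)?;
        }
        Ok(rows)
    }
}

/// Writes two rows, then lets the callback append the row count as a footer.
pub fn demo_on_close() -> Vec<u8> {
    let mut out = Vec::new();
    {
        let callback: OnClose<'_> =
            Box::new(|buf: &mut Vec<u8>, rows: u64| -> Result<(), String> {
                buf.push(rows as u8);
                Ok(())
            });
        let mut writer = ToyWriter::new(&mut out, Some(callback));
        writer.write_row(10);
        writer.write_row(20);
        writer.close().unwrap();
    }
    out
}
```

Storing the callback as `Option<Box<dyn FnOnce ...>>` lets the caller skip it entirely by passing `None`, and `FnOnce` matches the fact that `close` consumes the writer.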
pub(crate) fn with_file_encryptor(
    self,
    file_encryptor: Option<Arc<FileEncryptor>>,
) -> Self
Set the file encryptor to use for encrypting row group data and metadata
fn next_column_desc(&mut self) -> Option<ColumnDescPtr>
Advances self.column_index, returning the next ColumnDescPtr, if any
fn get_on_close(&mut self) -> (&mut TrackedWrite<W>, OnCloseColumnChunk<'_>)
Returns OnCloseColumnChunk for the next writer
pub(crate) fn next_column_with_factory<'b, F, C>(
    &'b mut self,
    factory: F,
) -> Result<Option<C>>
where
    F: FnOnce(ColumnDescPtr, WriterPropertiesPtr, Box<dyn PageWriter + 'b>, OnCloseColumnChunk<'b>) -> Result<C>,
Returns the next column writer, if available, using the factory function;
otherwise returns None.
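A reduced, std-only sketch of the factory pattern follows. `ToyColumnDesc` and `ToyRowGroup` are hypothetical, and the factory here receives only a column description rather than the four arguments in the signature above. The point is the return shape: `Ok(None)` when no column is left, `Ok(Some(writer))` when the factory succeeds, and the factory's error otherwise.

```rust
/// Hypothetical column description handed to the factory.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyColumnDesc {
    pub name: String,
}

/// Toy row group that tracks which column comes next.
pub struct ToyRowGroup {
    columns: Vec<ToyColumnDesc>,
    index: usize,
}

impl ToyRowGroup {
    pub fn new(names: &[&str]) -> Self {
        let columns = names
            .iter()
            .map(|n| ToyColumnDesc { name: n.to_string() })
            .collect();
        Self { columns, index: 0 }
    }

    /// Builds a writer of caller-chosen type `C` for the next column.
    pub fn next_column_with_factory<F, C>(&mut self, factory: F) -> Result<Option<C>, String>
    where
        F: FnOnce(ToyColumnDesc) -> Result<C, String>,
    {
        match self.columns.get(self.index).cloned() {
            // No more columns: not an error, just nothing to build.
            None => Ok(None),
            Some(desc) => {
                self.index += 1;
                // Propagate a factory failure as Err.
                Ok(Some(factory(desc)?))
            }
        }
    }
}
```

Making the method generic over `C` lets different callers build different writer types from the same column-advancing logic.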
pub fn next_column(&mut self) -> Result<Option<SerializedColumnWriter<'_>>>
Returns the next column writer, if available; otherwise returns None.
Returns Err on any IO or Thrift error, or if the row group writer has already been
closed.
pub fn append_column<R: ChunkReader>(
    &mut self,
    reader: &R,
    close: ColumnCloseResult,
) -> Result<()>
Append an encoded column chunk from reader directly to the underlying
writer.
This method can be used for efficiently concatenating or projecting Parquet data, or encoding Parquet data to temporary in-memory buffers.
Arguments:
- reader: a ChunkReader containing the encoded column data
- close: the ColumnCloseResult metadata returned from closing the column writer that wrote the data in reader.
See Also:
- get_column_writer for creating writers that can encode data.
- Self::next_column for writing data that isn’t already encoded
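The idea of appending already-encoded bytes without re-encoding them can be sketched with the standard library alone. `ToyChunkMeta` and `append_encoded_chunk` are hypothetical names, and the offset remapping shown is an assumption of this sketch rather than a description of what `append_column` does internally.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical metadata for an encoded chunk: where it starts and how long it is.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ToyChunkMeta {
    pub offset: u64,
    pub len: u64,
}

/// Copies the chunk's bytes from `src` to the end of `dst` and returns
/// metadata whose offset points at the chunk's new position in `dst`.
pub fn append_encoded_chunk<R: Read + Seek>(
    src: &mut R,
    dst: &mut Vec<u8>,
    meta: ToyChunkMeta,
) -> io::Result<ToyChunkMeta> {
    // The chunk will start wherever the destination currently ends.
    let new_offset = dst.len() as u64;
    src.seek(SeekFrom::Start(meta.offset))?;
    // Copy at most `meta.len` bytes, verbatim, with no decoding step.
    let copied = io::copy(&mut src.by_ref().take(meta.len), dst)?;
    if copied != meta.len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "source shorter than chunk metadata",
        ));
    }
    Ok(ToyChunkMeta { offset: new_offset, len: meta.len })
}
```

Because the bytes are copied as-is, the cost is proportional to the chunk size rather than to the number of values it contains, which is what makes concatenation and projection cheap in this style.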
pub fn close(self) -> Result<RowGroupMetaDataPtr>
Closes this row group writer and returns row group metadata.
fn set_column_crypto_metadata(
    &self,
    builder: ColumnChunkMetaDataBuilder,
    metadata: &ColumnChunkMetaData,
) -> ColumnChunkMetaDataBuilder
Set the column crypto metadata for a column chunk
fn get_page_encryptor_context(&self) -> PageEncryptorContext
Get context required to create a PageEncryptor for a column
fn set_page_writer_encryptor<'b>(
    column: &ColumnDescPtr,
    context: PageEncryptorContext,
    page_writer: SerializedPageWriter<'b, W>,
) -> Result<SerializedPageWriter<'b, W>>
Set the PageEncryptor on a page writer if a column is encrypted
fn assert_previous_writer_closed(&self) -> Result<()>
Auto Trait Implementations§
impl<'a, W> Freeze for SerializedRowGroupWriter<'a, W>
impl<'a, W> !RefUnwindSafe for SerializedRowGroupWriter<'a, W>
impl<'a, W> Send for SerializedRowGroupWriter<'a, W>
where
    W: Send,
impl<'a, W> !Sync for SerializedRowGroupWriter<'a, W>
impl<'a, W> Unpin for SerializedRowGroupWriter<'a, W>
impl<'a, W> !UnwindSafe for SerializedRowGroupWriter<'a, W>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.