SerializedRowGroupWriter

Struct SerializedRowGroupWriter 

pub struct SerializedRowGroupWriter<'a, W: Write> {
    descr: SchemaDescPtr,
    props: WriterPropertiesPtr,
    buf: &'a mut TrackedWrite<W>,
    total_rows_written: Option<u64>,
    total_bytes_written: u64,
    total_uncompressed_bytes: i64,
    column_index: usize,
    row_group_metadata: Option<RowGroupMetaDataPtr>,
    column_chunks: Vec<ColumnChunkMetaData>,
    bloom_filters: Vec<Option<Sbbf>>,
    column_indexes: Vec<Option<ColumnIndexMetaData>>,
    offset_indexes: Vec<Option<OffsetIndexMetaData>>,
    row_group_index: i16,
    file_offset: i64,
    on_close: Option<OnCloseRowGroup<'a, W>>,
    file_encryptor: Option<Arc<FileEncryptor>>,
}

Parquet row group writer API.

Provides methods to access column writers in an iterator-like fashion; the order is guaranteed to match the order of schema leaves (column descriptors).

All columns should be written sequentially; the main workflow is:

  • Request the next column using the next_column method; it returns None when no more columns are available to write.
  • Once done writing a column, close the column writer with close.
  • Once all columns have been written, close the row group writer with the close method. It returns the row group metadata and is a no-op on an already closed row group.
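
The workflow above can be modeled with the standard library alone. The sketch below is not the parquet crate's implementation: ToyRowGroup, ToyColumn, and write_row_group are hypothetical names invented for illustration, and the row counting is a stand-in for real encoding. It only shows the protocol: columns are handed out in schema order, each must be closed before the next is requested, and the row group is closed last.

```rust
/// Hypothetical stand-in for a row group writer (not the parquet crate's type).
struct ToyRowGroup {
    columns: Vec<String>, // schema leaves, in schema order
    next: usize,          // index of the next column to hand out
    open: bool,           // true while a column writer is outstanding
    rows: Vec<u64>,       // row count recorded for each closed column
}

/// Hypothetical column writer that reports back to its row group on close.
struct ToyColumn<'a> {
    name: String,
    rows: u64,
    parent: &'a mut ToyRowGroup,
}

impl ToyRowGroup {
    fn new(columns: &[&str]) -> Self {
        ToyRowGroup {
            columns: columns.iter().map(|c| c.to_string()).collect(),
            next: 0,
            open: false,
            rows: Vec::new(),
        }
    }

    /// Ok(Some(..)) for the next column, Ok(None) when exhausted,
    /// Err if the previously returned writer was never closed.
    fn next_column(&mut self) -> Result<Option<ToyColumn<'_>>, String> {
        if self.open {
            return Err("previous column writer was not closed".to_string());
        }
        if self.next >= self.columns.len() {
            return Ok(None);
        }
        let name = self.columns[self.next].clone();
        self.next += 1;
        self.open = true;
        Ok(Some(ToyColumn { name, rows: 0, parent: self }))
    }

    /// Consumes the row group and returns the per-column row counts.
    fn close(self) -> Result<Vec<u64>, String> {
        if self.open {
            return Err("previous column writer was not closed".to_string());
        }
        Ok(self.rows)
    }
}

impl ToyColumn<'_> {
    fn name(&self) -> &str {
        &self.name
    }

    /// Pretends to encode values; only the count is tracked.
    fn write(&mut self, values: &[i32]) {
        self.rows += values.len() as u64;
    }

    /// Hands the row count back to the row group and marks the column closed.
    fn close(self) {
        let ToyColumn { rows, parent, .. } = self;
        parent.rows.push(rows);
        parent.open = false;
    }
}

/// Drives the three-step workflow: next column, close column, close row group.
fn write_row_group() -> Result<(Vec<String>, Vec<u64>), String> {
    let mut rg = ToyRowGroup::new(&["id", "score"]);
    let mut order = Vec::new();
    while let Some(mut col) = rg.next_column()? {
        order.push(col.name().to_string());
        col.write(&[1, 2, 3]);
        col.close();
    }
    let rows = rg.close()?;
    Ok((order, rows))
}
```

In this model, dropping a ToyColumn without calling close leaves the row group in the "open" state, so the next call to next_column fails; that mirrors the role the assert_previous_writer_closed helper name suggests, without claiming to reproduce it.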

Fields§

descr: SchemaDescPtr
props: WriterPropertiesPtr
buf: &'a mut TrackedWrite<W>
total_rows_written: Option<u64>
total_bytes_written: u64
total_uncompressed_bytes: i64
column_index: usize
row_group_metadata: Option<RowGroupMetaDataPtr>
column_chunks: Vec<ColumnChunkMetaData>
bloom_filters: Vec<Option<Sbbf>>
column_indexes: Vec<Option<ColumnIndexMetaData>>
offset_indexes: Vec<Option<OffsetIndexMetaData>>
row_group_index: i16
file_offset: i64
on_close: Option<OnCloseRowGroup<'a, W>>
file_encryptor: Option<Arc<FileEncryptor>>

Implementations§

impl<'a, W: Write + Send> SerializedRowGroupWriter<'a, W>

pub fn new( schema_descr: SchemaDescPtr, properties: WriterPropertiesPtr, buf: &'a mut TrackedWrite<W>, row_group_index: i16, on_close: Option<OnCloseRowGroup<'a, W>>, ) -> Self

Creates a new SerializedRowGroupWriter with:

  • schema_descr - the schema to write
  • properties - writer properties
  • buf - the buffer to write data to
  • row_group_index - row group index in this parquet file.
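  • file_offset - file offset of this row group in this parquet file. Note that the signature above takes no such parameter; file_offset appears only as a field of the struct.
  • on_close - an optional callback that will be invoked on Self::close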
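
As a rough illustration of the on_close parameter, the sketch below passes an optional boxed closure to a constructor and invokes it once when the writer is closed. All names here (OnClose, ToyRowGroupWriter, record_rows, close_with_callback) are hypothetical, and the callback's argument list is an assumption made for the example; the actual OnCloseRowGroup type in the crate is not shown on this page and likely carries different arguments.

```rust
/// Hypothetical callback type: receives the row group index and row count.
type OnClose<'a> = Box<dyn FnOnce(i16, u64) -> Result<(), String> + 'a>;

/// Hypothetical writer that stores an optional on-close callback.
struct ToyRowGroupWriter<'a> {
    row_group_index: i16,
    rows_written: u64,
    on_close: Option<OnClose<'a>>,
}

impl<'a> ToyRowGroupWriter<'a> {
    fn new(row_group_index: i16, on_close: Option<OnClose<'a>>) -> Self {
        ToyRowGroupWriter { row_group_index, rows_written: 0, on_close }
    }

    fn record_rows(&mut self, n: u64) {
        self.rows_written += n;
    }

    /// Runs the callback (if any) exactly once, then returns the row count.
    fn close(mut self) -> Result<u64, String> {
        if let Some(callback) = self.on_close.take() {
            callback(self.row_group_index, self.rows_written)?;
        }
        Ok(self.rows_written)
    }
}

/// Collects (index, rows) pairs reported through the callback.
fn close_with_callback() -> Result<Vec<(i16, u64)>, String> {
    let mut closed = Vec::new();
    {
        let on_close: OnClose<'_> = Box::new(|index: i16, rows: u64| -> Result<(), String> {
            closed.push((index, rows));
            Ok(())
        });
        let mut rg = ToyRowGroupWriter::new(0, Some(on_close));
        rg.record_rows(5);
        rg.close()?;
    }
    Ok(closed)
}
```

Passing None skips the callback entirely, which is why the parameter is an Option.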

pub(crate) fn with_file_encryptor( self, file_encryptor: Option<Arc<FileEncryptor>>, ) -> Self

Set the file encryptor to use for encrypting row group data and metadata

fn next_column_desc(&mut self) -> Option<ColumnDescPtr>

Advance self.column_index returning the next ColumnDescPtr if any

fn get_on_close(&mut self) -> (&mut TrackedWrite<W>, OnCloseColumnChunk<'_>)

Returns OnCloseColumnChunk for the next writer

pub(crate) fn next_column_with_factory<'b, F, C>( &'b mut self, factory: F, ) -> Result<Option<C>>

Returns the next column writer, if available, using the factory function; otherwise returns None.

pub fn next_column(&mut self) -> Result<Option<SerializedColumnWriter<'_>>>

Returns the next column writer, if available; otherwise returns None. Returns Err on an IO or Thrift error, or if the row group writer has already been closed.
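
A return type of the shape Result<Option<T>> has three outcomes a caller may want to tell apart. The following minimal sketch uses a hypothetical describe function and plain strings in place of real column writers to show the match; it makes no claim about which errors the crate actually produces.

```rust
/// Hypothetical helper: maps each of the three outcomes to a description.
fn describe(step: Result<Option<&str>, String>) -> String {
    match step {
        // A column is available and should be written next.
        Ok(Some(name)) => format!("write column {name}"),
        // All columns have been handed out; the row group can be closed.
        Ok(None) => "no columns left".to_string(),
        // Something failed; stop and propagate.
        Err(e) => format!("stop: {e}"),
    }
}
```

In loops, the same distinction is usually written more compactly as `while let Some(col) = next()? { ... }`, where `?` handles the Err arm and the loop ends on None.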

pub fn append_column<R: ChunkReader>( &mut self, reader: &R, close: ColumnCloseResult, ) -> Result<()>

Append an encoded column chunk from reader directly to the underlying writer.

This method can be used for efficiently concatenating or projecting Parquet data, or encoding Parquet data to temporary in-memory buffers.

Arguments:

  • reader: a ChunkReader containing the encoded column data
  • close: the ColumnCloseResult metadata returned from closing the column writer that wrote the data in reader.

See Also:

  1. get_column_writer for creating writers that can encode data.
  2. Self::next_column for writing data that isn’t already encoded
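
The idea of appending already-encoded bytes without re-encoding them can be sketched with std::io alone. In the example below, CountingWriter, append_chunk, and concat_chunks are hypothetical names; CountingWriter is only loosely in the spirit of a byte-tracking writer and is not the crate's TrackedWrite. Each chunk is copied verbatim and its new start offset and length are recorded, which is the kind of bookkeeping a concatenation step needs.

```rust
use std::io::{self, Cursor, Read, Write};

/// Hypothetical writer wrapper that counts the bytes written so far.
struct CountingWriter<W: Write> {
    inner: W,
    written: u64,
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.written += n as u64;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Copies an encoded chunk verbatim and returns (start_offset, length).
fn append_chunk<R: Read, W: Write>(
    reader: &mut R,
    out: &mut CountingWriter<W>,
) -> io::Result<(u64, u64)> {
    let start = out.written;
    let len = io::copy(reader, out)?;
    Ok((start, len))
}

/// Concatenates two in-memory chunks and reports where each one landed.
fn concat_chunks() -> io::Result<(Vec<u8>, Vec<(u64, u64)>)> {
    let mut out = CountingWriter { inner: Vec::new(), written: 0 };
    let mut spans = Vec::new();
    for chunk in [&b"abc"[..], &b"defgh"[..]] {
        spans.push(append_chunk(&mut Cursor::new(chunk), &mut out)?);
    }
    Ok((out.inner, spans))
}
```

The recorded spans stand in for the column chunk metadata that accompanies real encoded data; how append_column adjusts that metadata is not described on this page.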

pub fn close(self) -> Result<RowGroupMetaDataPtr>

Closes this row group writer and returns row group metadata.

fn set_column_crypto_metadata( &self, builder: ColumnChunkMetaDataBuilder, metadata: &ColumnChunkMetaData, ) -> ColumnChunkMetaDataBuilder

Set the column crypto metadata for a column chunk

fn get_page_encryptor_context(&self) -> PageEncryptorContext

Get context required to create a PageEncryptor for a column

fn set_page_writer_encryptor<'b>( column: &ColumnDescPtr, context: PageEncryptorContext, page_writer: SerializedPageWriter<'b, W>, ) -> Result<SerializedPageWriter<'b, W>>

Set the PageEncryptor on a page writer if a column is encrypted

fn assert_previous_writer_closed(&self) -> Result<()>

Auto Trait Implementations§

impl<'a, W> Freeze for SerializedRowGroupWriter<'a, W>

impl<'a, W> !RefUnwindSafe for SerializedRowGroupWriter<'a, W>

impl<'a, W> Send for SerializedRowGroupWriter<'a, W>
where W: Send,

impl<'a, W> !Sync for SerializedRowGroupWriter<'a, W>

impl<'a, W> Unpin for SerializedRowGroupWriter<'a, W>

impl<'a, W> !UnwindSafe for SerializedRowGroupWriter<'a, W>
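
A conditional Send bound such as "Send where W: Send" can be checked at compile time with a small helper. The sketch below uses a hypothetical ToyHolder whose two fields only mirror the shapes of buf and on_close; it is an assumption-laden stand-in and says nothing definitive about why the real type has the auto traits listed above.

```rust
use std::io::Write;

/// Hypothetical holder: a mutable borrow of the sink plus a boxed Send callback.
struct ToyHolder<'a, W: Write> {
    buf: &'a mut W,
    on_close: Option<Box<dyn FnOnce() + Send + 'a>>,
}

/// Compiles only when T is Send; does nothing at run time.
fn assert_send<T: Send>(_: &T) {}

/// Builds a holder over a Vec<u8> sink and checks that it is Send.
fn holder_is_send() -> bool {
    let mut sink: Vec<u8> = Vec::new();
    let holder = ToyHolder { buf: &mut sink, on_close: None };
    // This line type-checks because Vec<u8> is Send, so &mut Vec<u8> is too.
    assert_send(&holder);
    let ToyHolder { buf, on_close } = holder;
    buf.write_all(b"x").is_ok() && on_close.is_none()
}
```

Swapping the sink for a type that is not Send would make the assert_send call fail to compile, which is the property the where clause expresses.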

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> Ungil for T
where T: Send,