parquet::file::writer

Struct SerializedRowGroupWriter

pub struct SerializedRowGroupWriter<'a, W: Write> {
    descr: SchemaDescPtr,
    props: WriterPropertiesPtr,
    buf: &'a mut TrackedWrite<W>,
    total_rows_written: Option<u64>,
    total_bytes_written: u64,
    total_uncompressed_bytes: i64,
    column_index: usize,
    row_group_metadata: Option<RowGroupMetaDataPtr>,
    column_chunks: Vec<ColumnChunkMetaData>,
    bloom_filters: Vec<Option<Sbbf>>,
    column_indexes: Vec<Option<ColumnIndex>>,
    offset_indexes: Vec<Option<OffsetIndex>>,
    row_group_index: i16,
    file_offset: i64,
    on_close: Option<OnCloseRowGroup<'a, W>>,
}

Parquet row group writer API. Provides methods to access column writers in an iterator-like fashion; the order is guaranteed to match the order of the schema leaves (column descriptors).

All columns should be written sequentially; the main workflow is:

  • Request the next column using the next_column method; it returns None when no more columns are available to write.
  • Once done writing a column, close the column writer with its close method.
  • Once all columns have been written, close the row group writer with the close method. The close method returns the row group metadata and is a no-op on an already closed row group.
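
The workflow above can be sketched with a small, self-contained model. The types below (`ToyRowGroupWriter`, `ToyColumnWriter`) are hypothetical stand-ins built only on the standard library; they are not the parquet crate's types and only mirror the request, write, close sequence described in the list.

```rust
/// Hypothetical stand-in for a row group writer (not the parquet type).
struct ToyRowGroupWriter {
    columns: Vec<String>,          // schema leaves, in schema order
    next: usize,                   // index of the next column to hand out
    closed: Vec<(String, usize)>,  // (column name, value count) per closed column
}

/// Hypothetical stand-in for a column writer borrowed from the row group.
struct ToyColumnWriter<'a> {
    name: String,
    values: Vec<i32>,
    closed: &'a mut Vec<(String, usize)>,
}

impl ToyRowGroupWriter {
    fn new(columns: &[&str]) -> Self {
        Self {
            columns: columns.iter().map(|c| c.to_string()).collect(),
            next: 0,
            closed: Vec::new(),
        }
    }

    /// Hands out writers in schema order; None once every column was requested.
    fn next_column(&mut self) -> Option<ToyColumnWriter<'_>> {
        let name = self.columns.get(self.next)?.clone();
        self.next += 1;
        Some(ToyColumnWriter { name, values: Vec::new(), closed: &mut self.closed })
    }

    /// Consumes the row group and returns its (toy) metadata.
    fn close(self) -> Vec<(String, usize)> {
        self.closed
    }
}

impl ToyColumnWriter<'_> {
    fn write_batch(&mut self, values: &[i32]) {
        self.values.extend_from_slice(values);
    }

    /// Closing a column records its summary in the parent row group.
    fn close(self) {
        let ToyColumnWriter { name, values, closed } = self;
        closed.push((name, values.len()));
    }
}

/// Drives the three steps: request a column, write and close it, close the row group.
fn write_toy_row_group() -> Vec<(String, usize)> {
    let mut row_group = ToyRowGroupWriter::new(&["id", "score"]);
    while let Some(mut column) = row_group.next_column() {
        column.write_batch(&[1, 2, 3]);
        column.close();
    }
    row_group.close()
}
```

Because each toy column writer mutably borrows the row group, the borrow checker already prevents two columns from being open at once, which is one way to read the "written sequentially" requirement.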

Implementations

impl<'a, W: Write + Send> SerializedRowGroupWriter<'a, W>

pub fn new(
    schema_descr: SchemaDescPtr,
    properties: WriterPropertiesPtr,
    buf: &'a mut TrackedWrite<W>,
    row_group_index: i16,
    on_close: Option<OnCloseRowGroup<'a, W>>,
) -> Self

Creates a new SerializedRowGroupWriter with:

  • schema_descr - the schema to write.
  • properties - the writer properties.
  • buf - the buffer to write data to.
  • row_group_index - the index of this row group in the parquet file.
  • file_offset - the file offset of this row group in the parquet file. Note that the signature above takes no file_offset argument.
  • on_close - an optional callback that will be invoked on Self::close.
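
As a rough illustration of the optional on_close callback, the sketch below models a writer that invokes a caller-supplied closure exactly once when it is closed. Everything here is hypothetical: `ToyWriter`, the `OnClose` alias, and the choice of a boxed `FnOnce(u64)` are assumptions made for the example, not the definition of `OnCloseRowGroup`.

```rust
/// Hypothetical callback type: receives the number of rows written.
type OnClose<'a> = Box<dyn FnOnce(u64) + 'a>;

/// Hypothetical writer that owns an optional close callback.
struct ToyWriter<'a> {
    rows_written: u64,
    on_close: Option<OnClose<'a>>,
}

impl<'a> ToyWriter<'a> {
    fn new(on_close: Option<OnClose<'a>>) -> Self {
        Self { rows_written: 0, on_close }
    }

    fn write_rows(&mut self, n: u64) {
        self.rows_written += n;
    }

    /// Consumes the writer, runs the callback if one was supplied,
    /// and returns the final row count.
    fn close(mut self) -> u64 {
        if let Some(callback) = self.on_close.take() {
            callback(self.rows_written);
        }
        self.rows_written
    }
}

/// Returns the value reported by close and the values seen by the callback.
fn close_with_callback() -> (u64, Vec<u64>) {
    let mut seen: Vec<u64> = Vec::new();
    let mut writer = ToyWriter::new(Some(Box::new(|rows: u64| seen.push(rows))));
    writer.write_rows(5);
    writer.write_rows(2);
    let returned = writer.close();
    (returned, seen)
}
```

Taking `self` by value in `close` means the callback cannot fire twice, since the writer no longer exists afterwards.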

fn next_column_desc(&mut self) -> Option<ColumnDescPtr>

Advances self.column_index, returning the next ColumnDescPtr, if any.
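
A minimal sketch of this kind of cursor, assuming the descriptors live in a vector of shared pointers. `ToyCursor` and `ToyColumnDescPtr` are hypothetical names, and whether the real method advances the index only on success is an assumption of this model.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a shared column descriptor pointer.
type ToyColumnDescPtr = Arc<String>;

/// Hypothetical cursor over the schema leaves.
struct ToyCursor {
    leaves: Vec<ToyColumnDescPtr>,
    column_index: usize,
}

impl ToyCursor {
    /// Returns the descriptor at the current index and advances the index.
    /// The index is left untouched once the leaves are exhausted.
    fn next_column_desc(&mut self) -> Option<ToyColumnDescPtr> {
        let desc = self.leaves.get(self.column_index)?.clone();
        self.column_index += 1;
        Some(desc)
    }
}

/// Collects every descriptor name in the order the cursor yields them.
fn drain_names(names: &[&str]) -> Vec<String> {
    let mut cursor = ToyCursor {
        leaves: names.iter().map(|n| Arc::new(n.to_string())).collect(),
        column_index: 0,
    };
    let mut seen = Vec::new();
    while let Some(desc) = cursor.next_column_desc() {
        seen.push((*desc).clone());
    }
    // Once exhausted, further calls keep returning None.
    assert!(cursor.next_column_desc().is_none());
    seen
}
```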

fn get_on_close(&mut self) -> (&mut TrackedWrite<W>, OnCloseColumnChunk<'_>)

Returns the OnCloseColumnChunk for the next writer.

pub(crate) fn next_column_with_factory<'b, F, C>(
    &'b mut self,
    factory: F,
) -> Result<Option<C>>

Returns the next column writer, if available, using the factory function; otherwise returns None.

pub fn next_column(&mut self) -> Result<Option<SerializedColumnWriter<'_>>>

Returns the next column writer, if available; otherwise returns None. Returns Err on any IO or Thrift error, or if the row group writer has already been closed.
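
The three possible outcomes (`Ok(Some(_))`, `Ok(None)`, `Err(_)`) can be illustrated with a toy model. The error condition used here, requesting a column while the previous one is still open, is an assumption suggested by the private assert_previous_writer_closed helper listed on this page; `ToyRowGroup` and `ToyError` are hypothetical.

```rust
/// Hypothetical error type for the toy model.
#[derive(Debug, PartialEq)]
enum ToyError {
    PreviousWriterOpen,
}

/// Hypothetical row group that tracks handed-out and closed columns.
struct ToyRowGroup {
    num_columns: usize,
    column_index: usize,   // columns handed out so far
    closed_columns: usize, // columns whose writers were closed
}

impl ToyRowGroup {
    /// Err if the previous column is still open, Ok(None) when exhausted,
    /// otherwise Ok(Some(index of the column handed out)).
    fn next_column(&mut self) -> Result<Option<usize>, ToyError> {
        if self.column_index != self.closed_columns {
            return Err(ToyError::PreviousWriterOpen);
        }
        if self.column_index == self.num_columns {
            return Ok(None);
        }
        self.column_index += 1;
        Ok(Some(self.column_index - 1))
    }

    /// Marks the most recently handed-out column as closed.
    fn close_column(&mut self) {
        self.closed_columns = self.column_index;
    }
}

/// Exercises each outcome once on a single-column row group.
fn outcomes() -> Vec<Result<Option<usize>, ToyError>> {
    let mut rg = ToyRowGroup { num_columns: 1, column_index: 0, closed_columns: 0 };
    let first = rg.next_column();     // a writer for column 0
    let too_early = rg.next_column(); // column 0 is still open
    rg.close_column();
    let exhausted = rg.next_column(); // no columns left
    vec![first, too_early, exhausted]
}
```

Callers typically propagate the `Err` case with `?` and loop on the `Some` case until `None` is returned.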

pub fn append_column<R: ChunkReader>(
    &mut self,
    reader: &R,
    close: ColumnCloseResult,
) -> Result<()>

Appends an encoded column chunk from another source without decoding it.

This can be used for efficiently concatenating or projecting parquet data, or for encoding parquet data to temporary in-memory buffers.

See Self::next_column for writing data that isn’t already encoded.
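
The idea of appending without decoding can be sketched as a plain byte copy plus an offset adjustment. This is a simplified model under stated assumptions: `ToyChunk` and `append_encoded` are hypothetical, and rebasing the chunk offset to its position in the destination is an assumption about the kind of bookkeeping such a copy needs, not a description of the crate's implementation.

```rust
/// Hypothetical description of an already-encoded chunk inside a source buffer.
struct ToyChunk {
    offset: usize,
    len: usize,
}

/// Copies the chunk's bytes verbatim into `dest` and returns a chunk
/// description whose offset points at the copy inside `dest`.
/// Returns None if the chunk does not lie within `src`.
fn append_encoded(dest: &mut Vec<u8>, src: &[u8], chunk: &ToyChunk) -> Option<ToyChunk> {
    let end = chunk.offset.checked_add(chunk.len)?;
    let bytes = src.get(chunk.offset..end)?;
    let rebased = ToyChunk { offset: dest.len(), len: chunk.len };
    dest.extend_from_slice(bytes); // no decoding: the bytes are copied as-is
    Some(rebased)
}

/// Appends the 7 encoded bytes at offset 2 of a source onto a 4-byte destination.
fn concat_demo() -> (Vec<u8>, usize) {
    let src = b"xxENCODEDyy";
    let mut dest = b"PAR1".to_vec();
    let moved = append_encoded(&mut dest, src, &ToyChunk { offset: 2, len: 7 })
        .expect("chunk lies within the source");
    (dest, moved.offset)
}
```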

pub fn close(self) -> Result<RowGroupMetaDataPtr>

Closes this row group writer and returns row group metadata.

fn assert_previous_writer_closed(&self) -> Result<()>

Auto Trait Implementations

impl<'a, W> Freeze for SerializedRowGroupWriter<'a, W>

impl<'a, W> !RefUnwindSafe for SerializedRowGroupWriter<'a, W>

impl<'a, W> Send for SerializedRowGroupWriter<'a, W>
where W: Send,

impl<'a, W> !Sync for SerializedRowGroupWriter<'a, W>

impl<'a, W> Unpin for SerializedRowGroupWriter<'a, W>

impl<'a, W> !UnwindSafe for SerializedRowGroupWriter<'a, W>

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> MaybeSendSync for T