Struct SerializedFileWriter

pub struct SerializedFileWriter<W: Write> {
    buf: TrackedWrite<W>,
    schema: TypePtr,
    descr: SchemaDescPtr,
    props: WriterPropertiesPtr,
    row_groups: Vec<RowGroupMetaData>,
    bloom_filters: Vec<Vec<Option<Sbbf>>>,
    column_indexes: Vec<Vec<Option<ColumnIndex>>>,
    offset_indexes: Vec<Vec<Option<OffsetIndex>>>,
    row_group_index: usize,
    kv_metadatas: Vec<KeyValue>,
    finished: bool,
    file_encryptor: Option<Arc<FileEncryptor>>,
}

Parquet file writer API. Provides methods to write row groups sequentially.

The typical workflow is as follows:

  • Create a file writer by calling new. This takes an existing writer (it does not open a file itself) and writes the leading magic bytes to it.
  • Request a new row group writer by calling next_row_group.
  • Once the row group has been written, close the row group writer by calling its close method.
  • Write subsequent row groups, if necessary.
  • After all row groups have been written, close the file writer by calling close.

Fields§

buf: TrackedWrite<W>
schema: TypePtr
descr: SchemaDescPtr
props: WriterPropertiesPtr
row_groups: Vec<RowGroupMetaData>
bloom_filters: Vec<Vec<Option<Sbbf>>>
column_indexes: Vec<Vec<Option<ColumnIndex>>>
offset_indexes: Vec<Vec<Option<OffsetIndex>>>
row_group_index: usize
kv_metadatas: Vec<KeyValue>
finished: bool
file_encryptor: Option<Arc<FileEncryptor>>

Implementations§

impl<W: Write + Send> SerializedFileWriter<W>

pub fn new( buf: W, schema: TypePtr, properties: WriterPropertiesPtr, ) -> Result<Self>

Creates a new file writer.

fn get_file_encryptor( properties: &WriterPropertiesPtr, schema_descriptor: &SchemaDescriptor, ) -> Result<Option<Arc<FileEncryptor>>>

pub fn next_row_group(&mut self) -> Result<SerializedRowGroupWriter<'_, W>>

Creates a new row group writer from this file writer. Returns Err on an I/O error or a Thrift error.

There can be at most 2^15 row groups in a file, and row groups must be written sequentially. Before the next row group is requested, the previous row group must be finalised and closed using the SerializedRowGroupWriter::close method.

pub fn flushed_row_groups(&self) -> &[RowGroupMetaData]

Returns metadata for any flushed row groups.

pub fn finish(&mut self) -> Result<FileMetaData>

Closes and finalizes the underlying Parquet writer.

Unlike Self::close, this does not consume self.

Attempting to write after calling finish results in an error.

pub fn close(self) -> Result<FileMetaData>

Closes and finalises file writer, returning the file metadata.

fn start_file( properties: &WriterPropertiesPtr, buf: &mut TrackedWrite<W>, ) -> Result<()>

Writes magic bytes at the beginning of the file.

fn write_metadata(&mut self) -> Result<FileMetaData>

Assembles and writes metadata at the end of the file.

fn assert_previous_writer_closed(&self) -> Result<()>

pub fn append_key_value_metadata(&mut self, kv_metadata: KeyValue)

Adds a KeyValue to the file writer’s metadata.

pub fn schema_descr(&self) -> &SchemaDescriptor

Returns a reference to schema descriptor.

pub fn properties(&self) -> &WriterPropertiesPtr

Returns a reference to the writer properties.

pub fn inner(&self) -> &W

Returns a reference to the underlying writer.

pub fn write_all(&mut self, buf: &[u8]) -> Result<()>

Writes the given buf bytes to the internal buffer.

This can be used to write raw data to an in-progress parquet file, for example, custom index structures or other payloads. Other parquet readers will skip this data when reading the files.

It’s safe to use this method to write data to the underlying writer, because it will ensure that the buffering and byte‐counting layers are used.

pub fn inner_mut(&mut self) -> &mut W

Returns a mutable reference to the underlying writer.

Warning: writing directly to this writer bypasses the TrackedWrite buffering and byte-counting layers. The offsets and sizes recorded in the file footer will then diverge from the actual file contents, resulting in an unreadable or corrupted Parquet file.

If you want to write safely to the underlying writer, use Self::write_all.
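
The hazard can be illustrated with a standard-library-only sketch. `CountingWriter` below is a hypothetical stand-in for a byte-counting layer, not the crate's `TrackedWrite`; it only shows how a write through `inner_mut` leaves the recorded count behind the real length of the output.

```rust
use std::io::{self, Write};

/// Hypothetical byte-counting wrapper (not the crate's `TrackedWrite`).
struct CountingWriter<W: Write> {
    inner: W,
    bytes_written: usize,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, bytes_written: 0 }
    }

    /// Exposes the wrapped writer; writes made through it are not counted.
    fn inner_mut(&mut self) -> &mut W {
        &mut self.inner
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.bytes_written += n; // only writes through the wrapper are counted
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Returns (recorded count, actual length) after one tracked and one untracked write.
fn recorded_vs_actual() -> io::Result<(usize, usize)> {
    let mut writer = CountingWriter::new(Vec::new());
    writer.write_all(b"PAR1")?; // tracked: 4 bytes counted
    writer.inner_mut().write_all(b"raw")?; // untracked: 3 bytes the counter never sees
    Ok((writer.bytes_written, writer.inner.len()))
}

fn main() -> io::Result<()> {
    let (recorded, actual) = recorded_vs_actual()?;
    println!("recorded {recorded} bytes, actual {actual} bytes");
    Ok(())
}
```

Any offset derived from the recorded count after the untracked write would point three bytes too early, which is the kind of mismatch the warning describes.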

pub fn into_inner(self) -> Result<W>

Writes the file footer and returns the underlying writer.

pub fn bytes_written(&self) -> usize

Returns the number of bytes written to this instance.

pub(crate) fn file_encryptor(&self) -> Option<Arc<FileEncryptor>>

Returns the file encryptor used by this instance to encrypt data, if any.

Trait Implementations§

impl<W: Write> Debug for SerializedFileWriter<W>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl<W> Freeze for SerializedFileWriter<W>
where W: Freeze,

impl<W> RefUnwindSafe for SerializedFileWriter<W>
where W: RefUnwindSafe,

impl<W> Send for SerializedFileWriter<W>
where W: Send,

impl<W> Sync for SerializedFileWriter<W>
where W: Sync,

impl<W> Unpin for SerializedFileWriter<W>
where W: Unpin,

impl<W> UnwindSafe for SerializedFileWriter<W>
where W: UnwindSafe,

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

impl<T> ErasedDestructor for T
where T: 'static,