pub struct SerializedFileWriter<W: Write> {
buf: TrackedWrite<W>,
schema: TypePtr,
descr: SchemaDescPtr,
props: WriterPropertiesPtr,
row_groups: Vec<RowGroupMetaData>,
bloom_filters: Vec<Vec<Option<Sbbf>>>,
column_indexes: Vec<Vec<Option<ColumnIndex>>>,
offset_indexes: Vec<Vec<Option<OffsetIndex>>>,
row_group_index: usize,
kv_metadatas: Vec<KeyValue>,
finished: bool,
}
Parquet file writer API. Provides methods to write row groups sequentially.
The main workflow is as follows:
- Create a file writer. This starts a new file and may write some metadata.
- Request a new row group writer by calling next_row_group.
- Once the row group has been written, close the row group writer by calling its close method.
- Write subsequent row groups, if necessary.
- After all row groups have been written, close the file writer by calling its close method.
Fields§
§buf: TrackedWrite<W>
§schema: TypePtr
§descr: SchemaDescPtr
§props: WriterPropertiesPtr
§row_groups: Vec<RowGroupMetaData>
§bloom_filters: Vec<Vec<Option<Sbbf>>>
§column_indexes: Vec<Vec<Option<ColumnIndex>>>
§offset_indexes: Vec<Vec<Option<OffsetIndex>>>
§row_group_index: usize
§kv_metadatas: Vec<KeyValue>
§finished: bool
Implementations§
impl<W: Write + Send> SerializedFileWriter<W>
pub fn new( buf: W, schema: TypePtr, properties: WriterPropertiesPtr, ) -> Result<Self>
Creates a new file writer.
pub fn next_row_group(&mut self) -> Result<SerializedRowGroupWriter<'_, W>>
Creates a new row group from this file writer.
Returns Err in case of an IO error or Thrift error.
There can be at most 2^15 row groups in a file, and row groups have to be written sequentially. Every time the next row group is requested, the previous row group must be finalised and closed using the RowGroupWriter::close method.
pub fn flushed_row_groups(&self) -> &[RowGroupMetaData]
Returns metadata for any flushed row groups.
pub fn finish(&mut self) -> Result<FileMetaData>
Closes and finalizes the underlying Parquet writer.
Unlike Self::close, this does not consume self.
Attempting to write after calling finish will result in an error.
pub fn close(self) -> Result<FileMetaData>
Closes and finalises file writer, returning the file metadata.
fn start_file(buf: &mut TrackedWrite<W>) -> Result<()>
Writes magic bytes at the beginning of the file.
fn write_metadata(&mut self) -> Result<FileMetaData>
Assembles and writes metadata at the end of the file.
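For orientation, the container layout that start_file and write_metadata produce together can be modelled with the standard library alone. This is a simplified sketch, not the crate's implementation: `write_framed` is a hypothetical helper, and the row group and footer bytes are opaque placeholders, whereas the real writer serializes Thrift-encoded metadata. It assumes the unencrypted layout of leading magic, data, footer, 4-byte little-endian footer length, and trailing magic.

```rust
use std::io::{self, Write};

/// Magic bytes that open and close an unencrypted Parquet file.
const MAGIC: &[u8; 4] = b"PAR1";

/// Frames opaque row group and footer bytes in the Parquet container
/// layout and returns the number of bytes written. Hypothetical helper.
fn write_framed<W: Write>(
    mut sink: W,
    row_group_bytes: &[u8],
    footer_bytes: &[u8],
) -> io::Result<usize> {
    // The footer length is stored in a 4-byte field.
    if footer_bytes.len() > u32::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "footer too large",
        ));
    }
    let footer_len = footer_bytes.len() as u32;

    // Start of file: magic bytes (the role of `start_file`).
    sink.write_all(MAGIC)?;
    // Body: row groups, written sequentially.
    sink.write_all(row_group_bytes)?;
    // End of file: metadata, its length, then the magic bytes again
    // (the role of `write_metadata`).
    sink.write_all(footer_bytes)?;
    sink.write_all(&footer_len.to_le_bytes())?;
    sink.write_all(MAGIC)?;

    Ok(2 * MAGIC.len() + row_group_bytes.len() + footer_bytes.len() + 4)
}

fn main() -> io::Result<()> {
    let mut out = Vec::new();
    let written = write_framed(&mut out, b"data", b"meta")?;
    println!("{} bytes framed", written);
    Ok(())
}
```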
fn assert_previous_writer_closed(&self) -> Result<()>
pub fn append_key_value_metadata(&mut self, kv_metadata: KeyValue)
Adds a KeyValue to the file writer’s metadata.
pub fn schema_descr(&self) -> &SchemaDescriptor
Returns a reference to schema descriptor.
pub fn properties(&self) -> &WriterPropertiesPtr
Returns a reference to the writer properties.
pub fn inner_mut(&mut self) -> &mut W
Returns a mutable reference to the underlying writer.
It is inadvisable to directly write to the underlying writer.
pub fn into_inner(self) -> Result<W>
Writes the file footer and returns the underlying writer.
pub fn bytes_written(&self) -> usize
Returns the number of bytes written to this instance.
Trait Implementations§
Auto Trait Implementations§
impl<W> Freeze for SerializedFileWriter<W> where
W: Freeze,
impl<W> RefUnwindSafe for SerializedFileWriter<W> where
W: RefUnwindSafe,
impl<W> Send for SerializedFileWriter<W> where
W: Send,
impl<W> Sync for SerializedFileWriter<W> where
W: Sync,
impl<W> Unpin for SerializedFileWriter<W> where
W: Unpin,
impl<W> UnwindSafe for SerializedFileWriter<W> where
W: UnwindSafe,
Blanket Implementations§
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.