Content-defined chunking (CDC) for Parquet data pages.
CDC creates data page boundaries based on content rather than fixed sizes,
enabling efficient deduplication in content-addressable storage (CAS) systems.
See CdcOptions for configuration.
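To illustrate the general idea of content-defined boundaries, here is a minimal sketch of a gear-style rolling-hash chunker over a plain byte slice. It is not this module's implementation: the names `chunk_boundaries` and `gear`, and the parameters `min_size`, `max_size`, and `mask`, are hypothetical and chosen only for illustration. It also ignores Parquet levels and values entirely.

```rust
/// Hypothetical stand-in for a gear table: mixes one byte into a
/// pseudo-random 64-bit value using splitmix64-style steps.
fn gear(byte: u8) -> u64 {
    let mut x = (byte as u64).wrapping_add(0x9E3779B97F4A7C15);
    x = (x ^ (x >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    x = (x ^ (x >> 27)).wrapping_mul(0x94D049BB133111EB);
    x ^ (x >> 31)
}

/// Hypothetical helper: returns the exclusive end offset of every chunk.
/// A cut is made when the rolling hash matches `mask` (once the chunk has
/// reached `min_size`), or unconditionally when it reaches `max_size`.
fn chunk_boundaries(data: &[u8], min_size: usize, max_size: usize, mask: u64) -> Vec<usize> {
    let mut boundaries = Vec::new();
    let mut hash: u64 = 0;
    let mut start = 0usize;

    for (i, &byte) in data.iter().enumerate() {
        // Shifting left ages out older bytes, so the hash depends only on
        // the most recent 64 bytes of content.
        hash = (hash << 1).wrapping_add(gear(byte));
        let len = i + 1 - start;

        let content_cut = len >= min_size && (hash & mask) == 0;
        if content_cut || len >= max_size {
            boundaries.push(i + 1);
            start = i + 1;
            hash = 0;
        }
    }

    // Emit the trailing partial chunk, if any.
    if start < data.len() {
        boundaries.push(data.len());
    }
    boundaries
}
```

Because a cut depends on nearby bytes rather than on absolute position, an insertion early in the input tends to shift only the boundaries near the edit, and later chunks can line up with previously stored ones. That is the property a CAS system relies on for deduplication.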
Modules
- cdc 🔒
- cdc_generated 🔒
Structs
- CdcChunk 🔒
- A chunk of data with level and value offsets for record-shredded nested data.