Reading and writing Parquet files

The Parquet C++ library is part of the Apache Arrow project and benefits from tight integration with Arrow C++.


The Parquet FileReader requires a ::arrow::io::RandomAccessFile instance representing the input file.

#include "arrow/api.h"
#include "arrow/io/api.h"
#include "parquet/arrow/reader.h"

   // ...
   arrow::Status st;
   arrow::MemoryPool* pool = arrow::default_memory_pool();
   std::shared_ptr<arrow::io::RandomAccessFile> input = ...;

   // Open Parquet file reader
   std::unique_ptr<parquet::arrow::FileReader> arrow_reader;
   st = parquet::arrow::OpenFile(input, pool, &arrow_reader);
   if (!st.ok()) {
      // Handle error instantiating file reader...
   }

   // Read entire file as a single Arrow table
   std::shared_ptr<arrow::Table> table;
   st = arrow_reader->ReadTable(&table);
   if (!st.ok()) {
      // Handle error reading Parquet data...
   }

Finer-grained options, such as the memory pool and Arrow-specific reader properties, are available through the FileReaderBuilder helper class.


TODO: write this