Apache Arrow Python Cookbook
The Apache Arrow Cookbook is a collection of recipes that demonstrate how to solve common tasks users encounter when working with Arrow data. The examples are also intended to serve as robust, well-performing solutions to those tasks.
This cookbook is tested with pyarrow 14.0.0.
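Since the recipes are tested against a specific release, it can help to check which pyarrow version is installed before running them. The following is a minimal sketch using only the Python standard library; the helper names `installed_pyarrow_version` and `matches_tested_version` are hypothetical, and comparing only the major version is an assumption made for illustration, not a compatibility rule stated by the cookbook.

```python
from importlib.metadata import PackageNotFoundError, version

# Version the cookbook states it is tested with.
TESTED_VERSION = "14.0.0"


def installed_pyarrow_version():
    """Return the installed pyarrow version string, or None if it is absent."""
    try:
        return version("pyarrow")
    except PackageNotFoundError:
        return None


def matches_tested_version(installed, tested=TESTED_VERSION):
    """Hypothetical check: compare the major version component only."""
    if installed is None:
        return False
    return installed.split(".")[0] == tested.split(".")[0]


found = installed_pyarrow_version()
if found is None:
    print("pyarrow is not installed")
elif matches_tested_version(found):
    print(f"pyarrow {found} matches the tested major version")
else:
    print(f"pyarrow {found} differs from the tested version {TESTED_VERSION}")
```

A different version does not necessarily mean a recipe will fail; it only signals that behavior may differ from what the cookbook was tested against.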
- Reading and Writing Data
  - Write a Parquet file
  - Reading a Parquet file
  - Reading a subset of Parquet data
  - Saving Arrow Arrays to disk
  - Memory Mapping Arrow Arrays from disk
  - Writing CSV files
  - Writing CSV files incrementally
  - Reading CSV files
  - Writing Partitioned Datasets
  - Reading Partitioned data
  - Reading Partitioned Data from S3
  - Write a Feather file
  - Reading a Feather file
  - Reading Line Delimited JSON
  - Writing Compressed Data
  - Reading Compressed Data
- Creating Arrow Objects
- Working with Schema
- Data Manipulation
  - Computing Mean/Min/Max values of an array
  - Counting Occurrences of Elements
  - Applying arithmetic functions to arrays
  - Appending tables to an existing table
  - Adding a column to an existing Table
  - Replacing a column in an existing Table
  - Group a Table
  - Sort a Table
  - Searching for values matching a predicate in Arrays
  - Filtering Arrays using a mask
- Arrow Flight