Specifications and Protocols
Format Versioning and Stability
Arrow Columnar Format
Arrow Flight RPC
Integration Testing
The Arrow C data interface
The Arrow C stream interface
Other Data Structures
Libraries
Implementation Status
C/GLib
C++
User Guide
High-Level Overview
Conventions
Using Arrow C++ in your own project
Memory Management
Arrays
Data Types
Tabular Data
Compute Functions
Input / output and filesystems
Reading and writing the Arrow IPC format
Reading and writing Parquet files
Reading and Writing CSV files
Reading JSON files
Tabular Datasets
Arrow Flight RPC
Examples
Minimal build using CMake
Arrow Datasets example
Row to columnar conversion
std::tuple-like ranges to Arrow
API Reference
Programming Support
Memory (management)
Data Types
Arrays
Scalars
Array Builders
Two-dimensional Datasets
C Interfaces
Compute Functions
Tensors
Utilities
Input / output
Arrow IPC
File Formats
CUDA support
Arrow Flight RPC
Filesystems
Dataset
C#
Go
Java
ValueVector
VectorSchemaRoot
Reading/Writing IPC formats
Java Algorithms
Reference (javadoc)
JavaScript
Julia
MATLAB
Python
Installing PyArrow
Memory and IO Interfaces
Data Types and In-Memory Data Model
Compute Functions
Streaming, Serialization, and IPC
Filesystem Interface
Filesystem Interface (legacy)
pyarrow.hdfs.connect
pyarrow.HadoopFileSystem.cat
pyarrow.HadoopFileSystem.chmod
pyarrow.HadoopFileSystem.chown
pyarrow.HadoopFileSystem.delete
pyarrow.HadoopFileSystem.df
pyarrow.HadoopFileSystem.disk_usage
pyarrow.HadoopFileSystem.download
pyarrow.HadoopFileSystem.exists
pyarrow.HadoopFileSystem.get_capacity
pyarrow.HadoopFileSystem.get_space_used
pyarrow.HadoopFileSystem.info
pyarrow.HadoopFileSystem.ls
pyarrow.HadoopFileSystem.mkdir
pyarrow.HadoopFileSystem.open
pyarrow.HadoopFileSystem.rename
pyarrow.HadoopFileSystem.rm
pyarrow.HadoopFileSystem.upload
pyarrow.HdfsFile
The Plasma In-Memory Object Store
NumPy Integration
Pandas Integration
Timestamps
Reading and Writing CSV files
Feather File Format
Reading JSON files
Reading and Writing the Apache Parquet Format
Tabular Datasets
CUDA Integration
Extending pyarrow
Using pyarrow from C++ and Cython Code
API Reference
Data Types and Schemas
pyarrow.null
pyarrow.bool_
pyarrow.int8
pyarrow.int16
pyarrow.int32
pyarrow.int64
pyarrow.uint8
pyarrow.uint16
pyarrow.uint32
pyarrow.uint64
pyarrow.float16
pyarrow.float32
pyarrow.float64
pyarrow.time32
pyarrow.time64
pyarrow.timestamp
pyarrow.date32
pyarrow.date64
pyarrow.binary
pyarrow.string
pyarrow.utf8
pyarrow.large_binary
pyarrow.large_string
pyarrow.large_utf8
pyarrow.decimal128
pyarrow.list_
pyarrow.large_list
pyarrow.map_
pyarrow.struct
pyarrow.dictionary
pyarrow.field
pyarrow.schema
pyarrow.from_numpy_dtype
pyarrow.DataType
pyarrow.DictionaryType
pyarrow.ListType
pyarrow.MapType
pyarrow.StructType
pyarrow.UnionType
pyarrow.TimestampType
pyarrow.Time32Type
pyarrow.Time64Type
pyarrow.FixedSizeBinaryType
pyarrow.Decimal128Type
pyarrow.Field
pyarrow.Schema
pyarrow.ExtensionType
pyarrow.PyExtensionType
pyarrow.register_extension_type
pyarrow.unregister_extension_type
pyarrow.types.is_boolean
pyarrow.types.is_integer
pyarrow.types.is_signed_integer
pyarrow.types.is_unsigned_integer
pyarrow.types.is_int8
pyarrow.types.is_int16
pyarrow.types.is_int32
pyarrow.types.is_int64
pyarrow.types.is_uint8
pyarrow.types.is_uint16
pyarrow.types.is_uint32
pyarrow.types.is_uint64
pyarrow.types.is_floating
pyarrow.types.is_float16
pyarrow.types.is_float32
pyarrow.types.is_float64
pyarrow.types.is_decimal
pyarrow.types.is_list
pyarrow.types.is_large_list
pyarrow.types.is_struct
pyarrow.types.is_union
pyarrow.types.is_nested
pyarrow.types.is_temporal
pyarrow.types.is_timestamp
pyarrow.types.is_date
pyarrow.types.is_date32
pyarrow.types.is_date64
pyarrow.types.is_time
pyarrow.types.is_time32
pyarrow.types.is_time64
pyarrow.types.is_null
pyarrow.types.is_binary
pyarrow.types.is_unicode
pyarrow.types.is_string
pyarrow.types.is_large_binary
pyarrow.types.is_large_unicode
pyarrow.types.is_large_string
pyarrow.types.is_fixed_size_binary
pyarrow.types.is_map
pyarrow.types.is_dictionary
Arrays and Scalars
pyarrow.array
pyarrow.nulls
pyarrow.Array
pyarrow.BooleanArray
pyarrow.FloatingPointArray
pyarrow.IntegerArray
pyarrow.Int8Array
pyarrow.Int16Array
pyarrow.Int32Array
pyarrow.Int64Array
pyarrow.NullArray
pyarrow.NumericArray
pyarrow.UInt8Array
pyarrow.UInt16Array
pyarrow.UInt32Array
pyarrow.UInt64Array
pyarrow.BinaryArray
pyarrow.StringArray
pyarrow.FixedSizeBinaryArray
pyarrow.LargeBinaryArray
pyarrow.LargeStringArray
pyarrow.Time32Array
pyarrow.Time64Array
pyarrow.Date32Array
pyarrow.Date64Array
pyarrow.TimestampArray
pyarrow.Decimal128Array
pyarrow.DictionaryArray
pyarrow.ListArray
pyarrow.FixedSizeListArray
pyarrow.LargeListArray
pyarrow.StructArray
pyarrow.UnionArray
pyarrow.ExtensionArray
pyarrow.scalar
pyarrow.NA
pyarrow.Scalar
pyarrow.BooleanScalar
pyarrow.Int8Scalar
pyarrow.Int16Scalar
pyarrow.Int32Scalar
pyarrow.Int64Scalar
pyarrow.UInt8Scalar
pyarrow.UInt16Scalar
pyarrow.UInt32Scalar
pyarrow.UInt64Scalar
pyarrow.FloatScalar
pyarrow.DoubleScalar
pyarrow.BinaryScalar
pyarrow.StringScalar
pyarrow.FixedSizeBinaryScalar
pyarrow.LargeBinaryScalar
pyarrow.LargeStringScalar
pyarrow.Time32Scalar
pyarrow.Time64Scalar
pyarrow.Date32Scalar
pyarrow.Date64Scalar
pyarrow.TimestampScalar
pyarrow.Decimal128Scalar
pyarrow.DictionaryScalar
pyarrow.ListScalar
pyarrow.LargeListScalar
pyarrow.StructScalar
pyarrow.UnionScalar
Buffers and Memory
pyarrow.allocate_buffer
pyarrow.py_buffer
pyarrow.foreign_buffer
pyarrow.Buffer
pyarrow.ResizableBuffer
pyarrow.Codec
pyarrow.compress
pyarrow.decompress
pyarrow.MemoryPool
pyarrow.default_memory_pool
pyarrow.jemalloc_memory_pool
pyarrow.mimalloc_memory_pool
pyarrow.system_memory_pool
pyarrow.jemalloc_set_decay_ms
pyarrow.set_memory_pool
pyarrow.log_memory_allocations
pyarrow.total_allocated_bytes
Compute Functions
pyarrow.compute.count
pyarrow.compute.index
pyarrow.compute.mean
pyarrow.compute.min_max
pyarrow.compute.mode
pyarrow.compute.stddev
pyarrow.compute.sum
pyarrow.compute.variance
pyarrow.compute.abs
pyarrow.compute.abs_checked
pyarrow.compute.add
pyarrow.compute.add_checked
pyarrow.compute.divide
pyarrow.compute.divide_checked
pyarrow.compute.multiply
pyarrow.compute.multiply_checked
pyarrow.compute.power
pyarrow.compute.power_checked
pyarrow.compute.shift_left
pyarrow.compute.shift_left_checked
pyarrow.compute.shift_right
pyarrow.compute.shift_right_checked
pyarrow.compute.sign
pyarrow.compute.subtract
pyarrow.compute.subtract_checked
pyarrow.compute.bit_wise_and
pyarrow.compute.bit_wise_not
pyarrow.compute.bit_wise_or
pyarrow.compute.bit_wise_xor
pyarrow.compute.ceil
pyarrow.compute.floor
pyarrow.compute.trunc
pyarrow.compute.ln
pyarrow.compute.ln_checked
pyarrow.compute.log10
pyarrow.compute.log10_checked
pyarrow.compute.log1p
pyarrow.compute.log1p_checked
pyarrow.compute.log2
pyarrow.compute.log2_checked
pyarrow.compute.acos
pyarrow.compute.acos_checked
pyarrow.compute.asin
pyarrow.compute.asin_checked
pyarrow.compute.atan
pyarrow.compute.atan2
pyarrow.compute.cos
pyarrow.compute.cos_checked
pyarrow.compute.sin
pyarrow.compute.sin_checked
pyarrow.compute.tan
pyarrow.compute.tan_checked
pyarrow.compute.equal
pyarrow.compute.greater
pyarrow.compute.greater_equal
pyarrow.compute.less
pyarrow.compute.less_equal
pyarrow.compute.not_equal
pyarrow.compute.max_element_wise
pyarrow.compute.min_element_wise
pyarrow.compute.and_
pyarrow.compute.and_kleene
pyarrow.compute.all
pyarrow.compute.any
pyarrow.compute.invert
pyarrow.compute.or_
pyarrow.compute.or_kleene
pyarrow.compute.xor
pyarrow.compute.ascii_is_alnum
pyarrow.compute.ascii_is_alpha
pyarrow.compute.ascii_is_decimal
pyarrow.compute.ascii_is_lower
pyarrow.compute.ascii_is_printable
pyarrow.compute.ascii_is_space
pyarrow.compute.ascii_is_upper
pyarrow.compute.utf8_is_alnum
pyarrow.compute.utf8_is_alpha
pyarrow.compute.utf8_is_decimal
pyarrow.compute.utf8_is_digit
pyarrow.compute.utf8_is_lower
pyarrow.compute.utf8_is_numeric
pyarrow.compute.utf8_is_printable
pyarrow.compute.utf8_is_space
pyarrow.compute.utf8_is_upper
pyarrow.compute.ascii_is_title
pyarrow.compute.utf8_is_title
pyarrow.compute.string_is_ascii
pyarrow.compute.split_pattern
pyarrow.compute.split_pattern_regex
pyarrow.compute.ascii_split_whitespace
pyarrow.compute.utf8_split_whitespace
pyarrow.compute.extract_regex
pyarrow.compute.binary_join
pyarrow.compute.binary_join_element_wise
pyarrow.compute.ascii_center
pyarrow.compute.ascii_lpad
pyarrow.compute.ascii_ltrim
pyarrow.compute.ascii_ltrim_whitespace
pyarrow.compute.ascii_lower
pyarrow.compute.ascii_reverse
pyarrow.compute.ascii_rpad
pyarrow.compute.ascii_rtrim
pyarrow.compute.ascii_rtrim_whitespace
pyarrow.compute.ascii_trim
pyarrow.compute.ascii_upper
pyarrow.compute.binary_length
pyarrow.compute.binary_replace_slice
pyarrow.compute.replace_substring
pyarrow.compute.replace_substring_regex
pyarrow.compute.utf8_center
pyarrow.compute.utf8_length
pyarrow.compute.utf8_lower
pyarrow.compute.utf8_lpad
pyarrow.compute.utf8_ltrim
pyarrow.compute.utf8_ltrim_whitespace
pyarrow.compute.utf8_replace_slice
pyarrow.compute.utf8_reverse
pyarrow.compute.utf8_rpad
pyarrow.compute.utf8_rtrim
pyarrow.compute.utf8_rtrim_whitespace
pyarrow.compute.utf8_trim
pyarrow.compute.utf8_upper
pyarrow.compute.count_substring
pyarrow.compute.count_substring_regex
pyarrow.compute.ends_with
pyarrow.compute.find_substring
pyarrow.compute.find_substring_regex
pyarrow.compute.index_in
pyarrow.compute.is_in
pyarrow.compute.match_like
pyarrow.compute.match_substring
pyarrow.compute.match_substring_regex
pyarrow.compute.starts_with
pyarrow.compute.cast
pyarrow.compute.strptime
pyarrow.compute.replace_with_mask
pyarrow.compute.filter
pyarrow.compute.take
pyarrow.compute.dictionary_encode
pyarrow.compute.unique
pyarrow.compute.value_counts
pyarrow.compute.partition_nth_indices
pyarrow.compute.sort_indices
pyarrow.compute.case_when
pyarrow.compute.coalesce
pyarrow.compute.fill_null
pyarrow.compute.if_else
pyarrow.compute.is_finite
pyarrow.compute.is_inf
pyarrow.compute.is_nan
pyarrow.compute.is_null
pyarrow.compute.is_valid
pyarrow.compute.list_value_length
pyarrow.compute.list_flatten
pyarrow.compute.list_parent_indices
Streams and File Access
pyarrow.input_stream
pyarrow.output_stream
pyarrow.memory_map
pyarrow.create_memory_map
pyarrow.NativeFile
pyarrow.OSFile
pyarrow.PythonFile
pyarrow.BufferReader
pyarrow.BufferOutputStream
pyarrow.FixedSizeBufferWriter
pyarrow.MemoryMappedFile
pyarrow.CompressedInputStream
pyarrow.CompressedOutputStream
pyarrow.hdfs.connect
pyarrow.LocalFileSystem
Tables and Tensors
pyarrow.chunked_array
pyarrow.concat_arrays
pyarrow.concat_tables
pyarrow.record_batch
pyarrow.table
pyarrow.ChunkedArray
pyarrow.RecordBatch
pyarrow.Table
pyarrow.Tensor
Serialization and IPC
pyarrow.ipc.new_file
pyarrow.ipc.open_file
pyarrow.ipc.new_stream
pyarrow.ipc.open_stream
pyarrow.ipc.read_message
pyarrow.ipc.read_record_batch
pyarrow.ipc.get_record_batch_size
pyarrow.ipc.read_tensor
pyarrow.ipc.write_tensor
pyarrow.ipc.get_tensor_size
pyarrow.ipc.IpcWriteOptions
pyarrow.ipc.Message
pyarrow.ipc.MessageReader
pyarrow.ipc.RecordBatchFileReader
pyarrow.ipc.RecordBatchFileWriter
pyarrow.ipc.RecordBatchStreamReader
pyarrow.ipc.RecordBatchStreamWriter
pyarrow.serialize
pyarrow.serialize_to
pyarrow.deserialize
pyarrow.deserialize_components
pyarrow.deserialize_from
pyarrow.read_serialized
pyarrow.SerializedPyObject
pyarrow.SerializationContext
Arrow Flight
pyarrow.flight.Action
pyarrow.flight.ActionType
pyarrow.flight.DescriptorType
pyarrow.flight.FlightDescriptor
pyarrow.flight.FlightEndpoint
pyarrow.flight.FlightInfo
pyarrow.flight.Location
pyarrow.flight.Ticket
pyarrow.flight.Result
pyarrow.flight.FlightCallOptions
pyarrow.flight.FlightClient
pyarrow.flight.ClientMiddlewareFactory
pyarrow.flight.ClientMiddleware
pyarrow.flight.FlightServerBase
pyarrow.flight.GeneratorStream
pyarrow.flight.RecordBatchStream
pyarrow.flight.ServerMiddlewareFactory
pyarrow.flight.ServerMiddleware
pyarrow.flight.ClientAuthHandler
pyarrow.flight.ServerAuthHandler
pyarrow.flight.FlightMethod
pyarrow.flight.CallInfo
Tabular File Formats
pyarrow.csv.ConvertOptions
pyarrow.csv.CSVStreamingReader
pyarrow.csv.CSVWriter
pyarrow.csv.ISO8601
pyarrow.csv.ParseOptions
pyarrow.csv.ReadOptions
pyarrow.csv.WriteOptions
pyarrow.csv.open_csv
pyarrow.csv.read_csv
pyarrow.csv.write_csv
pyarrow.feather.read_feather
pyarrow.feather.read_table
pyarrow.feather.write_feather
pyarrow.json.ReadOptions
pyarrow.json.ParseOptions
pyarrow.json.read_json
pyarrow.parquet.ParquetDataset
pyarrow.parquet.ParquetFile
pyarrow.parquet.ParquetWriter
pyarrow.parquet.read_table
pyarrow.parquet.read_metadata
pyarrow.parquet.read_pandas
pyarrow.parquet.read_schema
pyarrow.parquet.write_metadata
pyarrow.parquet.write_table
pyarrow.parquet.write_to_dataset
pyarrow.orc.ORCFile
Filesystems
pyarrow.fs.FileInfo
pyarrow.fs.FileSelector
pyarrow.fs.FileSystem
pyarrow.fs.LocalFileSystem
pyarrow.fs.S3FileSystem
pyarrow.fs.HadoopFileSystem
pyarrow.fs.SubTreeFileSystem
pyarrow.fs.PyFileSystem
pyarrow.fs.FileSystemHandler
pyarrow.fs.FSSpecHandler
Dataset
pyarrow.dataset.dataset
pyarrow.dataset.parquet_dataset
pyarrow.dataset.partitioning
pyarrow.dataset.field
pyarrow.dataset.scalar
pyarrow.dataset.write_dataset
pyarrow.dataset.FileFormat
pyarrow.dataset.ParquetFileFormat
pyarrow.dataset.Partitioning
pyarrow.dataset.PartitioningFactory
pyarrow.dataset.DirectoryPartitioning
pyarrow.dataset.HivePartitioning
pyarrow.dataset.Dataset
pyarrow.dataset.FileSystemDataset
pyarrow.dataset.FileSystemFactoryOptions
pyarrow.dataset.FileSystemDatasetFactory
pyarrow.dataset.UnionDataset
pyarrow.dataset.Scanner
pyarrow.dataset.Expression
Plasma In-Memory Object Store
pyarrow.plasma.ObjectID
pyarrow.plasma.PlasmaClient
pyarrow.plasma.PlasmaBuffer
CUDA Integration
pyarrow.cuda.Context
pyarrow.cuda.CudaBuffer
pyarrow.cuda.new_host_buffer
pyarrow.cuda.HostBuffer
pyarrow.cuda.BufferReader
pyarrow.cuda.BufferWriter
pyarrow.cuda.serialize_record_batch
pyarrow.cuda.read_record_batch
pyarrow.cuda.read_message
pyarrow.cuda.IpcMemHandle
Miscellaneous
pyarrow.cpu_count
pyarrow.set_cpu_count
pyarrow.get_include
pyarrow.get_libraries
pyarrow.get_library_dirs
Getting Involved
Benchmarks
R
Ruby
Rust
Development
Contributing to Apache Arrow
C++ Development
Building Arrow C++
Development Guidelines
Developing on Windows
Conventions
Fuzzing Arrow C++
Python Development
Daily Development using Archery
Packaging and Testing with Crossbow
Running Docker Builds
Benchmarks
Building the Documentation
API Reference
Programming Support
General information
Error return and reporting
Memory (management)
Devices
Memory Managers
Buffers
Memory Pools
Allocation Functions
Slicing
Buffer Builders
STL Integration
Data Types
Factory functions
Concrete type subclasses
Primitive
Time-related
Binary-like
Nested
Dictionary-encoded
Fields and Schemas
Arrays
Concrete array subclasses
Non-nested
Nested
Chunked Arrays
Scalars
Factory functions
Concrete scalar subclasses
Array Builders
Concrete builder subclasses
Two-dimensional Datasets
Record Batches
Tables
C Interfaces
ABI Structures
C Data Interface
C Stream Interface
Compute Functions
Datum class
Abstract Function classes
Function registry
Convenience functions
Concrete options classes
Tensors
Dense Tensors
Sparse Tensors
Utilities
Decimal Numbers
Abstract Sequences
Compression
Input / output
Interfaces
Concrete implementations
In-memory streams
Local files
Buffering input / output wrappers
Compressed input / output wrappers
Arrow IPC
IPC options
Reading IPC streams and files
Blocking API
Event-driven API
Statistics
Writing IPC streams and files
Blocking API
Statistics
File Formats
CSV
Line-separated JSON
Parquet reader
Parquet writer
CUDA support
Contexts
Devices
Buffers
Memory Input / Output
IPC
Arrow Flight RPC
Common Types
Clients
Servers
Error Handling
Filesystems
Interface
High-level factory function
Concrete implementations
Dataset
Interface
Partitioning
Dataset discovery/factories
Scanning
Concrete implementations
File System Datasets
File Formats
Conversion of range of std::tuple-like to Table instances