Java Implementation

This is the documentation for the Java API of Apache Arrow. For more details on the Arrow format and other language bindings, see the parent documentation.

  • ValueVector
    • Vector Life Cycle
    • Building ValueVector
    • Building ListVector
    • Slicing
  • VectorSchemaRoot
  • Reading/Writing IPC formats
    • Writing and Reading Streaming Format
    • Writing and Reading Random Access Files
  • Java Algorithms
    • Comparing Vector Elements
    • Vector Element Search
    • Vector Sorting
    • Other Algorithms
  • Reference (javadoc)

© Copyright 2016-2021 Apache Software Foundation.