Arrow Flight RPC

Note

Flight is currently unstable. APIs are subject to change, though we don’t expect drastic changes.

Note

Flight is currently only available when built from source appropriately.

Common Types

struct Action

An action to perform with the DoAction RPC.

Public Members

std::string type

The action type.

std::shared_ptr<Buffer> body

The action content as a Buffer.

struct ActionType

A type of action that can be performed with the DoAction RPC.

Public Members

std::string type

The name of the action.

std::string description

A human-readable description of the action.

struct Criteria

Opaque selection criteria for ListFlights RPC.

Public Members

std::string expression

Opaque criteria expression, dependent on server implementation.

struct FlightDescriptor

A request to retrieve or generate a dataset.

Public Functions

std::string ToString() const

Get a human-readable form of this descriptor.

Status SerializeToString(std::string *out) const

Get the wire-format representation of this type.

Useful when interoperating with non-Flight systems (e.g. REST services) that may want to return Flight types.

Public Members

DescriptorType type

The descriptor type.

std::string cmd

Opaque value used to express a command.

Should only be defined when type is CMD

std::vector<std::string> path

List of strings identifying a particular dataset.

Should only be defined when type is PATH

Public Static Functions

static Status Deserialize(const std::string &serialized, FlightDescriptor *out)

Parse the wire-format representation of this type.

Useful when interoperating with non-Flight systems (e.g. REST services) that may want to return Flight types.

struct FlightEndpoint

A flight ticket and list of locations where the ticket can be redeemed.

Public Members

Ticket ticket

Opaque ticket identify; use with DoGet RPC.

std::vector<Location> locations

List of locations where ticket can be redeemed.

If the list is empty, the ticket can only be redeemed on the current service where the ticket was generated

class FlightInfo

The access coordinates for retireval of a dataset, returned by GetFlightInfo.

Public Functions

Status GetSchema(ipc::DictionaryMemo *dictionary_memo, std::shared_ptr<Schema> *out) const

Deserialize the Arrow schema of the dataset, to be passed to each call to DoGet.

Populate any dictionary encoded fields into a DictionaryMemo for bookkeeping

Parameters
  • [inout] dictionary_memo: for dictionary bookkeeping, will be modified

  • [out] out: the reconstructed Schema

const FlightDescriptor &descriptor() const

The descriptor associated with this flight, may not be set.

const std::vector<FlightEndpoint> &endpoints() const

A list of endpoints associated with the flight (dataset).

To consume the whole flight, all endpoints must be consumed

int64_t total_records() const

The total number of records (rows) in the dataset. If unknown, set to -1.

int64_t total_bytes() const

The total number of bytes in the dataset. If unknown, set to -1.

Status SerializeToString(std::string *out) const

Get the wire-format representation of this type.

Useful when interoperating with non-Flight systems (e.g. REST services) that may want to return Flight types.

Public Static Functions

static Status Deserialize(const std::string &serialized, std::unique_ptr<FlightInfo> *out)

Parse the wire-format representation of this type.

Useful when interoperating with non-Flight systems (e.g. REST services) that may want to return Flight types.

struct FlightPayload

Staging data structure for messages about to be put on the wire.

This structure corresponds to FlightData in the protocol.

class FlightListing

An iterator to FlightInfo instances returned by ListFlights.

Subclassed by arrow::flight::SimpleFlightListing

Public Functions

virtual Status Next(std::unique_ptr<FlightInfo> *info) = 0

Retrieve the next FlightInfo from the iterator.

Return

Status

Parameters
  • [out] info: A single FlightInfo. Set to nullptr if there are none left.

struct Location

A host location (a URI)

Public Functions

Location()

Initialize a blank location.

std::string ToString() const

Get a representation of this URI as a string.

std::string scheme() const

Get the scheme of this URI.

Public Static Functions

static Status Parse(const std::string &uri_string, Location *location)

Initialize a location by parsing a URI string.

static Status ForGrpcTcp(const std::string &host, const int port, Location *location)

Initialize a location for a non-TLS, gRPC-based Flight service from a host and port.

Parameters
  • [in] host: The hostname to connect to

  • [in] port: The port

  • [out] location: The resulting location

static Status ForGrpcTls(const std::string &host, const int port, Location *location)

Initialize a location for a TLS-enabled, gRPC-based Flight service from a host and port.

Parameters
  • [in] host: The hostname to connect to

  • [in] port: The port

  • [out] location: The resulting location

static Status ForGrpcUnix(const std::string &path, Location *location)

Initialize a location for a domain socket-based Flight service.

Parameters
  • [in] path: The path to the domain socket

  • [out] location: The resulting location

class MetadataRecordBatchReader

An interface to read Flight data with metadata.

Subclassed by arrow::flight::FlightMessageReader, arrow::flight::FlightStreamReader

Public Functions

virtual std::shared_ptr<Schema> schema() const = 0

Get the schema for this stream.

virtual Status Next(FlightStreamChunk *next) = 0

Get the next message from Flight.

If the stream is finished, then the members of FlightStreamChunk will be nullptr.

virtual Status ReadAll(std::vector<std::shared_ptr<RecordBatch>> *batches)

Consume entire stream as a vector of record batches.

virtual Status ReadAll(std::shared_ptr<Table> *table)

Consume entire stream as a Table.

struct Result

Opaque result returned after executing an action.

class ResultStream

An iterator to Result instances returned by DoAction.

Subclassed by arrow::flight::SimpleResultStream, arrow::py::flight::PyFlightResultStream

Public Functions

virtual Status Next(std::unique_ptr<Result> *info) = 0

Retrieve the next Result from the iterator.

Return

Status

Parameters
  • [out] info: A single result. Set to nullptr if there are none left.

struct Ticket

Data structure providing an opaque identifier or credential to use when requesting a data stream with the DoGet RPC.

Public Functions

Status SerializeToString(std::string *out) const

Get the wire-format representation of this type.

Useful when interoperating with non-Flight systems (e.g. REST services) that may want to return Flight types.

Public Static Functions

static Status Deserialize(const std::string &serialized, Ticket *out)

Parse the wire-format representation of this type.

Useful when interoperating with non-Flight systems (e.g. REST services) that may want to return Flight types.

Clients

class FlightClient

Client class for Arrow Flight RPC services (gRPC-based).

API experimental for now

Public Functions

Status Authenticate(const FlightCallOptions &options, std::unique_ptr<ClientAuthHandler> auth_handler)

Authenticate to the server using the given handler.

Return

Status OK if the client authenticated successfully

Parameters
  • [in] options: Per-RPC options

  • [in] auth_handler: The authentication mechanism to use

Status DoAction(const FlightCallOptions &options, const Action &action, std::unique_ptr<ResultStream> *results)

Perform the indicated action, returning an iterator to the stream of results, if any.

Return

Status

Parameters
  • [in] options: Per-RPC options

  • [in] action: the action to be performed

  • [out] results: an iterator object for reading the returned results

Status ListActions(const FlightCallOptions &options, std::vector<ActionType> *actions)

Retrieve a list of available Action types.

Return

Status

Parameters
  • [in] options: Per-RPC options

  • [out] actions: the available actions

Status GetFlightInfo(const FlightCallOptions &options, const FlightDescriptor &descriptor, std::unique_ptr<FlightInfo> *info)

Request access plan for a single flight, which may be an existing dataset or a command to be executed.

Return

Status

Parameters
  • [in] options: Per-RPC options

  • [in] descriptor: the dataset request, whether a named dataset or command

  • [out] info: the FlightInfo describing where to access the dataset

Status GetSchema(const FlightCallOptions &options, const FlightDescriptor &descriptor, std::unique_ptr<SchemaResult> *schema_result)

Request schema for a single flight, which may be an existing dataset or a command to be executed.

Return

Status

Parameters
  • [in] options: Per-RPC options

  • [in] descriptor: the dataset request, whether a named dataset or command

  • [out] schema_result: the SchemaResult describing the dataset schema

Status ListFlights(std::unique_ptr<FlightListing> *listing)

List all available flights known to the server.

Return

Status

Parameters
  • [out] listing: an iterator that returns a FlightInfo for each flight

Status ListFlights(const FlightCallOptions &options, const Criteria &criteria, std::unique_ptr<FlightListing> *listing)

List available flights given indicated filter criteria.

Return

Status

Parameters
  • [in] options: Per-RPC options

  • [in] criteria: the filter criteria (opaque)

  • [out] listing: an iterator that returns a FlightInfo for each flight

Status DoGet(const FlightCallOptions &options, const Ticket &ticket, std::unique_ptr<FlightStreamReader> *stream)

Given a flight ticket and schema, request to be sent the stream.

Returns record batch stream reader

Return

Status

Parameters
  • [in] options: Per-RPC options

  • [in] ticket: The flight ticket to use

  • [out] stream: the returned RecordBatchReader

Status DoPut(const FlightCallOptions &options, const FlightDescriptor &descriptor, const std::shared_ptr<Schema> &schema, std::unique_ptr<FlightStreamWriter> *stream, std::unique_ptr<FlightMetadataReader> *reader)

Upload data to a Flight described by the given descriptor.

The caller must call Close() on the returned stream once they are done writing.

The reader and writer are linked; closing the writer will also close the reader. Use DoneWriting to only close the write side of the channel.

Return

Status

Parameters
  • [in] options: Per-RPC options

  • [in] descriptor: the descriptor of the stream

  • [in] schema: the schema for the data to upload

  • [out] stream: a writer to write record batches to

  • [out] reader: a reader for application metadata from the server

Public Static Functions

static Status Connect(const Location &location, std::unique_ptr<FlightClient> *client)

Connect to an unauthenticated flight service.

Return

Status OK status may not indicate that the connection was successful

Parameters
  • [in] location: the URI

  • [out] client: the created FlightClient

static Status Connect(const Location &location, const FlightClientOptions &options, std::unique_ptr<FlightClient> *client)

Connect to an unauthenticated flight service.

Return

Status OK status may not indicate that the connection was successful

Parameters
  • [in] location: the URI

  • [in] options: Other options for setting up the client

  • [out] client: the created FlightClient

class FlightClientOptions

Public Members

std::string tls_root_certs

Root certificates to use for validating server certificates.

std::string override_hostname

Override the hostname checked by TLS. Use with caution.

class FlightCallOptions

Hints to the underlying RPC layer for Arrow Flight calls.

Public Functions

FlightCallOptions()

Create a default set of call options.

Public Members

TimeoutDuration timeout

An optional timeout for this call.

Negative durations mean an implementation-defined default behavior will be used instead. This is the default value.

class ClientAuthHandler

An authentication implementation for a Flight service.

Authentication includes both an initial negotiation and a per-call token validation. Implementations may choose to use either or both mechanisms.

Subclassed by arrow::py::flight::PyClientAuthHandler

Public Functions

virtual Status Authenticate(ClientAuthSender *outgoing, ClientAuthReader *incoming) = 0

Authenticate the client on initial connection.

The client can send messages to/read responses from the server at any time.

Return

Status OK if authenticated successfully

virtual Status GetToken(std::string *token) = 0

Get a per-call token.

Parameters
  • [out] token: The token to send to the server.

typedef std::chrono::duration<double, std::chrono::seconds::period> arrow::flight::TimeoutDuration

A duration type for Flight call timeouts.

class FlightStreamReader : public arrow::flight::MetadataRecordBatchReader

A RecordBatchReader exposing Flight metadata and cancel operations.

Public Functions

virtual void Cancel() = 0

Try to cancel the call.

class FlightStreamWriter : public arrow::ipc::RecordBatchWriter

A RecordBatchWriter that also allows sending application-defined metadata via the Flight protocol.

Public Functions

virtual Status DoneWriting() = 0

Indicate that the application is done writing to this stream.

The application may not write to this stream after calling this. This differs from closing the stream because this writer may represent only one half of a readable and writable stream.

Servers

class FlightServerBase

Skeleton RPC server implementation which can be used to create custom servers by implementing its abstract methods.

Subclassed by arrow::py::flight::PyFlightServer

Public Functions

Status Init(FlightServerOptions &options)

Initialize a Flight server listening at the given location.

This method must be called before any other method.

Parameters
  • [in] options: The configuration for this server.

int port() const

Get the port that the Flight server is listening on.

This method must only be called after Init(). Will return a non-positive value if no port exists (e.g. when listening on a domain socket).

Status SetShutdownOnSignals(const std::vector<int> sigs)

Set the server to stop when receiving any of the given signal numbers.

This method must be called before Serve().

Status Serve()

Start serving.

This method blocks until either Shutdown() is called or one of the signals registered in SetShutdownOnSignals() is received.

int GotSignal() const

Query whether Serve() was interrupted by a signal.

This method must be called after Serve() has returned.

Return

int the signal number that interrupted Serve(), if any, otherwise 0

Status Shutdown()

Shut down the server.

Can be called from signal handler or another thread while Serve() blocks.

TODO(wesm): Shutdown with deadline

Status Wait()

Block until server is terminated with Shutdown.

virtual Status ListFlights(const ServerCallContext &context, const Criteria *criteria, std::unique_ptr<FlightListing> *listings)

Retrieve a list of available fields given an optional opaque criteria.

Return

Status

Parameters
  • [in] context: The call context.

  • [in] criteria: may be null

  • [out] listings: the returned listings iterator

virtual Status GetFlightInfo(const ServerCallContext &context, const FlightDescriptor &request, std::unique_ptr<FlightInfo> *info)

Retrieve the schema and an access plan for the indicated descriptor.

Return

Status

Parameters
  • [in] context: The call context.

  • [in] request: may be null

  • [out] info: the returned flight info provider

virtual Status GetSchema(const ServerCallContext &context, const FlightDescriptor &request, std::unique_ptr<SchemaResult> *schema)

Retrieve the schema for the indicated descriptor.

Return

Status

Parameters
  • [in] context: The call context.

  • [in] request: may be null

  • [out] schema: the returned flight schema provider

virtual Status DoGet(const ServerCallContext &context, const Ticket &request, std::unique_ptr<FlightDataStream> *stream)

Get a stream of IPC payloads to put on the wire.

Return

Status

Parameters
  • [in] context: The call context.

  • [in] request: an opaque ticket

  • [out] stream: the returned stream provider

virtual Status DoPut(const ServerCallContext &context, std::unique_ptr<FlightMessageReader> reader, std::unique_ptr<FlightMetadataWriter> writer)

Process a stream of IPC payloads sent from a client.

Return

Status

Parameters
  • [in] context: The call context.

  • [in] reader: a sequence of uploaded record batches

  • [in] writer: send metadata back to the client

virtual Status DoAction(const ServerCallContext &context, const Action &action, std::unique_ptr<ResultStream> *result)

Execute an action, return stream of zero or more results.

Return

Status

Parameters
  • [in] context: The call context.

  • [in] action: the action to execute, with type and body

  • [out] result: the result iterator

virtual Status ListActions(const ServerCallContext &context, std::vector<ActionType> *actions)

Retrieve the list of available actions.

Return

Status

Parameters
  • [in] context: The call context.

  • [out] actions: a vector of available action types

class FlightServerOptions

Public Members

Location location

The host & port (or domain socket path) to listen on.

Use port 0 to bind to an available port.

std::unique_ptr<ServerAuthHandler> auth_handler

The authentication handler to use.

std::vector<CertKeyPair> tls_certificates

A list of TLS certificate+key pairs to use.

std::function<void(void *)> builder_hook

A Flight implementation-specific callback to customize transport-specific options.

Not guaranteed to be called. The type of the parameter is specific to the Flight implementation. Users should take care to link to the same transport implementation as Flight to avoid runtime problems.

struct CertKeyPair

A TLS certificate plus key.

Public Members

std::string pem_cert

The certificate in PEM format.

std::string pem_key

The key in PEM format.

class FlightDataStream

Interface that produces a sequence of IPC payloads to be sent in FlightData protobuf messages.

Subclassed by arrow::flight::RecordBatchStream, arrow::py::flight::PyFlightDataStream, arrow::py::flight::PyGeneratorFlightDataStream

Public Functions

virtual Status GetSchemaPayload(FlightPayload *payload) = 0

Compute FlightPayload containing serialized RecordBatch schema.

class FlightMessageReader : public arrow::flight::MetadataRecordBatchReader

A reader for IPC payloads uploaded by a client.

Also allows reading application-defined metadata via the Flight protocol.

Public Functions

virtual const FlightDescriptor &descriptor() const = 0

Get the descriptor for this upload.

class FlightMetadataWriter

A writer for application-specific metadata sent back to the client during an upload.

Public Functions

virtual Status WriteMetadata(const Buffer &app_metadata) = 0

Send a message to the client.

class RecordBatchStream : public arrow::flight::FlightDataStream

A basic implementation of FlightDataStream that will provide a sequence of FlightData messages to be written to a gRPC stream.

Public Functions

RecordBatchStream(const std::shared_ptr<RecordBatchReader> &reader, MemoryPool *pool = default_memory_pool())

Parameters
  • [in] reader: produces a sequence of record batches

  • [inout] pool: a MemoryPool to use for allocations

Status GetSchemaPayload(FlightPayload *payload)

Compute FlightPayload containing serialized RecordBatch schema.

class ServerAuthHandler

An authentication implementation for a Flight service.

Authentication includes both an initial negotiation and a per-call token validation. Implementations may choose to use either or both mechanisms. An implementation may need to track some state, e.g. a mapping of client tokens to authenticated identities.

Subclassed by arrow::flight::NoOpAuthHandler, arrow::py::flight::PyServerAuthHandler

Public Functions

virtual Status Authenticate(ServerAuthSender *outgoing, ServerAuthReader *incoming) = 0

Authenticate the client on initial connection.

The server can send and read responses from the client at any time.

virtual Status IsValid(const std::string &token, std::string *peer_identity) = 0

Validate a per-call client token.

Return

Status OK if the token is valid, any other status if validation failed

Parameters
  • [in] token: The client token. May be the empty string if the client does not provide a token.

  • [out] peer_identity: The identity of the peer, if this authentication method supports it.

class ServerCallContext

Call state/contextual data.

Public Functions

virtual const std::string &peer_identity() const = 0

The name of the authenticated peer (may be the empty string)

class SimpleFlightListing : public arrow::flight::FlightListing

A FlightListing implementation based on a vector of FlightInfo objects.

This can be iterated once, then it is consumed.

Public Functions

Status Next(std::unique_ptr<FlightInfo> *info)

Retrieve the next FlightInfo from the iterator.

Return

Status

Parameters
  • [out] info: A single FlightInfo. Set to nullptr if there are none left.

class SimpleResultStream : public arrow::flight::ResultStream

A ResultStream implementation based on a vector of Result objects.

This can be iterated once, then it is consumed.

Public Functions

Status Next(std::unique_ptr<Result> *info)

Retrieve the next Result from the iterator.

Return

Status

Parameters
  • [out] info: A single result. Set to nullptr if there are none left.

Error Handling

Error handling uses the normal arrow::Status class, combined with a custom arrow::StatusDetail object for Flight-specific error codes.

enum arrow::flight::FlightStatusCode

A Flight-specific status code.

Values:

Internal

An implementation error has occurred.

TimedOut

A request timed out.

Cancelled

A request was cancelled.

Unauthenticated

We are not authenticated to the remote service.

Unauthorized

We do not have permission to make this request.

Unavailable

The remote service cannot handle this request at the moment.

class FlightStatusDetail : public arrow::StatusDetail

Flight-specific error information in a Status.

Public Functions

const char *type_id() const

Return a unique id for the type of the StatusDetail (effectively a poor man’s substitude for RTTI).

std::string ToString() const

Produce a human-readable description of this status.

FlightStatusCode code() const

Get the Flight status code.

std::string CodeAsString() const

Get the human-readable name of the status code.

Public Static Functions

static std::shared_ptr<FlightStatusDetail> UnwrapStatus(const arrow::Status &status)

Try to extract a FlightStatusDetail from any Arrow status.

Return

a FlightStatusDetail if it could be unwrapped, nullptr otherwise

Status arrow::flight::MakeFlightError(FlightStatusCode code, const std::string &message)

Make an appropriate Arrow status for the given Flight-specific status.

Parameters
  • code: The status code.

  • message: The message for the error.