Arrow Flight RPC¶
Note
Flight is currently unstable. APIs are subject to change, though we don’t expect drastic changes.
Note
Flight is currently only available when built from source appropriately.
Common Types¶
-
struct
arrow::flight
::
Action
¶ An action to perform with the DoAction RPC.
-
struct
arrow::flight
::
ActionType
¶ A type of action that can be performed with the DoAction RPC.
-
class
arrow::flight
::
AddCallHeaders
¶ A write-only wrapper around headers for an RPC call.
Public Functions
-
void
AddHeader
(const std::string &key, const std::string &value) = 0¶ Add a header to be sent to the client.
-
void
-
struct
arrow::flight
::
CallInfo
¶ Information about an instance of a Flight RPC.
Public Members
-
FlightMethod
method
¶ The RPC method of this call.
-
FlightMethod
-
struct
arrow::flight
::
Criteria
¶ Opaque selection criteria for ListFlights RPC.
Public Members
-
std::string
expression
¶ Opaque criteria expression, dependent on server implementation.
-
std::string
-
struct
arrow::flight
::
FlightDescriptor
¶ A request to retrieve or generate a dataset.
Public Functions
-
std::string
ToString
() const¶ Get a human-readable form of this descriptor.
Public Members
-
DescriptorType
type
¶ The descriptor type.
-
std::string
cmd
¶ Opaque value used to express a command.
Should only be defined when type is CMD
-
std::vector<std::string>
path
¶ List of strings identifying a particular dataset.
Should only be defined when type is PATH
Public Static Functions
-
Status
Deserialize
(const std::string &serialized, FlightDescriptor *out)¶ Parse the wire-format representation of this type.
Useful when interoperating with non-Flight systems (e.g. REST services) that may want to return Flight types.
-
std::string
-
struct
arrow::flight
::
FlightEndpoint
¶ A flight ticket and list of locations where the ticket can be redeemed.
-
class
arrow::flight
::
FlightInfo
¶ The access coordinates for retireval of a dataset, returned by GetFlightInfo.
Public Functions
Deserialize the Arrow schema of the dataset, to be passed to each call to DoGet.
Populate any dictionary encoded fields into a DictionaryMemo for bookkeeping
- Parameters
[inout] dictionary_memo
: for dictionary bookkeeping, will be modified[out] out
: the reconstructed Schema
-
const FlightDescriptor &
descriptor
() const¶ The descriptor associated with this flight, may not be set.
-
const std::vector<FlightEndpoint> &
endpoints
() const¶ A list of endpoints associated with the flight (dataset).
To consume the whole flight, all endpoints must be consumed
-
int64_t
total_records
() const¶ The total number of records (rows) in the dataset. If unknown, set to -1.
-
int64_t
total_bytes
() const¶ The total number of bytes in the dataset. If unknown, set to -1.
Public Static Functions
-
arrow::Result<FlightInfo>
Make
(const Schema &schema, const FlightDescriptor &descriptor, const std::vector<FlightEndpoint> &endpoints, int64_t total_records, int64_t total_bytes)¶ Factory method to construct a FlightInfo.
-
Status
Deserialize
(const std::string &serialized, std::unique_ptr<FlightInfo> *out)¶ Parse the wire-format representation of this type.
Useful when interoperating with non-Flight systems (e.g. REST services) that may want to return Flight types.
-
struct
Data
¶
-
struct
FlightPayload
¶ Staging data structure for messages about to be put on the wire.
This structure corresponds to FlightData in the protocol.
-
class
arrow::flight
::
FlightListing
¶ An iterator to FlightInfo instances returned by ListFlights.
Subclassed by arrow::flight::SimpleFlightListing
Public Functions
-
Status
Next
(std::unique_ptr<FlightInfo> *info) = 0¶ Retrieve the next FlightInfo from the iterator.
- Return
- Parameters
[out] info
: A single FlightInfo. Set to nullptr if there are none left.
-
Status
-
enum
arrow::flight
::
FlightMethod
¶ An enumeration of the RPC methods Flight implements.
Values:
-
enumerator
Invalid
¶
-
enumerator
Handshake
¶
-
enumerator
ListFlights
¶
-
enumerator
GetFlightInfo
¶
-
enumerator
GetSchema
¶
-
enumerator
DoGet
¶
-
enumerator
DoPut
¶
-
enumerator
DoAction
¶
-
enumerator
ListActions
¶
-
enumerator
DoExchange
¶
-
enumerator
-
struct
arrow::flight
::
Location
¶ A host location (a URI)
Public Functions
-
Location
()¶ Initialize a blank location.
-
std::string
ToString
() const¶ Get a representation of this URI as a string.
-
std::string
scheme
() const¶ Get the scheme of this URI.
Public Static Functions
-
Status
Parse
(const std::string &uri_string, Location *location)¶ Initialize a location by parsing a URI string.
-
Status
ForGrpcTcp
(const std::string &host, const int port, Location *location)¶ Initialize a location for a non-TLS, gRPC-based Flight service from a host and port.
- Parameters
[in] host
: The hostname to connect to[in] port
: The port[out] location
: The resulting location
-
-
class
arrow::flight
::
MetadataRecordBatchReader
¶ An interface to read Flight data with metadata.
Subclassed by arrow::flight::FlightMessageReader, arrow::flight::FlightStreamReader
-
struct
Result
¶ Opaque result returned after executing an action.
-
class
arrow::flight
::
ResultStream
¶ An iterator to Result instances returned by DoAction.
Subclassed by arrow::flight::SimpleResultStream, arrow::py::flight::PyFlightResultStream
-
struct
arrow::flight
::
Ticket
¶ Data structure providing an opaque identifier or credential to use when requesting a data stream with the DoGet RPC.
Public Functions
Clients¶
-
class
arrow::flight
::
FlightClient
¶ Client class for Arrow Flight RPC services (gRPC-based).
API experimental for now
Public Functions
-
Status
Authenticate
(const FlightCallOptions &options, std::unique_ptr<ClientAuthHandler> auth_handler)¶ Authenticate to the server using the given handler.
- Return
Status OK if the client authenticated successfully
- Parameters
[in] options
: Per-RPC options[in] auth_handler
: The authentication mechanism to use
-
Status
DoAction
(const FlightCallOptions &options, const Action &action, std::unique_ptr<ResultStream> *results)¶ Perform the indicated action, returning an iterator to the stream of results, if any.
- Return
- Parameters
[in] options
: Per-RPC options[in] action
: the action to be performed[out] results
: an iterator object for reading the returned results
-
Status
ListActions
(const FlightCallOptions &options, std::vector<ActionType> *actions)¶ Retrieve a list of available Action types.
- Return
- Parameters
[in] options
: Per-RPC options[out] actions
: the available actions
-
Status
GetFlightInfo
(const FlightCallOptions &options, const FlightDescriptor &descriptor, std::unique_ptr<FlightInfo> *info)¶ Request access plan for a single flight, which may be an existing dataset or a command to be executed.
- Return
- Parameters
[in] options
: Per-RPC options[in] descriptor
: the dataset request, whether a named dataset or command[out] info
: the FlightInfo describing where to access the dataset
-
Status
GetSchema
(const FlightCallOptions &options, const FlightDescriptor &descriptor, std::unique_ptr<SchemaResult> *schema_result)¶ Request schema for a single flight, which may be an existing dataset or a command to be executed.
- Return
- Parameters
[in] options
: Per-RPC options[in] descriptor
: the dataset request, whether a named dataset or command[out] schema_result
: the SchemaResult describing the dataset schema
-
Status
ListFlights
(std::unique_ptr<FlightListing> *listing)¶ List all available flights known to the server.
- Return
- Parameters
[out] listing
: an iterator that returns a FlightInfo for each flight
-
Status
ListFlights
(const FlightCallOptions &options, const Criteria &criteria, std::unique_ptr<FlightListing> *listing)¶ List available flights given indicated filter criteria.
- Return
- Parameters
[in] options
: Per-RPC options[in] criteria
: the filter criteria (opaque)[out] listing
: an iterator that returns a FlightInfo for each flight
-
Status
DoGet
(const FlightCallOptions &options, const Ticket &ticket, std::unique_ptr<FlightStreamReader> *stream)¶ Given a flight ticket and schema, request to be sent the stream.
Returns record batch stream reader
- Return
- Parameters
[in] options
: Per-RPC options[in] ticket
: The flight ticket to use[out] stream
: the returned RecordBatchReader
Upload data to a Flight described by the given descriptor.
The caller must call Close() on the returned stream once they are done writing.
The reader and writer are linked; closing the writer will also close the reader. Use DoneWriting to only close the write side of the channel.
- Return
- Parameters
[in] options
: Per-RPC options[in] descriptor
: the descriptor of the stream[in] schema
: the schema for the data to upload[out] stream
: a writer to write record batches to[out] reader
: a reader for application metadata from the server
Public Static Functions
-
Status
Connect
(const Location &location, std::unique_ptr<FlightClient> *client)¶ Connect to an unauthenticated flight service.
- Return
Status OK status may not indicate that the connection was successful
- Parameters
[in] location
: the URI[out] client
: the created FlightClient
-
Status
Connect
(const Location &location, const FlightClientOptions &options, std::unique_ptr<FlightClient> *client)¶ Connect to an unauthenticated flight service.
- Return
Status OK status may not indicate that the connection was successful
- Parameters
[in] location
: the URI[in] options
: Other options for setting up the client[out] client
: the created FlightClient
-
Status
-
class
arrow::flight
::
FlightClientOptions
¶ Public Members
-
std::string
tls_root_certs
¶ Root certificates to use for validating server certificates.
-
std::string
override_hostname
¶ Override the hostname checked by TLS. Use with caution.
-
std::string
cert_chain
¶ The client certificate to use if using Mutual TLS.
-
std::string
private_key
¶ The private key associated with the client certificate for Mutual TLS.
-
std::vector<std::shared_ptr<ClientMiddlewareFactory>>
middleware
¶ A list of client middleware to apply.
-
int64_t
write_size_limit_bytes
¶ A soft limit on the number of bytes to write in a single batch when sending Arrow data to a server.
Used to help limit server memory consumption. Only enabled if positive. When enabled, FlightStreamWriter.Write* may yield a IOError with error detail FlightWriteSizeStatusDetail.
-
std::vector<std::pair<std::string, util::variant<int, std::string>>>
generic_options
¶ Generic connection options, passed to the underlying transport; interpretation is implementation-dependent.
-
bool
disable_server_verification
¶ Use TLS without validating the server certificate. Use with caution.
Public Static Functions
-
FlightClientOptions
Defaults
()¶ Get default options.
-
std::string
-
class
arrow::flight
::
FlightCallOptions
¶ Hints to the underlying RPC layer for Arrow Flight calls.
Public Functions
-
FlightCallOptions
()¶ Create a default set of call options.
Public Members
-
TimeoutDuration
timeout
¶ An optional timeout for this call.
Negative durations mean an implementation-defined default behavior will be used instead. This is the default value.
-
ipc::IpcReadOptions
read_options
¶ IPC reader options, if applicable for the call.
-
ipc::IpcWriteOptions
write_options
¶ IPC writer options, if applicable for the call.
-
-
class
arrow::flight
::
ClientAuthHandler
¶ An authentication implementation for a Flight service.
Authentication includes both an initial negotiation and a per-call token validation. Implementations may choose to use either or both mechanisms.
Subclassed by arrow::py::flight::PyClientAuthHandler
-
class
arrow::flight
::
ClientMiddleware
¶ Client-side middleware for a call, instantiated per RPC.
Middleware should be fast and must be infallible: there is no way to reject the call or report errors from the middleware instance.
Subclassed by arrow::py::flight::PyClientMiddleware
Public Functions
-
void
SendingHeaders
(AddCallHeaders *outgoing_headers) = 0¶ A callback before headers are sent.
Extra headers can be added, but existing ones cannot be read.
-
void
ReceivedHeaders
(const CallHeaders &incoming_headers) = 0¶ A callback when headers are received from the server.
-
void
-
class
arrow::flight
::
ClientMiddlewareFactory
¶ A factory for new middleware instances.
If added to a client, this will be called for each RPC (including Handshake) to give the opportunity to intercept the call.
It is guaranteed that all client middleware methods are called from the same thread that calls the RPC method implementation.
Subclassed by arrow::py::flight::PyClientMiddlewareFactory
Public Functions
-
void
StartCall
(const CallInfo &info, std::unique_ptr<ClientMiddleware> *middleware) = 0¶ A callback for the start of a new call.
- Parameters
info
: Information about the call.[out] middleware
: The middleware instance for this call. If unset, will not add middleware to this call instance from this factory.
-
void
-
typedef std::chrono::duration<double, std::chrono::seconds::period>
arrow::flight
::
TimeoutDuration
¶ A duration type for Flight call timeouts.
-
class
arrow::flight
::
FlightStreamReader
: public arrow::flight::MetadataRecordBatchReader¶ A RecordBatchReader exposing Flight metadata and cancel operations.
Public Functions
-
void
Cancel
() = 0¶ Try to cancel the call.
-
void
-
class
arrow::flight
::
FlightStreamWriter
: public arrow::flight::MetadataRecordBatchWriter¶ A RecordBatchWriter that also allows sending application-defined metadata via the Flight protocol.
Servers¶
-
class
arrow::flight
::
FlightServerBase
¶ Skeleton RPC server implementation which can be used to create custom servers by implementing its abstract methods.
Subclassed by arrow::py::flight::PyFlightServer
Public Functions
-
Status
Init
(const FlightServerOptions &options)¶ Initialize a Flight server listening at the given location.
This method must be called before any other method.
- Parameters
[in] options
: The configuration for this server.
-
int
port
() const¶ Get the port that the Flight server is listening on.
This method must only be called after Init(). Will return a non-positive value if no port exists (e.g. when listening on a domain socket).
-
Status
SetShutdownOnSignals
(const std::vector<int> sigs)¶ Set the server to stop when receiving any of the given signal numbers.
This method must be called before Serve().
-
Status
Serve
()¶ Start serving.
This method blocks until either Shutdown() is called or one of the signals registered in SetShutdownOnSignals() is received.
-
int
GotSignal
() const¶ Query whether Serve() was interrupted by a signal.
This method must be called after Serve() has returned.
- Return
int the signal number that interrupted Serve(), if any, otherwise 0
-
Status
Shutdown
()¶ Shut down the server.
Can be called from signal handler or another thread while Serve() blocks.
TODO(wesm): Shutdown with deadline
-
Status
ListFlights
(const ServerCallContext &context, const Criteria *criteria, std::unique_ptr<FlightListing> *listings)¶ Retrieve a list of available fields given an optional opaque criteria.
- Return
- Parameters
[in] context
: The call context.[in] criteria
: may be null[out] listings
: the returned listings iterator
-
Status
GetFlightInfo
(const ServerCallContext &context, const FlightDescriptor &request, std::unique_ptr<FlightInfo> *info)¶ Retrieve the schema and an access plan for the indicated descriptor.
- Return
- Parameters
[in] context
: The call context.[in] request
: may be null[out] info
: the returned flight info provider
-
Status
GetSchema
(const ServerCallContext &context, const FlightDescriptor &request, std::unique_ptr<SchemaResult> *schema)¶ Retrieve the schema for the indicated descriptor.
- Return
- Parameters
[in] context
: The call context.[in] request
: may be null[out] schema
: the returned flight schema provider
-
Status
DoGet
(const ServerCallContext &context, const Ticket &request, std::unique_ptr<FlightDataStream> *stream)¶ Get a stream of IPC payloads to put on the wire.
- Return
- Parameters
[in] context
: The call context.[in] request
: an opaque ticket[out] stream
: the returned stream provider
-
Status
DoPut
(const ServerCallContext &context, std::unique_ptr<FlightMessageReader> reader, std::unique_ptr<FlightMetadataWriter> writer)¶ Process a stream of IPC payloads sent from a client.
- Return
- Parameters
[in] context
: The call context.[in] reader
: a sequence of uploaded record batches[in] writer
: send metadata back to the client
-
Status
DoExchange
(const ServerCallContext &context, std::unique_ptr<FlightMessageReader> reader, std::unique_ptr<FlightMessageWriter> writer)¶ Process a bidirectional stream of IPC payloads.
- Return
- Parameters
[in] context
: The call context.[in] reader
: a sequence of uploaded record batches[in] writer
: send data back to the client
-
Status
DoAction
(const ServerCallContext &context, const Action &action, std::unique_ptr<ResultStream> *result)¶ Execute an action, return stream of zero or more results.
- Return
- Parameters
[in] context
: The call context.[in] action
: the action to execute, with type and body[out] result
: the result iterator
-
Status
ListActions
(const ServerCallContext &context, std::vector<ActionType> *actions)¶ Retrieve the list of available actions.
- Return
- Parameters
[in] context
: The call context.[out] actions
: a vector of available action types
-
Status
-
class
arrow::flight
::
FlightServerOptions
¶ Public Members
-
Location
location
¶ The host & port (or domain socket path) to listen on.
Use port 0 to bind to an available port.
-
std::shared_ptr<ServerAuthHandler>
auth_handler
¶ The authentication handler to use.
-
std::vector<CertKeyPair>
tls_certificates
¶ A list of TLS certificate+key pairs to use.
-
bool
verify_client
¶ Enable mTLS and require that the client present a certificate.
-
std::string
root_certificates
¶ If using mTLS, the PEM-encoded root certificate to use.
-
std::vector<std::pair<std::string, std::shared_ptr<ServerMiddlewareFactory>>>
middleware
¶ A list of server middleware to apply, along with a key to identify them by.
Middleware are always applied in the order provided. Duplicate keys are an error.
-
std::function<void(void*)>
builder_hook
¶ A Flight implementation-specific callback to customize transport-specific options.
Not guaranteed to be called. The type of the parameter is specific to the Flight implementation. Users should take care to link to the same transport implementation as Flight to avoid runtime problems.
-
Location
-
struct
arrow::flight
::
CertKeyPair
¶ A TLS certificate plus key.
-
class
arrow::flight
::
FlightDataStream
¶ Interface that produces a sequence of IPC payloads to be sent in FlightData protobuf messages.
Subclassed by arrow::flight::RecordBatchStream, arrow::py::flight::PyFlightDataStream, arrow::py::flight::PyGeneratorFlightDataStream
Public Functions
-
Status
GetSchemaPayload
(FlightPayload *payload) = 0¶ Compute FlightPayload containing serialized RecordBatch schema.
-
Status
-
class
arrow::flight
::
FlightMessageReader
: public arrow::flight::MetadataRecordBatchReader¶ A reader for IPC payloads uploaded by a client.
Also allows reading application-defined metadata via the Flight protocol.
Public Functions
-
const FlightDescriptor &
descriptor
() const = 0¶ Get the descriptor for this upload.
-
const FlightDescriptor &
-
class
arrow::flight
::
FlightMetadataWriter
¶ A writer for application-specific metadata sent back to the client during an upload.
-
class
arrow::flight
::
RecordBatchStream
: public arrow::flight::FlightDataStream¶ A basic implementation of FlightDataStream that will provide a sequence of FlightData messages to be written to a gRPC stream.
Public Functions
- Parameters
[in] reader
: produces a sequence of record batches[in] options
: IPC options for writing
-
Status
GetSchemaPayload
(FlightPayload *payload) override¶ Compute FlightPayload containing serialized RecordBatch schema.
-
class
arrow::flight
::
ServerAuthHandler
¶ An authentication implementation for a Flight service.
Authentication includes both an initial negotiation and a per-call token validation. Implementations may choose to use either or both mechanisms. An implementation may need to track some state, e.g. a mapping of client tokens to authenticated identities.
Subclassed by arrow::flight::NoOpAuthHandler, arrow::py::flight::PyServerAuthHandler
Public Functions
-
Status
Authenticate
(ServerAuthSender *outgoing, ServerAuthReader *incoming) = 0¶ Authenticate the client on initial connection.
The server can send and read responses from the client at any time.
-
Status
IsValid
(const std::string &token, std::string *peer_identity) = 0¶ Validate a per-call client token.
- Return
Status OK if the token is valid, any other status if validation failed
- Parameters
[in] token
: The client token. May be the empty string if the client does not provide a token.[out] peer_identity
: The identity of the peer, if this authentication method supports it.
-
Status
-
class
arrow::flight
::
ServerCallContext
¶ Call state/contextual data.
Public Functions
-
const std::string &
peer_identity
() const = 0¶ The name of the authenticated peer (may be the empty string)
-
const std::string &
peer
() const = 0¶ The peer address (not validated)
-
ServerMiddleware *
GetMiddleware
(const std::string &key) const = 0¶ Look up a middleware by key.
Do not maintain a reference to the object beyond the request body.
- Return
The middleware, or nullptr if not found.
-
const std::string &
-
class
arrow::flight
::
ServerMiddleware
¶ Server-side middleware for a call, instantiated per RPC.
Middleware should be fast and must be infallible: there is no way to reject the call or report errors from the middleware instance.
Subclassed by arrow::py::flight::PyServerMiddleware
Public Functions
-
std::string
name
() const = 0¶ Unique name of middleware, used as alternative to RTTI.
- Return
the string name of the middleware
-
void
SendingHeaders
(AddCallHeaders *outgoing_headers) = 0¶ A callback before headers are sent.
Extra headers can be added, but existing ones cannot be read.
-
std::string
-
class
arrow::flight
::
ServerMiddlewareFactory
¶ A factory for new middleware instances.
If added to a server, this will be called for each RPC (including Handshake) to give the opportunity to intercept the call.
It is guaranteed that all server middleware methods are called from the same thread that calls the RPC method implementation.
Subclassed by arrow::py::flight::PyServerMiddlewareFactory
Public Functions
A callback for the start of a new call.
Return a non-OK status to reject the call with the given status.
- Return
Status A non-OK status will reject the call with the given status. Middleware previously in the chain will have their CallCompleted callback called. Other middleware factories will not be called.
- Parameters
info
: Information about the call.incoming_headers
: Headers sent by the client for this call. Do not retain a reference to this object.[out] middleware
: The middleware instance for this call. If null, no middleware will be added to this call instance from this factory.
-
class
arrow::flight
::
SimpleFlightListing
: public arrow::flight::FlightListing¶ A FlightListing implementation based on a vector of FlightInfo objects.
This can be iterated once, then it is consumed.
Public Functions
-
Status
Next
(std::unique_ptr<FlightInfo> *info) override¶ Retrieve the next FlightInfo from the iterator.
- Return
- Parameters
[out] info
: A single FlightInfo. Set to nullptr if there are none left.
-
Status
-
class
arrow::flight
::
SimpleResultStream
: public arrow::flight::ResultStream¶ A ResultStream implementation based on a vector of Result objects.
This can be iterated once, then it is consumed.
Error Handling¶
Error handling uses the normal arrow::Status
class, combined
with a custom arrow::StatusDetail
object for Flight-specific
error codes.
-
enum
arrow::flight
::
FlightStatusCode
¶ A Flight-specific status code.
Values:
-
enumerator
Internal
¶ An implementation error has occurred.
-
enumerator
TimedOut
¶ A request timed out.
-
enumerator
Cancelled
¶ A request was cancelled.
-
enumerator
Unauthenticated
¶ We are not authenticated to the remote service.
We do not have permission to make this request.
The remote service cannot handle this request at the moment.
-
enumerator
Failed
¶ A request failed for some other reason.
-
enumerator
-
class
arrow::flight
::
FlightStatusDetail
: public arrow::StatusDetail¶ Flight-specific error information in a Status.
Public Functions
-
const char *
type_id
() const override¶ Return a unique id for the type of the StatusDetail (effectively a poor man’s substitute for RTTI).
-
std::string
ToString
() const override¶ Produce a human-readable description of this status.
-
FlightStatusCode
code
() const¶ Get the Flight status code.
-
std::string
extra_info
() const¶ Get the extra error info.
-
std::string
CodeAsString
() const¶ Get the human-readable name of the status code.
-
void
set_extra_info
(std::string extra_info)¶ Set the extra error info.
Public Static Functions
-
std::shared_ptr<FlightStatusDetail>
UnwrapStatus
(const arrow::Status &status)¶ Try to extract a FlightStatusDetail from any Arrow status.
- Return
a FlightStatusDetail if it could be unwrapped, nullptr otherwise
-
const char *
Warning
doxygenfunction: Unable to resolve multiple matches for function “arrow::flight::MakeFlightError” with arguments () in doxygen xml output for project “arrow_cpp” from directory: ../../cpp/apidoc/xml. Potential matches:
- Status MakeFlightError(FlightStatusCode code, const std::string &message)
- Status MakeFlightError(FlightStatusCode code, const std::string &message, const std::string &extra_info)