pyarrow.plasma.PlasmaClient

class pyarrow.plasma.PlasmaClient

Bases: _Weakrefable

The PlasmaClient is used to interface with a plasma store and manager.

The PlasmaClient can ask the PlasmaStore to allocate a new buffer, seal a buffer, and get a buffer. Buffers are referred to by object IDs, which are strings.

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

contains(self, ObjectID object_id)

Check if the object is present and sealed in the PlasmaStore.

create(self, ObjectID object_id, ...)

Create a new buffer in the PlasmaStore for a particular object ID.

create_and_seal(self, ObjectID object_id, ...)

Store a new object in the PlasmaStore for a particular object ID.

debug_string(self)

decode_notifications(self, const uint8_t *buf)

Get the notification from the buffer.

delete(self, object_ids)

Delete the objects with the given IDs from the object store.

disconnect(self)

Disconnect this client from the Plasma store.

evict(self, int64_t num_bytes)

Evict objects until some bytes are recovered.

get(self, object_ids, int timeout_ms=-1[, ...])

Get one or more Python values from the object store.

get_buffers(self, object_ids[, timeout_ms, ...])

Returns data buffer from the PlasmaStore based on object ID.

get_metadata(self, object_ids[, timeout_ms])

Returns metadata buffer from the PlasmaStore based on object ID.

get_next_notification(self)

Get the next notification from the notification socket.

get_notification_socket(self)

Get the notification socket.

hash(self, ObjectID object_id)

Compute the checksum of an object in the object store.

list(self)

Experimental: List the objects in the store.

put(self, value, ObjectID object_id=None, ...)

Store a Python value into the object store.

put_raw_buffer(self, value, ...)

Store a Python buffer into the object store.

seal(self, ObjectID object_id)

Seal the buffer in the PlasmaStore for a particular object ID.

set_client_options(self, client_name, ...)

store_capacity(self)

Get the memory capacity of the store.

subscribe(self)

Subscribe to notifications about sealed objects.

to_capsule(self)

Attributes

store_socket_name

contains(self, ObjectID object_id)

Check if the object is present and sealed in the PlasmaStore.

Parameters:
object_id : ObjectID

A string used to identify an object.

create(self, ObjectID object_id, int64_t data_size, string metadata=b'')

Create a new buffer in the PlasmaStore for a particular object ID.

The returned buffer is mutable until seal() is called.

Parameters:
object_id : ObjectID

The object ID used to identify an object.

data_size : int

The size in bytes of the created buffer.

metadata : bytes

An optional string of bytes encoding whatever metadata the user wishes to encode.

Returns:
buffer : Buffer

A mutable buffer into which the object data can be written.

Raises:
PlasmaObjectExists

This exception is raised if the object could not be created because there already is an object with the same ID in the plasma store.

PlasmaStoreFull

This exception is raised if the object could not be created because the plasma store is unable to evict enough objects to create room for it.
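The create, write, seal sequence described above can be sketched as a small helper. This is a minimal sketch, not part of pyarrow: write_object is a hypothetical name, and client is assumed to be a connected PlasmaClient (or any object exposing the same create() and seal() methods) from a pyarrow release that still ships pyarrow.plasma. Casting the memoryview to unsigned bytes is an assumption made so that a bytes payload can be copied in regardless of the item format the buffer exports.

```python
def write_object(client, object_id, payload, metadata=b""):
    """Create a buffer for object_id, copy payload into it, then seal it.

    Hypothetical helper: `client` only needs create() and seal() with the
    signatures documented above.
    """
    buf = client.create(object_id, len(payload), metadata)
    # View the mutable buffer as unsigned bytes so the payload can be
    # copied with a single slice assignment.
    view = memoryview(buf).cast("B")
    view[:len(payload)] = payload
    # Sealing makes the object immutable; it can then be read through get.
    client.seal(object_id)
```

If an object with the same ID already exists, create() raises PlasmaObjectExists before anything is written, so the helper leaves the existing object untouched.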

create_and_seal(self, ObjectID object_id, string data, string metadata=b'')

Store a new object in the PlasmaStore for a particular object ID.

Parameters:
object_id : ObjectID

The object ID used to identify an object.

data : bytes

The object to store.

metadata : bytes

An optional string of bytes encoding whatever metadata the user wishes to encode.

Raises:
PlasmaObjectExists

This exception is raised if the object could not be created because there already is an object with the same ID in the plasma store.

PlasmaStoreFull

This exception is raised if the object could not be created because the plasma store is unable to evict enough objects to create room for it.

debug_string(self)

decode_notifications(self, const uint8_t *buf)

Get the notification from the buffer.

Returns:
[ObjectID]

The list of object IDs in the notification message.

c_vector[int64_t]

The data sizes of the objects in the notification message.

c_vector[int64_t]

The metadata sizes of the objects in the notification message.

delete(self, object_ids)

Delete the objects with the given IDs from the object store.

Parameters:
object_ids : list

A list of strings used to identify the objects.

disconnect(self)

Disconnect this client from the Plasma store.

evict(self, int64_t num_bytes)

Evict objects until some bytes are recovered.

Recover at least num_bytes bytes if possible.

Parameters:
num_bytes : int

The number of bytes to attempt to recover.

get(self, object_ids, int timeout_ms=-1, serialization_context=None)

Get one or more Python values from the object store.

Parameters:
object_ids : list or ObjectID

Object ID or list of object IDs associated with the values to get from the store.

timeout_ms : int, default -1

The number of milliseconds that the get call should block before timing out and returning. Pass -1 if the call should block and 0 if the call should return immediately.

serialization_context : pyarrow.SerializationContext, default None

Custom serialization and deserialization context.

Returns:
list or object

Python value or list of Python values for the data associated with the object_ids, with ObjectNotAvailable in place of any object that was not available.
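As a sketch of how timeout_ms and the ObjectNotAvailable placeholder fit together, the hypothetical helper below (not part of pyarrow) asks for a batch of IDs without blocking and keeps only the values that were present. It assumes client is a connected PlasmaClient and that not_available is the placeholder the client returns for missing objects (pyarrow.plasma.ObjectNotAvailable in a real session), which is compared by identity here.

```python
def get_available(client, object_ids, not_available, timeout_ms=0):
    """Return {object_id: value} for the IDs that were available.

    Hypothetical helper. timeout_ms=0 asks get() to return immediately
    instead of blocking until every object has been sealed.
    """
    ids = list(object_ids)
    values = client.get(ids, timeout_ms=timeout_ms)
    return {
        object_id: value
        for object_id, value in zip(ids, values)
        # Identity check: the placeholder is assumed to be one shared object.
        if value is not not_available
    }
```

Passing the placeholder in as an argument keeps the helper independent of how the client was imported; a stored value of None is kept, because only the placeholder itself is filtered out.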

get_buffers(self, object_ids, timeout_ms=-1, with_meta=False)

Returns data buffer from the PlasmaStore based on object ID.

If the object has not been sealed yet, this call will block. The retrieved buffer is immutable.

Parameters:
object_ids : list

A list of ObjectIDs used to identify some objects.

timeout_ms : int

The number of milliseconds that the get call should block before timing out and returning. Pass -1 if the call should block and 0 if the call should return immediately.

with_meta : bool

Whether to return each object's metadata along with its data buffer.

Returns:
list

If with_meta=False, this is a list of PlasmaBuffers for the data associated with the object_ids and None if the object was not available. If with_meta=True, this is a list of tuples of PlasmaBuffer and metadata bytes.

get_metadata(self, object_ids, timeout_ms=-1)

Returns metadata buffer from the PlasmaStore based on object ID.

If the object has not been sealed yet, this call will block. The retrieved buffer is immutable.

Parameters:
object_ids : list

A list of ObjectIDs used to identify some objects.

timeout_ms : int

The number of milliseconds that the get call should block before timing out and returning. Pass -1 if the call should block and 0 if the call should return immediately.

Returns:
list

List of PlasmaBuffers for the metadata associated with the object_ids and None if the object was not available.

get_next_notification(self)

Get the next notification from the notification socket.

Returns:
ObjectID

The object ID of the object that was stored.

int

The data size of the object that was stored.

int

The metadata size of the object that was stored.
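The three return values above can be consumed in a simple loop once the client has subscribed. The helper below is a minimal sketch under stated assumptions: collect_notifications is a hypothetical name, client is assumed to be a connected PlasmaClient, and each get_next_notification() call is assumed to wait until a notification is available.

```python
def collect_notifications(client, count):
    """Subscribe, then read `count` notifications from the client.

    Hypothetical helper: returns a list of
    (object_id, data_size, metadata_size) tuples in arrival order.
    """
    # Notifications are only delivered after subscribe() has been called.
    client.subscribe()
    notifications = []
    for _ in range(count):
        object_id, data_size, metadata_size = client.get_next_notification()
        notifications.append((object_id, data_size, metadata_size))
    return notifications
```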

get_notification_socket(self)

Get the notification socket.

hash(self, ObjectID object_id)

Compute the checksum of an object in the object store.

Parameters:
object_id : ObjectID

A string used to identify an object.

Returns:
bytes

A digest string of the object’s hash. If the object isn’t in the object store, the string will have length zero.
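Because a missing object yields a zero-length digest rather than an exception, callers comparing digests need to treat the empty case separately. The hypothetical helper below (digests_match is not a pyarrow function) sketches one way to do that, assuming client is a connected PlasmaClient.

```python
def digests_match(client, first_id, second_id):
    """Return True only if both objects exist and their digests are equal.

    Hypothetical helper built on the documented hash() behaviour.
    """
    first = client.hash(first_id)
    second = client.hash(second_id)
    # A zero-length digest means the object is not in the store, so two
    # missing objects must not be reported as matching.
    if len(first) == 0 or len(second) == 0:
        return False
    return first == second
```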

list(self)

Experimental: List the objects in the store.

Returns:
dict

Dictionary from ObjectIDs to an “info” dictionary describing the object. The “info” dictionary has the following entries:

data_size

size of the object in bytes

metadata_size

size of the object metadata in bytes

ref_count

Number of clients referencing the object buffer

create_time

Unix timestamp of the creation of the object

construct_duration

Time the creation of the object took in seconds

state

“created” if the object is still being created and “sealed” if it is already sealed
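The entries above are enough to compute simple store statistics. The sketch below works on the returned dictionary only, so it needs no live connection; summarize_store is a hypothetical name, and the keys it reads are the ones listed above.

```python
def summarize_store(listing):
    """Summarize the dictionary returned by list().

    Hypothetical helper: counts all objects, counts the sealed ones, and
    totals the data and metadata bytes held by sealed objects.
    """
    sealed = [info for info in listing.values() if info["state"] == "sealed"]
    return {
        "objects": len(listing),
        "sealed": len(sealed),
        "sealed_bytes": sum(
            info["data_size"] + info["metadata_size"] for info in sealed
        ),
    }
```

With a connected client this would be called as summarize_store(client.list()).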

put(self, value, ObjectID object_id=None, int memcopy_threads=6, serialization_context=None)

Store a Python value into the object store.

Parameters:
value : object

A Python object to store.

object_id : ObjectID, default None

If this is provided, the specified object ID will be used to refer to the object.

memcopy_threads : int, default 6

The number of threads to use to write the serialized object into the object store for large objects.

serialization_context : pyarrow.SerializationContext, default None

Custom serialization and deserialization context.

Returns:
ObjectID

The object ID associated to the Python object.

put_raw_buffer(self, value, ObjectID object_id=None, string metadata=b'', int memcopy_threads=6)

Store a Python buffer into the object store.

Parameters:
value : Python object that implements the buffer protocol

A Python buffer object to store.

object_id : ObjectID, default None

If this is provided, the specified object ID will be used to refer to the object.

metadata : bytes

An optional string of bytes encoding whatever metadata the user wishes to encode.

memcopy_threads : int, default 6

The number of threads to use to write the serialized object into the object store for large objects.

Returns:
ObjectID

The object ID associated to the Python buffer object.

seal(self, ObjectID object_id)

Seal the buffer in the PlasmaStore for a particular object ID.

Once a buffer has been sealed, the buffer is immutable and can only be accessed through get.

Parameters:
object_id : ObjectID

A string used to identify an object.

set_client_options(self, client_name, int64_t limit_output_memory)

store_capacity(self)

Get the memory capacity of the store.

Returns:
int

The memory capacity of the store in bytes.

store_socket_name

subscribe(self)

Subscribe to notifications about sealed objects.

to_capsule(self)