pyarrow.plasma.PlasmaClient

class pyarrow.plasma.PlasmaClient

Bases: pyarrow.lib._Weakrefable

The PlasmaClient is used to interface with a plasma store and manager.

The PlasmaClient can ask the PlasmaStore to allocate a new buffer, seal a buffer, and get a buffer. Buffers are referred to by object IDs, which are strings.

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

contains(self, ObjectID object_id)

Check if the object is present and sealed in the PlasmaStore.

create(self, ObjectID object_id, ...)

Create a new buffer in the PlasmaStore for a particular object ID.

create_and_seal(self, ObjectID object_id, ...)

Store a new object in the PlasmaStore for a particular object ID.

debug_string(self)

decode_notifications(self, const uint8_t *buf)

Get the notification from the buffer.

delete(self, object_ids)

Delete the objects with the given IDs from the object store.

disconnect(self)

Disconnect this client from the Plasma store.

evict(self, int64_t num_bytes)

Evict some objects to recover some bytes.

get(self, object_ids, int timeout_ms=-1[, ...])

Get one or more Python values from the object store.

get_buffers(self, object_ids[, timeout_ms, ...])

Returns data buffer from the PlasmaStore based on object ID.

get_metadata(self, object_ids[, timeout_ms])

Returns metadata buffer from the PlasmaStore based on object ID.

get_next_notification(self)

Get the next notification from the notification socket.

get_notification_socket(self)

Get the notification socket.

hash(self, ObjectID object_id)

Compute the checksum of an object in the object store.

list(self)

Experimental: List the objects in the store.

put(self, value, ObjectID object_id=None, ...)

Store a Python value into the object store.

put_raw_buffer(self, value, ...)

Store Python buffer into the object store.

seal(self, ObjectID object_id)

Seal the buffer in the PlasmaStore for a particular object ID.

set_client_options(self, client_name, ...)

store_capacity(self)

Get the memory capacity of the store.

subscribe(self)

Subscribe to notifications about sealed objects.

to_capsule(self)

Attributes

store_socket_name

contains(self, ObjectID object_id)

Check if the object is present and sealed in the PlasmaStore.

Parameters
object_id : ObjectID

A string used to identify an object.

create(self, ObjectID object_id, int64_t data_size, string metadata=b'')

Create a new buffer in the PlasmaStore for a particular object ID.

The returned buffer is mutable until seal() is called.

Parameters
object_id : ObjectID

The object ID used to identify an object.

data_size : int

The size in bytes of the created buffer.

metadata : bytes

An optional string of bytes encoding whatever metadata the user wishes to encode.

Returns
buffer : Buffer

A mutable buffer into which the object data can be written.

Raises
PlasmaObjectExists

This exception is raised if the object could not be created because there already is an object with the same ID in the plasma store.

PlasmaStoreFull

This exception is raised if the object could not be created because the plasma store is unable to evict enough objects to create room for it.
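The create-then-seal sequence described above can be sketched as follows. This is a minimal illustration, not the library's own helper: `write_object` is a hypothetical name, and `client` is assumed to be an already connected PlasmaClient (for example one returned by `pyarrow.plasma.connect`).

```python
def write_object(client, object_id, payload):
    """Hypothetical helper: create a buffer, fill it, then seal it.

    `client` is assumed to be a connected PlasmaClient and `object_id`
    an ObjectID that is not yet present in the store.
    """
    # create() returns a mutable buffer of exactly len(payload) bytes.
    buf = client.create(object_id, len(payload))
    # The buffer exposes the Python buffer protocol; casting the view to
    # unsigned bytes lets us copy a bytes payload into it in place.
    view = memoryview(buf).cast("B")
    view[:] = payload
    # After seal() the object is immutable and can be read by other clients.
    client.seal(object_id)
```

Until `seal()` is called the buffer stays writable, so a caller that fails midway should not assume other clients can read the object.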

create_and_seal(self, ObjectID object_id, string data, string metadata=b'')

Store a new object in the PlasmaStore for a particular object ID.

Parameters
object_id : ObjectID

The object ID used to identify an object.

data : bytes

The object to store.

metadata : bytes

An optional string of bytes encoding whatever metadata the user wishes to encode.

Raises
PlasmaObjectExists

This exception is raised if the object could not be created because there already is an object with the same ID in the plasma store.

PlasmaStoreFull

This exception is raised if the object could not be created because the plasma store is unable to evict enough objects to create room for it.
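One way to avoid PlasmaObjectExists in simple cases is to check `contains()` before storing. The sketch below uses a hypothetical helper name, `store_if_absent`, and assumes `client` is a connected PlasmaClient. Because `contains()` only reports sealed objects and another client may create the same ID in between, the check narrows the window for the exception but does not remove it.

```python
def store_if_absent(client, object_id, data, metadata=b""):
    """Hypothetical helper: store `data` unless the ID is already sealed.

    Returns True if the object was stored by this call, False otherwise.
    """
    # contains() is True only for objects that are present and sealed.
    if client.contains(object_id):
        return False
    # create_and_seal() stores the bytes and seals them in one call.
    client.create_and_seal(object_id, data, metadata)
    return True
```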

debug_string(self)

decode_notifications(self, const uint8_t *buf)

Get the notification from the buffer.

Returns
[ObjectID]

The list of object IDs in the notification message.

c_vector[int64_t]

The data sizes of the objects in the notification message.

c_vector[int64_t]

The metadata sizes of the objects in the notification message.

delete(self, object_ids)

Delete the objects with the given IDs from the object store.

Parameters
object_ids : list

A list of strings used to identify the objects.

disconnect(self)

Disconnect this client from the Plasma store.

evict(self, int64_t num_bytes)

Evict some objects to recover some bytes.

Recover at least num_bytes bytes if possible.

Parameters
num_bytes : int

The number of bytes to attempt to recover.

get(self, object_ids, int timeout_ms=-1, serialization_context=None)

Get one or more Python values from the object store.

Parameters
object_ids : list or ObjectID

Object ID or list of object IDs associated with the values to get from the store.

timeout_ms : int, default -1

The number of milliseconds that the get call should block before timing out and returning. Pass -1 to block indefinitely, or 0 to return immediately.

serialization_context : pyarrow.SerializationContext, default None

Custom serialization and deserialization context.

Returns
list or object

A Python value or list of Python values for the data associated with the object_ids, with ObjectNotAvailable in place of any object that was not available.

get_buffers(self, object_ids, timeout_ms=-1, with_meta=False)

Returns data buffer from the PlasmaStore based on object ID.

If the object has not been sealed yet, this call will block. The retrieved buffer is immutable.

Parameters
object_ids : list

A list of ObjectIDs used to identify some objects.

timeout_ms : int

The number of milliseconds that the get call should block before timing out and returning. Pass -1 to block indefinitely, or 0 to return immediately.

with_meta : bool, default False

If True, also return the metadata bytes for each object.

Returns
list

If with_meta=False, this is a list of PlasmaBuffers for the data associated with the object_ids and None if the object was not available. If with_meta=True, this is a list of tuples of PlasmaBuffer and metadata bytes.
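Because unavailable objects come back as None, callers usually need to filter the result. The following sketch shows one way to do that with `with_meta=False`; `read_available` is a hypothetical helper name, and `client` is assumed to be a connected PlasmaClient.

```python
def read_available(client, object_ids, timeout_ms=0):
    """Hypothetical helper: map each available object ID to its bytes.

    Object IDs whose buffer is not available within `timeout_ms`
    are left out of the returned dict.
    """
    # With with_meta=False (the default) the result is a list aligned
    # with object_ids, holding a buffer or None per entry.
    buffers = client.get_buffers(object_ids, timeout_ms=timeout_ms)
    result = {}
    for object_id, buf in zip(object_ids, buffers):
        if buf is None:
            continue  # not available before the timeout
        # Copy the immutable buffer's contents into a bytes object.
        result[object_id] = memoryview(buf).tobytes()
    return result
```

Copying with `tobytes()` trades memory for simplicity; code that needs zero-copy access would keep the returned buffers instead.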

get_metadata(self, object_ids, timeout_ms=-1)

Returns metadata buffer from the PlasmaStore based on object ID.

If the object has not been sealed yet, this call will block. The retrieved buffer is immutable.

Parameters
object_ids : list

A list of ObjectIDs used to identify some objects.

timeout_ms : int

The number of milliseconds that the get call should block before timing out and returning. Pass -1 to block indefinitely, or 0 to return immediately.

Returns
list

List of PlasmaBuffers for the metadata associated with the object_ids and None if the object was not available.

get_next_notification(self)

Get the next notification from the notification socket.

Returns
ObjectID

The object ID of the object that was stored.

int

The data size of the object that was stored.

int

The metadata size of the object that was stored.
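A typical pattern is to call `subscribe()` once and then read notifications in a loop. The sketch below is illustrative only: `collect_notifications` is a hypothetical name, `client` is assumed to be a connected PlasmaClient, and each call to `get_next_notification()` is assumed to wait until a notification arrives.

```python
def collect_notifications(client, count):
    """Hypothetical helper: subscribe, then read `count` notifications.

    Returns a list of (object_id, data_size, metadata_size) tuples in
    the order they were received.
    """
    # Subscribe before reading so the notification socket is set up.
    client.subscribe()
    events = []
    for _ in range(count):
        # Each notification carries the ID and sizes of one object.
        object_id, data_size, metadata_size = client.get_next_notification()
        events.append((object_id, data_size, metadata_size))
    return events
```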

get_notification_socket(self)

Get the notification socket.

hash(self, ObjectID object_id)

Compute the checksum of an object in the object store.

Parameters
object_id : ObjectID

A string used to identify an object.

Returns
bytes

A digest string of the object's hash. If the object isn't in the object store, the string will have length zero.
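The zero-length result for missing objects makes it easy to distinguish "absent" from "different" when comparing two objects. A minimal sketch, with the hypothetical helper name `same_digest` and `client` assumed to be a connected PlasmaClient:

```python
def same_digest(client, id_a, id_b):
    """Hypothetical helper: compare the store's digests of two objects.

    Returns None if either object is not in the store, otherwise
    True or False depending on whether the digests match.
    """
    digest_a = client.hash(id_a)
    digest_b = client.hash(id_b)
    # hash() yields a zero-length digest for objects not in the store.
    if len(digest_a) == 0 or len(digest_b) == 0:
        return None
    return digest_a == digest_b
```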

list(self)

Experimental: List the objects in the store.

Returns
dict

Dictionary from ObjectIDs to an “info” dictionary describing the object. The “info” dictionary has the following entries:

data_size

size of the object in bytes

metadata_size

size of the object metadata in bytes

ref_count

Number of clients referencing the object buffer

create_time

Unix timestamp of the creation of the object

construct_duration

Time the creation of the object took in seconds

state

“created” if the object is still being created and “sealed” if it is already sealed
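The "info" entries listed above can be aggregated to get a rough picture of store usage. The sketch below counts sealed objects and sums their sizes; `sealed_usage` is a hypothetical helper name, and `client` is assumed to be a connected PlasmaClient.

```python
def sealed_usage(client):
    """Hypothetical helper: count sealed objects and total their bytes.

    Returns a (count, total_bytes) tuple, where total_bytes adds the
    data and metadata sizes of every sealed object.
    """
    count = 0
    total_bytes = 0
    # list() maps each ObjectID to its "info" dictionary.
    for info in client.list().values():
        if info["state"] != "sealed":
            continue  # skip objects that are still being created
        count += 1
        total_bytes += info["data_size"] + info["metadata_size"]
    return count, total_bytes
```

Comparing `total_bytes` with `store_capacity()` gives an approximate utilization figure, keeping in mind that `list()` is marked experimental.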

put(self, value, ObjectID object_id=None, int memcopy_threads=6, serialization_context=None)

Store a Python value into the object store.

Parameters
value : object

A Python object to store.

object_id : ObjectID, default None

If this is provided, the specified object ID will be used to refer to the object.

memcopy_threads : int, default 6

The number of threads to use to write the serialized object into the object store for large objects.

serialization_context : pyarrow.SerializationContext, default None

Custom serialization and deserialization context.

Returns
ObjectID

The object ID associated with the Python object.
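The returned ObjectID is what a later `get()` call needs. A minimal round-trip sketch, with the hypothetical helper name `put_then_get` and `client` assumed to be a connected PlasmaClient:

```python
def put_then_get(client, value, timeout_ms=1000):
    """Hypothetical helper: store a value and read it back.

    Returns the ObjectID chosen by put() together with the value that
    get() returned for it.
    """
    # With no object_id argument, put() returns the ID it used.
    object_id = client.put(value)
    # A single ObjectID (not a list) yields a single value from get().
    fetched = client.get(object_id, timeout_ms=timeout_ms)
    return object_id, fetched
```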

put_raw_buffer(self, value, ObjectID object_id=None, string metadata=b'', int memcopy_threads=6)

Store Python buffer into the object store.

Parameters
value : Python object that implements the buffer protocol

A Python buffer object to store.

object_id : ObjectID, default None

If this is provided, the specified object ID will be used to refer to the object.

metadata : bytes

An optional string of bytes encoding whatever metadata the user wishes to encode.

memcopy_threads : int, default 6

The number of threads to use to write the serialized object into the object store for large objects.

Returns
ObjectID

The object ID associated with the Python buffer object.

seal(self, ObjectID object_id)

Seal the buffer in the PlasmaStore for a particular object ID.

Once a buffer has been sealed, the buffer is immutable and can only be accessed through get.

Parameters
object_id : ObjectID

A string used to identify an object.

set_client_options(self, client_name, int64_t limit_output_memory)

store_capacity(self)

Get the memory capacity of the store.

Returns
int

The memory capacity of the store in bytes.

store_socket_name

subscribe(self)

Subscribe to notifications about sealed objects.

to_capsule(self)