pyarrow.plasma.PlasmaClient¶
-
class pyarrow.plasma.PlasmaClient¶
Bases: pyarrow.lib._Weakrefable
The PlasmaClient is used to interface with a plasma store and manager.
The PlasmaClient can ask the PlasmaStore to allocate a new buffer, seal a buffer, and get a buffer. Buffers are referred to by object IDs, which are strings.
-
__init__
()¶ Initialize self. See help(type(self)) for accurate signature.
Methods
__init__
()Initialize self.
contains
(self, ObjectID object_id)Check if the object is present and sealed in the PlasmaStore.
create
(self, ObjectID object_id, …)Create a new buffer in the PlasmaStore for a particular object ID.
create_and_seal
(self, ObjectID object_id, …)Store a new object in the PlasmaStore for a particular object ID.
debug_string
(self)
decode_notifications
(self, const uint8_t *buf)Get the notification from the buffer.
delete
(self, object_ids)Delete the objects with the given IDs from the object store.
disconnect
(self)Disconnect this client from the Plasma store.
evict
(self, int64_t num_bytes)Evict objects to recover some bytes.
get
(self, object_ids, int timeout_ms=-1[, …])Get one or more Python values from the object store.
get_buffers
(self, object_ids[, timeout_ms, …])Returns data buffer from the PlasmaStore based on object ID.
get_metadata
(self, object_ids[, timeout_ms])Returns metadata buffer from the PlasmaStore based on object ID.
get_next_notification
(self)Get the next notification from the notification socket.
get_notification_socket
(self)Get the notification socket.
hash
(self, ObjectID object_id)Compute the checksum of an object in the object store.
list
(self)Experimental: List the objects in the store.
put
(self, value, ObjectID object_id=None, …)Store a Python value into the object store.
put_raw_buffer
(self, value, …)Store Python buffer into the object store.
seal
(self, ObjectID object_id)Seal the buffer in the PlasmaStore for a particular object ID.
set_client_options
(self, client_name, …)
store_capacity
(self)Get the memory capacity of the store.
subscribe
(self)Subscribe to notifications about sealed objects.
to_capsule
(self)
Attributes
-
contains
(self, ObjectID object_id)¶ Check if the object is present and sealed in the PlasmaStore.
- Parameters
object_id (ObjectID) – A string used to identify an object.
-
create
(self, ObjectID object_id, int64_t data_size, string metadata=b'')¶ Create a new buffer in the PlasmaStore for a particular object ID.
The returned buffer is mutable until seal is called.
- Parameters
object_id (ObjectID) – The object ID used to identify an object.
data_size (int) – The size in bytes of the created buffer.
metadata (bytes) – An optional string of bytes encoding whatever metadata the user wishes to encode.
- Raises
PlasmaObjectExists – This exception is raised if the object could not be created because there already is an object with the same ID in the plasma store.
PlasmaStoreFull – This exception is raised if the object could not be created because the plasma store is unable to evict enough objects to create room for it.
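The create-then-seal workflow described above can be sketched as follows. This is a minimal illustration rather than the library's own example: `write_bytes` is a hypothetical helper, and `client` is assumed to be an already connected PlasmaClient. The buffer is cast to unsigned bytes before the copy so that a `bytes` payload can be assigned in one step.

```python
def write_bytes(client, object_id, payload):
    """Create a buffer sized for `payload`, fill it, then seal it.

    `write_bytes` is a hypothetical helper; `client` is assumed to be a
    connected PlasmaClient and `payload` a bytes object.
    """
    # create() returns a buffer that stays mutable until seal() is called.
    # Casting to "B" gives an unsigned-byte view that accepts a bytes payload.
    view = memoryview(client.create(object_id, len(payload))).cast("B")
    view[:] = payload
    # Once sealed, the buffer is immutable and can be read through get.
    client.seal(object_id)


# Hypothetical usage, assuming a store is listening on /tmp/plasma:
#   import pyarrow.plasma as plasma
#   client = plasma.connect("/tmp/plasma")
#   write_bytes(client, plasma.ObjectID(20 * b"a"), b"hello")
```

If an object with the same ID already exists, or the store cannot make room, `create` raises the exceptions listed above before anything is written.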
-
create_and_seal
(self, ObjectID object_id, string data, string metadata=b'')¶ Store a new object in the PlasmaStore for a particular object ID.
- Parameters
object_id (ObjectID) – The object ID used to identify an object.
data (bytes) – The object to store.
metadata (bytes) – An optional string of bytes encoding whatever metadata the user wishes to encode.
- Raises
PlasmaObjectExists – This exception is raised if the object could not be created because there already is an object with the same ID in the plasma store.
PlasmaStoreFull – This exception is raised if the object could not be created because the plasma store is unable to evict enough objects to create room for it.
-
debug_string
(self)¶
-
decode_notifications
(self, const uint8_t *buf)¶ Get the notification from the buffer.
- Returns
[ObjectID] – The list of object IDs in the notification message.
c_vector[int64_t] – The data sizes of the objects in the notification message.
c_vector[int64_t] – The metadata sizes of the objects in the notification message.
-
delete
(self, object_ids)¶ Delete the objects with the given IDs from the object store.
- Parameters
object_ids (list) – A list of strings used to identify the objects.
-
disconnect
(self)¶ Disconnect this client from the Plasma store.
-
evict
(self, int64_t num_bytes)¶ Evict objects to recover some bytes.
Recover at least num_bytes bytes if possible.
- Parameters
num_bytes (int) – The number of bytes to attempt to recover.
-
get
(self, object_ids, int timeout_ms=-1, serialization_context=None)¶ Get one or more Python values from the object store.
- Parameters
object_ids (list or ObjectID) – Object ID or list of object IDs associated to the values we get from the store.
timeout_ms (int, default -1) – The number of milliseconds that the get call should block before timing out and returning. Pass -1 to block until the objects are available and 0 to return immediately.
serialization_context (pyarrow.SerializationContext, default None) – Custom serialization and deserialization context.
- Returns
list or object – Python value or list of Python values for the data associated with the object_ids, with ObjectNotAvailable in place of any object that was not available.
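A non-blocking lookup built on get might look like the sketch below. `get_or_default` is a hypothetical helper, and the sentinel is passed in as `not_available` so the function does not depend on an import; in pyarrow it is assumed to be `pyarrow.plasma.ObjectNotAvailable`.

```python
def get_or_default(client, object_id, not_available, default=None, timeout_ms=0):
    """Return the stored value, or `default` if it is not available in time.

    Hypothetical helper: `not_available` is the sentinel that get() returns
    for missing objects (assumed to be pyarrow.plasma.ObjectNotAvailable).
    """
    # timeout_ms=0 returns immediately; -1 would block until the object exists.
    value = client.get(object_id, timeout_ms=timeout_ms)
    if value is not_available:
        return default
    return value
```

Comparing by identity keeps a legitimately stored `None` or empty value distinct from a missing object.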
-
get_buffers
(self, object_ids, timeout_ms=-1, with_meta=False)¶ Returns data buffer from the PlasmaStore based on object ID.
If the object has not been sealed yet, this call will block. The retrieved buffer is immutable.
- Parameters
object_ids (list) – A list of ObjectIDs used to identify some objects.
timeout_ms (int) – The number of milliseconds that the get call should block before timing out and returning. Pass -1 to block until the objects are available and 0 to return immediately.
with_meta (bool) – If True, also return the metadata of each object alongside its data buffer.
- Returns
list – If with_meta=False, this is a list of PlasmaBuffers for the data associated with the object_ids and None if the object was not available. If with_meta=True, this is a list of tuples of PlasmaBuffer and metadata bytes.
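As a rough illustration of the default `with_meta=False` case, the sketch below copies each available buffer into a `bytes` object. `read_available` is a hypothetical helper, and it assumes the returned list is in the same order as `object_ids`.

```python
def read_available(client, object_ids, timeout_ms=0):
    """Map each object ID to a bytes copy of its data, or None if unavailable.

    Hypothetical helper; assumes get_buffers() returns results in the same
    order as `object_ids`.
    """
    buffers = client.get_buffers(object_ids, timeout_ms=timeout_ms)
    result = {}
    for object_id, buf in zip(object_ids, buffers):
        # Unavailable objects are reported as None rather than raising.
        result[object_id] = None if buf is None else bytes(memoryview(buf))
    return result
```

Copying to `bytes` detaches the result from the store; keep the original buffer instead when a zero-copy view is what you need.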
-
get_metadata
(self, object_ids, timeout_ms=-1)¶ Returns metadata buffer from the PlasmaStore based on object ID.
If the object has not been sealed yet, this call will block. The retrieved buffer is immutable.
- Parameters
object_ids (list) – A list of ObjectIDs used to identify some objects.
timeout_ms (int) – The number of milliseconds that the get call should block before timing out and returning. Pass -1 to block until the objects are available and 0 to return immediately.
- Returns
list – List of PlasmaBuffers for the metadata associated with the object_ids and None if the object was not available.
-
get_next_notification
(self)¶ Get the next notification from the notification socket.
- Returns
ObjectID – The object ID of the object that was stored.
int – The data size of the object that was stored.
int – The metadata size of the object that was stored.
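One way to consume these notifications is sketched below. `collect_notifications` is a hypothetical helper; it assumes that subscribe must be called on the client before notifications can be read, and that each call to get_next_notification blocks until one arrives.

```python
def collect_notifications(client, count):
    """Subscribe, then gather `count` (object_id, data_size, metadata_size) tuples.

    Hypothetical helper; assumes `client` is a connected PlasmaClient that
    has not subscribed yet.
    """
    client.subscribe()
    events = []
    for _ in range(count):
        # Each notification describes one object that was stored.
        object_id, data_size, metadata_size = client.get_next_notification()
        events.append((object_id, data_size, metadata_size))
    return events
```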
-
get_notification_socket
(self)¶ Get the notification socket.
-
hash
(self, ObjectID object_id)¶ Compute the checksum of an object in the object store.
- Parameters
object_id (ObjectID) – A string used to identify an object.
- Returns
bytes – A digest of the object’s hash. If the object isn’t in the object store, the digest will have length zero.
-
list
(self)¶ Experimental: List the objects in the store.
- Returns
dict – Dictionary from ObjectIDs to an “info” dictionary describing the object. The “info” dictionary has the following entries:
- data_size
size of the object in bytes
- metadata_size
size of the object metadata in bytes
- ref_count
Number of clients referencing the object buffer
- create_time
Unix timestamp of the creation of the object
- construct_duration
Time the creation of the object took in seconds
- state
“created” if the object is still being created and “sealed” if it is already sealed
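The “info” dictionaries can be aggregated directly. The sketch below totals the space used by sealed objects; `sealed_bytes` is a hypothetical helper that relies only on the entries listed above.

```python
def sealed_bytes(client):
    """Sum data and metadata sizes over all sealed objects in the store.

    Hypothetical helper built on the experimental list() call.
    """
    total = 0
    for info in client.list().values():
        # Objects still being created are skipped.
        if info["state"] == "sealed":
            total += info["data_size"] + info["metadata_size"]
    return total
```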
-
put
(self, value, ObjectID object_id=None, int memcopy_threads=6, serialization_context=None)¶ Store a Python value into the object store.
- Parameters
value (object) – A Python object to store.
object_id (ObjectID, default None) – If this is provided, the specified object ID will be used to refer to the object.
memcopy_threads (int, default 6) – The number of threads to use to write the serialized object into the object store for large objects.
serialization_context (pyarrow.SerializationContext, default None) – Custom serialization and deserialization context.
- Returns
The object ID associated to the Python object.
-
put_raw_buffer
(self, value, ObjectID object_id=None, string metadata=b'', int memcopy_threads=6)¶ Store Python buffer into the object store.
- Parameters
value (Python object that implements the buffer protocol) – A Python buffer object to store.
object_id (ObjectID, default None) – If this is provided, the specified object ID will be used to refer to the object.
metadata (bytes) – An optional string of bytes encoding whatever metadata the user wishes to encode.
memcopy_threads (int, default 6) – The number of threads to use to write the serialized object into the object store for large objects.
- Returns
The object ID associated to the Python buffer object.
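Storing a raw buffer with metadata and reading the metadata back could be sketched like this. `store_with_metadata` is a hypothetical helper; it assumes that omitting `object_id` lets the client choose an ID, which is then returned.

```python
def store_with_metadata(client, payload, metadata):
    """Store `payload` with `metadata`, then fetch the stored metadata.

    Hypothetical helper; returns (object_id, metadata_bytes_or_None).
    """
    # No object_id is given, so the returned ID is the only handle to the object.
    object_id = client.put_raw_buffer(payload, metadata=metadata)
    # timeout_ms=0 returns immediately instead of blocking.
    [stored] = client.get_metadata([object_id], timeout_ms=0)
    stored_bytes = None if stored is None else bytes(memoryview(stored))
    return object_id, stored_bytes
```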
-
seal
(self, ObjectID object_id)¶ Seal the buffer in the PlasmaStore for a particular object ID.
Once a buffer has been sealed, the buffer is immutable and can only be accessed through get.
- Parameters
object_id (ObjectID) – A string used to identify an object.
-
set_client_options
(self, client_name, int64_t limit_output_memory)¶
-
store_capacity
(self)¶ Get the memory capacity of the store.
- Returns
int – The memory capacity of the store in bytes.
-
store_socket_name
¶
-
subscribe
(self)¶ Subscribe to notifications about sealed objects.
-
to_capsule
(self)¶
-