java.lang.Object
org.apache.arrow.vector.ipc.message.MessageSerializer
Utility class for serializing Messages. Messages are all serialized a similar way. 1. 4 byte
little endian message header prefix 2. FB serialized Message: This includes it the body length,
which is the serialized body and the type of the message. 3. Serialized message.
For schema messages, the serialization is simply the FB serialized Schema.
For RecordBatch messages the serialization is: 1. 4 byte little endian batch metadata header 2. FB serialized RowBatch 3. Padding to align to 8 byte boundary. 4. serialized RowBatch buffers.
-
Field Summary
-
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionstatic int
bytesToInt
(byte[] bytes) Convert an array of 4 bytes in little-endian to an native-endian i32 value.static ArrowDictionaryBatch
deserializeDictionaryBatch
(Message message, ArrowBuf bodyBuffer) Deserializes an ArrowDictionaryBatch from a dictionary batch Message and data in an ArrowBuf.static ArrowDictionaryBatch
deserializeDictionaryBatch
(MessageMetadataResult message, ArrowBuf bodyBuffer) Deserializes an ArrowDictionaryBatch from a dictionary batch Message and data in an ArrowBuf.static ArrowDictionaryBatch
deserializeDictionaryBatch
(ReadChannel in, BufferAllocator allocator) Deserializes an ArrowDictionaryBatch read from the input channel.static ArrowDictionaryBatch
deserializeDictionaryBatch
(ReadChannel in, ArrowBlock block, BufferAllocator alloc) Deserializes a DictionaryBatch knowing the size of the entire message up front.static ArrowMessage
Deserialize a message that is either an ArrowDictionaryBatch or ArrowRecordBatch.static ArrowMessage
deserializeMessageBatch
(ReadChannel in, BufferAllocator alloc) Deserialize a message that is either an ArrowDictionaryBatch or ArrowRecordBatch.static ArrowRecordBatch
deserializeRecordBatch
(Message recordBatchMessage, ArrowBuf bodyBuffer) Deserializes an ArrowRecordBatch from a record batch message and data in an ArrowBuf.static ArrowRecordBatch
deserializeRecordBatch
(RecordBatch recordBatchFB, ArrowBuf body) Deserializes an ArrowRecordBatch given the Flatbuffer metadata and in-memory body.static ArrowRecordBatch
deserializeRecordBatch
(MessageMetadataResult serializedMessage, ArrowBuf underlying) Reads a record batch based on the metadata in serializedMessage and the underlying data buffer.static ArrowRecordBatch
deserializeRecordBatch
(ReadChannel in, BufferAllocator allocator) Deserializes an ArrowRecordBatch read from the input channel.static ArrowRecordBatch
deserializeRecordBatch
(ReadChannel in, ArrowBlock block, BufferAllocator alloc) Deserializes an ArrowRecordBatch knowing the size of the entire message up front.static Schema
deserializeSchema
(Message schemaMessage) Deserializes an Arrow Schema object from a schema message.static Schema
deserializeSchema
(MessageMetadataResult message) Deserializes an Arrow Schema object from aMessageMetadataResult
.static Schema
Deserializes an Arrow Schema read from the input channel.static void
intToBytes
(int value, byte[] bytes) Convert an integer to a little endian 4 byte array.static void
longToBytes
(long value, byte[] bytes) Convert a long to a little-endian 8 byte array.static MessageMetadataResult
Read a Message from the input channel and return a MessageMetadataResult that contains the Message metadata, buffer containing the serialized Message metadata as read, and length of the Message in bytes.static ArrowBuf
readMessageBody
(ReadChannel in, long bodyLength, BufferAllocator allocator) Read a Message body from the in channel into an ArrowBuf.static ArrowBlock
serialize
(WriteChannel out, ArrowDictionaryBatch batch) static ArrowBlock
serialize
(WriteChannel out, ArrowDictionaryBatch batch, IpcOption option) Serializes a dictionary ArrowRecordBatch.static ArrowBlock
serialize
(WriteChannel out, ArrowRecordBatch batch) Serializes an ArrowRecordBatch.static ArrowBlock
serialize
(WriteChannel out, ArrowRecordBatch batch, IpcOption option) Serializes an ArrowRecordBatch.static long
serialize
(WriteChannel out, Schema schema) Serialize a schema object.static long
serialize
(WriteChannel out, Schema schema, IpcOption option) Serialize a schema object.static ByteBuffer
serializeMessage
(com.google.flatbuffers.FlatBufferBuilder builder, byte headerType, int headerOffset, long bodyLength) Deprecated.static ByteBuffer
serializeMessage
(com.google.flatbuffers.FlatBufferBuilder builder, byte headerType, int headerOffset, long bodyLength, IpcOption writeOption) Serializes a message header.static ByteBuffer
serializeMetadata
(ArrowMessage message) Deprecated.static ByteBuffer
serializeMetadata
(ArrowMessage message, IpcOption writeOption) Returns the serialized form ofRecordBatch
wrapped in aMessage
.static ByteBuffer
serializeMetadata
(Schema schema) Deprecated.static ByteBuffer
serializeMetadata
(Schema schema, IpcOption writeOption) Returns the serialized flatbuffer bytes of the schema wrapped in a message table.static long
writeBatchBuffers
(WriteChannel out, ArrowRecordBatch batch) Write the Arrow buffers of the record batch to the output channel.static int
writeMessageBuffer
(WriteChannel out, int messageLength, ByteBuffer messageBuffer) static int
writeMessageBuffer
(WriteChannel out, int messageLength, ByteBuffer messageBuffer, IpcOption option) Write the serialized Message metadata, prefixed by the length, to the output Channel.
-
Field Details
-
IPC_CONTINUATION_TOKEN
public static final int IPC_CONTINUATION_TOKEN- See Also:
-
-
Constructor Details
-
MessageSerializer
public MessageSerializer()
-
-
Method Details
-
bytesToInt
public static int bytesToInt(byte[] bytes) Convert an array of 4 bytes in little-endian to an native-endian i32 value.- Parameters:
bytes
- byte array with minimum length of 4 in little-endian- Returns:
- converted an native-endian 32-bit integer
-
intToBytes
public static void intToBytes(int value, byte[] bytes) Convert an integer to a little endian 4 byte array.- Parameters:
value
- integer value inputbytes
- existing byte array with minimum length of 4 to contain the conversion output
-
longToBytes
public static void longToBytes(long value, byte[] bytes) Convert a long to a little-endian 8 byte array.- Parameters:
value
- long value inputbytes
- existing byte array with minimum length of 8 to contain the conversion output
-
writeMessageBuffer
public static int writeMessageBuffer(WriteChannel out, int messageLength, ByteBuffer messageBuffer) throws IOException - Throws:
IOException
-
writeMessageBuffer
public static int writeMessageBuffer(WriteChannel out, int messageLength, ByteBuffer messageBuffer, IpcOption option) throws IOException Write the serialized Message metadata, prefixed by the length, to the output Channel. This ensures that it aligns to an 8 byte boundary and will adjust the message length to include any padding used for alignment.- Parameters:
out
- Output ChannelmessageLength
- Number of bytes in the message buffer, written as little Endian prefixmessageBuffer
- Message metadata buffer to be written, this does not include any message body data which should be subsequently written to the Channeloption
- IPC write options- Returns:
- Number of bytes written
- Throws:
IOException
- on error
-
serialize
Serialize a schema object.- Throws:
IOException
-
serialize
Serialize a schema object.- Parameters:
out
- where to write the schemaschema
- the object to serialize to out- Returns:
- the number of bytes written
- Throws:
IOException
- if something went wrong
-
serializeMetadata
Deprecated.Returns the serialized flatbuffer bytes of the schema wrapped in a message table. -
serializeMetadata
Returns the serialized flatbuffer bytes of the schema wrapped in a message table. -
deserializeSchema
Deserializes an Arrow Schema object from a schema message. Format is from serialize().- Parameters:
schemaMessage
- a Message of type MessageHeader.Schema- Returns:
- the deserialized Arrow Schema
-
deserializeSchema
Deserializes an Arrow Schema read from the input channel. Format is from serialize().- Parameters:
in
- the channel to deserialize from- Returns:
- the deserialized Arrow Schema
- Throws:
IOException
- if something went wrong
-
deserializeSchema
Deserializes an Arrow Schema object from aMessageMetadataResult
. Format is from serialize().- Parameters:
message
- a Message of type MessageHeader.Schema- Returns:
- the deserialized Arrow Schema
-
serialize
Serializes an ArrowRecordBatch. Returns the offset and length of the written batch.- Throws:
IOException
-
serialize
public static ArrowBlock serialize(WriteChannel out, ArrowRecordBatch batch, IpcOption option) throws IOException Serializes an ArrowRecordBatch. Returns the offset and length of the written batch.- Parameters:
out
- where to write the batchbatch
- the object to serialize to out- Returns:
- the serialized block metadata
- Throws:
IOException
- if something went wrong
-
writeBatchBuffers
Write the Arrow buffers of the record batch to the output channel.- Parameters:
out
- the output channel to write the buffers tobatch
- an ArrowRecordBatch containing buffers to be written- Returns:
- the number of bytes written
- Throws:
IOException
- on error
-
serializeMetadata
Deprecated.Returns the serialized form ofRecordBatch
wrapped in aMessage
. -
serializeMetadata
Returns the serialized form ofRecordBatch
wrapped in aMessage
. -
deserializeRecordBatch
public static ArrowRecordBatch deserializeRecordBatch(Message recordBatchMessage, ArrowBuf bodyBuffer) throws IOException Deserializes an ArrowRecordBatch from a record batch message and data in an ArrowBuf.- Parameters:
recordBatchMessage
- a Message of type MessageHeader.RecordBatchbodyBuffer
- Arrow buffer containing the RecordBatch data- Returns:
- the deserialized ArrowRecordBatch
- Throws:
IOException
- if something went wrong
-
deserializeRecordBatch
public static ArrowRecordBatch deserializeRecordBatch(ReadChannel in, BufferAllocator allocator) throws IOException Deserializes an ArrowRecordBatch read from the input channel. This uses the given allocator to create an ArrowBuf for the batch body data.- Parameters:
in
- Channel to read a RecordBatch message and data fromallocator
- BufferAllocator to allocate an Arrow buffer to read message body data- Returns:
- the deserialized ArrowRecordBatch
- Throws:
IOException
- on error
-
deserializeRecordBatch
public static ArrowRecordBatch deserializeRecordBatch(ReadChannel in, ArrowBlock block, BufferAllocator alloc) throws IOException Deserializes an ArrowRecordBatch knowing the size of the entire message up front. This minimizes the number of reads to the underlying stream.- Parameters:
in
- the channel to deserialize fromblock
- the object to deserialize toalloc
- to allocate buffers- Returns:
- the deserialized ArrowRecordBatch
- Throws:
IOException
- if something went wrong
-
deserializeRecordBatch
public static ArrowRecordBatch deserializeRecordBatch(RecordBatch recordBatchFB, ArrowBuf body) throws IOException Deserializes an ArrowRecordBatch given the Flatbuffer metadata and in-memory body.- Parameters:
recordBatchFB
- Deserialized FlatBuffer record batchbody
- Read body of the record batch- Returns:
- ArrowRecordBatch from metadata and in-memory body
- Throws:
IOException
- on error
-
deserializeRecordBatch
public static ArrowRecordBatch deserializeRecordBatch(MessageMetadataResult serializedMessage, ArrowBuf underlying) throws IOException Reads a record batch based on the metadata in serializedMessage and the underlying data buffer.- Throws:
IOException
-
serialize
- Throws:
IOException
-
serialize
public static ArrowBlock serialize(WriteChannel out, ArrowDictionaryBatch batch, IpcOption option) throws IOException Serializes a dictionary ArrowRecordBatch. Returns the offset and length of the written batch.- Parameters:
out
- where to serializebatch
- the batch to serializeoption
- options for IPC- Returns:
- the metadata of the serialized block
- Throws:
IOException
- if something went wrong
-
deserializeDictionaryBatch
public static ArrowDictionaryBatch deserializeDictionaryBatch(Message message, ArrowBuf bodyBuffer) throws IOException Deserializes an ArrowDictionaryBatch from a dictionary batch Message and data in an ArrowBuf.- Parameters:
message
- a message of type MessageHeader.DictionaryBatchbodyBuffer
- Arrow buffer containing the DictionaryBatch data of type MessageHeader.DictionaryBatch- Returns:
- the deserialized ArrowDictionaryBatch
- Throws:
IOException
- if something went wrong
-
deserializeDictionaryBatch
public static ArrowDictionaryBatch deserializeDictionaryBatch(MessageMetadataResult message, ArrowBuf bodyBuffer) throws IOException Deserializes an ArrowDictionaryBatch from a dictionary batch Message and data in an ArrowBuf.- Parameters:
message
- a message of type MessageHeader.DictionaryBatchbodyBuffer
- Arrow buffer containing the DictionaryBatch data of type MessageHeader.DictionaryBatch- Returns:
- the deserialized ArrowDictionaryBatch
- Throws:
IOException
- if something went wrong
-
deserializeDictionaryBatch
public static ArrowDictionaryBatch deserializeDictionaryBatch(ReadChannel in, BufferAllocator allocator) throws IOException Deserializes an ArrowDictionaryBatch read from the input channel. This uses the given allocator to create an ArrowBuf for the batch body data.- Parameters:
in
- Channel to read a DictionaryBatch message and data fromallocator
- BufferAllocator to allocate an Arrow buffer to read message body data- Returns:
- the deserialized ArrowDictionaryBatch
- Throws:
IOException
- on error
-
deserializeDictionaryBatch
public static ArrowDictionaryBatch deserializeDictionaryBatch(ReadChannel in, ArrowBlock block, BufferAllocator alloc) throws IOException Deserializes a DictionaryBatch knowing the size of the entire message up front. This minimizes the number of reads to the underlying stream.- Parameters:
in
- where to read fromblock
- block metadata for deserializingalloc
- to allocate new buffers- Returns:
- the deserialized ArrowDictionaryBatch
- Throws:
IOException
- if something went wrong
-
deserializeMessageBatch
Deserialize a message that is either an ArrowDictionaryBatch or ArrowRecordBatch.- Parameters:
reader
- MessageChannelReader to read a sequence of messages from a ReadChannel- Returns:
- The deserialized record batch
- Throws:
IOException
- if the message is not an ArrowDictionaryBatch or ArrowRecordBatch
-
deserializeMessageBatch
public static ArrowMessage deserializeMessageBatch(ReadChannel in, BufferAllocator alloc) throws IOException Deserialize a message that is either an ArrowDictionaryBatch or ArrowRecordBatch.- Parameters:
in
- ReadChannel to read messages fromalloc
- Allocator for message data- Returns:
- The deserialized record batch
- Throws:
IOException
- if the message is not an ArrowDictionaryBatch or ArrowRecordBatch
-
serializeMessage
@Deprecated public static ByteBuffer serializeMessage(com.google.flatbuffers.FlatBufferBuilder builder, byte headerType, int headerOffset, long bodyLength) Deprecated. -
serializeMessage
public static ByteBuffer serializeMessage(com.google.flatbuffers.FlatBufferBuilder builder, byte headerType, int headerOffset, long bodyLength, IpcOption writeOption) Serializes a message header.- Parameters:
builder
- to write the flatbuf toheaderType
- headerType fieldheaderOffset
- header offset fieldbodyLength
- body length fieldwriteOption
- IPC write options- Returns:
- the corresponding ByteBuffer
-
readMessage
Read a Message from the input channel and return a MessageMetadataResult that contains the Message metadata, buffer containing the serialized Message metadata as read, and length of the Message in bytes. Returns null if the end-of-stream has been reached.- Parameters:
in
- ReadChannel to read messages from- Returns:
- MessageMetadataResult with deserialized Message metadata and message information if a valid Message was read, or null if end-of-stream
- Throws:
IOException
- on error
-
readMessageBody
public static ArrowBuf readMessageBody(ReadChannel in, long bodyLength, BufferAllocator allocator) throws IOException Read a Message body from the in channel into an ArrowBuf.- Parameters:
in
- ReadChannel to read message body frombodyLength
- Length in bytes of the message body to readallocator
- Allocate the ArrowBuf to contain message body data- Returns:
- an ArrowBuf containing the message body data
- Throws:
IOException
- on error
-