Class MessageSerializer

java.lang.Object
org.apache.arrow.vector.ipc.message.MessageSerializer

public class MessageSerializer extends Object
Utility class for serializing Messages. All Messages are serialized in a similar way:
1. A 4-byte little-endian message header prefix.
2. The FlatBuffers (FB) serialized Message, which includes the body length (the length of the serialized body) and the type of the message.
3. The serialized message body.

For schema messages, the serialization is simply the FB serialized Schema.

For RecordBatch messages, the serialization is:
1. A 4-byte little-endian batch metadata header.
2. The FB serialized RecordBatch.
3. Padding to align to an 8-byte boundary.
4. The serialized RecordBatch buffers.
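The layout above can be explored with plain java.nio. The following is a minimal sketch, not the library's implementation: the class FrameLayoutSketch and its methods are hypothetical names, the frame is assumed to start with the 4-byte little-endian prefix (no other tokens or options), and the prefix value is assumed to already count any alignment padding, as the writeMessageBuffer description states.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of Arrow.
final class FrameLayoutSketch {
  // Size of the little-endian length prefix described above.
  static final int PREFIX_SIZE = 4;

  /** Reads the 4-byte little-endian length stored at the start of the frame. */
  static int metadataLength(ByteBuffer framed) {
    return framed.duplicate().order(ByteOrder.LITTLE_ENDIAN).getInt(0);
  }

  /** Returns a view of the metadata bytes, including any alignment padding. */
  static ByteBuffer metadata(ByteBuffer framed) {
    ByteBuffer view = framed.duplicate();
    view.limit(PREFIX_SIZE + metadataLength(framed));
    view.position(PREFIX_SIZE);
    return view.slice();
  }

  /** Returns a view of everything after the metadata, i.e. the message body. */
  static ByteBuffer body(ByteBuffer framed) {
    ByteBuffer view = framed.duplicate();
    view.position(PREFIX_SIZE + metadataLength(framed));
    return view.slice();
  }

  public static void main(String[] args) {
    // A 20-byte frame: 4-byte prefix, 8 bytes of metadata, 8 bytes of body.
    ByteBuffer framed = ByteBuffer.allocate(20).order(ByteOrder.LITTLE_ENDIAN);
    framed.putInt(0, 8);
    System.out.println(metadata(framed).remaining()); // 8
    System.out.println(body(framed).remaining());     // 8
  }
}
```

The views share memory with the framed buffer, so nothing is copied when the metadata and body are separated.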

  • Field Details

    • IPC_CONTINUATION_TOKEN

      public static final int IPC_CONTINUATION_TOKEN
  • Constructor Details

    • MessageSerializer

      public MessageSerializer()
  • Method Details

    • bytesToInt

      public static int bytesToInt(byte[] bytes)
      Converts an array of 4 bytes in little-endian order to a native-endian i32 value.
      Parameters:
      bytes - byte array with minimum length of 4 in little-endian
      Returns:
      the converted native-endian 32-bit integer
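The conversion can be pictured with a small stand-alone sketch. This is not the library's code: BytesToIntSketch is a hypothetical class that only mirrors the documented contract (the first four bytes, least significant byte first).

```java
// Hypothetical helper for illustration only; not part of Arrow.
final class BytesToIntSketch {
  /** Assembles a 32-bit int from the first four bytes, least significant byte first. */
  static int bytesToInt(byte[] bytes) {
    // Mask each byte with 0xFF so negative byte values are not sign-extended.
    return (bytes[0] & 0xFF)
        | (bytes[1] & 0xFF) << 8
        | (bytes[2] & 0xFF) << 16
        | (bytes[3] & 0xFF) << 24;
  }

  public static void main(String[] args) {
    byte[] littleEndian = {0x78, 0x56, 0x34, 0x12};
    System.out.println(Integer.toHexString(bytesToInt(littleEndian))); // 12345678
  }
}
```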
    • intToBytes

      public static void intToBytes(int value, byte[] bytes)
      Convert an integer to a little endian 4 byte array.
      Parameters:
      value - integer value input
      bytes - existing byte array with minimum length of 4 to contain the conversion output
    • longToBytes

      public static void longToBytes(long value, byte[] bytes)
      Convert a long to a little-endian 8 byte array.
      Parameters:
      value - long value input
      bytes - existing byte array with minimum length of 8 to contain the conversion output
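Both intToBytes and longToBytes fill a caller-supplied array rather than allocating a new one. A minimal sketch of that contract follows, assuming nothing beyond what is documented here; ToBytesSketch is a hypothetical class, not the library's implementation.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Arrow.
final class ToBytesSketch {
  /** Writes value into bytes[0..3], least significant byte first. */
  static void intToBytes(int value, byte[] bytes) {
    for (int i = 0; i < 4; i++) {
      bytes[i] = (byte) (value >>> (8 * i));
    }
  }

  /** Writes value into bytes[0..7], least significant byte first. */
  static void longToBytes(long value, byte[] bytes) {
    for (int i = 0; i < 8; i++) {
      bytes[i] = (byte) (value >>> (8 * i));
    }
  }

  public static void main(String[] args) {
    byte[] out = new byte[4];
    intToBytes(1, out);
    System.out.println(Arrays.toString(out)); // [1, 0, 0, 0]
  }
}
```

Reusing one output array across calls avoids an allocation per conversion, which is presumably why the methods take the array as a parameter.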
    • writeMessageBuffer

      public static int writeMessageBuffer(WriteChannel out, int messageLength, ByteBuffer messageBuffer) throws IOException
      Throws:
      IOException
    • writeMessageBuffer

      public static int writeMessageBuffer(WriteChannel out, int messageLength, ByteBuffer messageBuffer, IpcOption option) throws IOException
      Write the serialized Message metadata, prefixed by the length, to the output Channel. This ensures that it aligns to an 8 byte boundary and will adjust the message length to include any padding used for alignment.
      Parameters:
      out - Output Channel
      messageLength - Number of bytes in the message buffer, written as a little-endian prefix
      messageBuffer - Message metadata buffer to be written; this does not include any message body data, which should be written to the Channel afterwards
      option - IPC write options
      Returns:
      Number of bytes written
      Throws:
      IOException - on error
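The alignment adjustment can be sketched without Arrow types. The example below is an illustration under stated assumptions, not the library's implementation: AlignedWriteSketch is a hypothetical class, a ByteArrayOutputStream stands in for the WriteChannel, and the prefix is assumed to be exactly 4 bytes (the IpcOption may change what is actually written before the length).

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of Arrow.
final class AlignedWriteSketch {
  // Assumed prefix size; the real value may depend on the IPC options.
  static final int PREFIX_SIZE = 4;

  /** Writes [length prefix][message][zero padding] and returns the bytes written. */
  static int writeMessageBuffer(ByteArrayOutputStream out, byte[] message) {
    // Pad so that prefix + message ends on an 8-byte boundary.
    int remainder = (PREFIX_SIZE + message.length) % 8;
    int padding = remainder == 0 ? 0 : 8 - remainder;
    // The recorded length is adjusted to include the padding.
    int paddedLength = message.length + padding;

    byte[] prefix = ByteBuffer.allocate(PREFIX_SIZE)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putInt(paddedLength)
        .array();

    out.write(prefix, 0, prefix.length);
    out.write(message, 0, message.length);
    out.write(new byte[padding], 0, padding);
    return PREFIX_SIZE + paddedLength;
  }

  public static void main(String[] args) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // 4 + 5 = 9 bytes, so 7 bytes of padding bring the total to 16.
    System.out.println(writeMessageBuffer(out, new byte[] {1, 2, 3, 4, 5})); // 16
  }
}
```

Because the padded length is what gets written in the prefix, a reader can skip straight to the body without recomputing the alignment.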
    • serialize

      public static long serialize(WriteChannel out, Schema schema) throws IOException
      Serialize a schema object.
      Throws:
      IOException
    • serialize

      public static long serialize(WriteChannel out, Schema schema, IpcOption option) throws IOException
      Serialize a schema object.
      Parameters:
      out - where to write the schema
      schema - the object to serialize to out
      option - IPC write options
      Returns:
      the number of bytes written
      Throws:
      IOException - if something went wrong
    • serializeMetadata

      @Deprecated public static ByteBuffer serializeMetadata(Schema schema)
      Deprecated.
      Returns the serialized flatbuffer bytes of the schema wrapped in a message table.
    • serializeMetadata

      public static ByteBuffer serializeMetadata(Schema schema, IpcOption writeOption)
      Returns the serialized flatbuffer bytes of the schema wrapped in a message table.
    • deserializeSchema

      public static Schema deserializeSchema(Message schemaMessage)
      Deserializes an Arrow Schema object from a schema message. The message must be in the format produced by serialize().
      Parameters:
      schemaMessage - a Message of type MessageHeader.Schema
      Returns:
      the deserialized Arrow Schema
    • deserializeSchema

      public static Schema deserializeSchema(ReadChannel in) throws IOException
      Deserializes an Arrow Schema read from the input channel. The data must be in the format produced by serialize().
      Parameters:
      in - the channel to deserialize from
      Returns:
      the deserialized Arrow Schema
      Throws:
      IOException - if something went wrong
    • deserializeSchema

      public static Schema deserializeSchema(MessageMetadataResult message)
      Deserializes an Arrow Schema object from a MessageMetadataResult. The message must be in the format produced by serialize().
      Parameters:
      message - a MessageMetadataResult containing a Message of type MessageHeader.Schema
      Returns:
      the deserialized Arrow Schema
    • serialize

      public static ArrowBlock serialize(WriteChannel out, ArrowRecordBatch batch) throws IOException
      Serializes an ArrowRecordBatch. Returns the offset and length of the written batch.
      Throws:
      IOException
    • serialize

      public static ArrowBlock serialize(WriteChannel out, ArrowRecordBatch batch, IpcOption option) throws IOException
      Serializes an ArrowRecordBatch. Returns the offset and length of the written batch.
      Parameters:
      out - where to write the batch
      batch - the object to serialize to out
      option - IPC write options
      Returns:
      the serialized block metadata
      Throws:
      IOException - if something went wrong
    • writeBatchBuffers

      public static long writeBatchBuffers(WriteChannel out, ArrowRecordBatch batch) throws IOException
      Write the Arrow buffers of the record batch to the output channel.
      Parameters:
      out - the output channel to write the buffers to
      batch - an ArrowRecordBatch containing buffers to be written
      Returns:
      the number of bytes written
      Throws:
      IOException - on error
    • serializeMetadata

      @Deprecated public static ByteBuffer serializeMetadata(ArrowMessage message)
      Deprecated.
      Returns the serialized form of RecordBatch wrapped in a Message.
    • serializeMetadata

      public static ByteBuffer serializeMetadata(ArrowMessage message, IpcOption writeOption)
      Returns the serialized form of RecordBatch wrapped in a Message.
    • deserializeRecordBatch

      public static ArrowRecordBatch deserializeRecordBatch(Message recordBatchMessage, ArrowBuf bodyBuffer) throws IOException
      Deserializes an ArrowRecordBatch from a record batch message and data in an ArrowBuf.
      Parameters:
      recordBatchMessage - a Message of type MessageHeader.RecordBatch
      bodyBuffer - Arrow buffer containing the RecordBatch data
      Returns:
      the deserialized ArrowRecordBatch
      Throws:
      IOException - if something went wrong
    • deserializeRecordBatch

      public static ArrowRecordBatch deserializeRecordBatch(ReadChannel in, BufferAllocator allocator) throws IOException
      Deserializes an ArrowRecordBatch read from the input channel. This uses the given allocator to create an ArrowBuf for the batch body data.
      Parameters:
      in - Channel to read a RecordBatch message and data from
      allocator - BufferAllocator to allocate an Arrow buffer to read message body data
      Returns:
      the deserialized ArrowRecordBatch
      Throws:
      IOException - on error
    • deserializeRecordBatch

      public static ArrowRecordBatch deserializeRecordBatch(ReadChannel in, ArrowBlock block, BufferAllocator alloc) throws IOException
      Deserializes an ArrowRecordBatch knowing the size of the entire message up front. This minimizes the number of reads to the underlying stream.
      Parameters:
      in - the channel to deserialize from
      block - block metadata describing the batch to deserialize
      alloc - to allocate buffers
      Returns:
      the deserialized ArrowRecordBatch
      Throws:
      IOException - if something went wrong
    • deserializeRecordBatch

      public static ArrowRecordBatch deserializeRecordBatch(RecordBatch recordBatchFB, ArrowBuf body) throws IOException
      Deserializes an ArrowRecordBatch given the Flatbuffer metadata and in-memory body.
      Parameters:
      recordBatchFB - Deserialized FlatBuffer record batch
      body - Read body of the record batch
      Returns:
      ArrowRecordBatch from metadata and in-memory body
      Throws:
      IOException - on error
    • deserializeRecordBatch

      public static ArrowRecordBatch deserializeRecordBatch(MessageMetadataResult serializedMessage, ArrowBuf underlying) throws IOException
      Reads a record batch based on the metadata in serializedMessage and the underlying data buffer.
      Throws:
      IOException
    • serialize

      public static ArrowBlock serialize(WriteChannel out, ArrowDictionaryBatch batch) throws IOException
      Throws:
      IOException
    • serialize

      public static ArrowBlock serialize(WriteChannel out, ArrowDictionaryBatch batch, IpcOption option) throws IOException
      Serializes an ArrowDictionaryBatch. Returns the offset and length of the written batch.
      Parameters:
      out - where to serialize
      batch - the batch to serialize
      option - options for IPC
      Returns:
      the metadata of the serialized block
      Throws:
      IOException - if something went wrong
    • deserializeDictionaryBatch

      public static ArrowDictionaryBatch deserializeDictionaryBatch(Message message, ArrowBuf bodyBuffer) throws IOException
      Deserializes an ArrowDictionaryBatch from a dictionary batch Message and data in an ArrowBuf.
      Parameters:
      message - a message of type MessageHeader.DictionaryBatch
      bodyBuffer - Arrow buffer containing the DictionaryBatch data
      Returns:
      the deserialized ArrowDictionaryBatch
      Throws:
      IOException - if something went wrong
    • deserializeDictionaryBatch

      public static ArrowDictionaryBatch deserializeDictionaryBatch(MessageMetadataResult message, ArrowBuf bodyBuffer) throws IOException
      Deserializes an ArrowDictionaryBatch from a MessageMetadataResult for a dictionary batch Message and data in an ArrowBuf.
      Parameters:
      message - a MessageMetadataResult containing a Message of type MessageHeader.DictionaryBatch
      bodyBuffer - Arrow buffer containing the DictionaryBatch data
      Returns:
      the deserialized ArrowDictionaryBatch
      Throws:
      IOException - if something went wrong
    • deserializeDictionaryBatch

      public static ArrowDictionaryBatch deserializeDictionaryBatch(ReadChannel in, BufferAllocator allocator) throws IOException
      Deserializes an ArrowDictionaryBatch read from the input channel. This uses the given allocator to create an ArrowBuf for the batch body data.
      Parameters:
      in - Channel to read a DictionaryBatch message and data from
      allocator - BufferAllocator to allocate an Arrow buffer to read message body data
      Returns:
      the deserialized ArrowDictionaryBatch
      Throws:
      IOException - on error
    • deserializeDictionaryBatch

      public static ArrowDictionaryBatch deserializeDictionaryBatch(ReadChannel in, ArrowBlock block, BufferAllocator alloc) throws IOException
      Deserializes a DictionaryBatch knowing the size of the entire message up front. This minimizes the number of reads to the underlying stream.
      Parameters:
      in - where to read from
      block - block metadata for deserializing
      alloc - to allocate new buffers
      Returns:
      the deserialized ArrowDictionaryBatch
      Throws:
      IOException - if something went wrong
    • deserializeMessageBatch

      public static ArrowMessage deserializeMessageBatch(MessageChannelReader reader) throws IOException
      Deserialize a message that is either an ArrowDictionaryBatch or ArrowRecordBatch.
      Parameters:
      reader - MessageChannelReader to read a sequence of messages from a ReadChannel
      Returns:
      the deserialized ArrowMessage, either an ArrowDictionaryBatch or an ArrowRecordBatch
      Throws:
      IOException - if the message is not an ArrowDictionaryBatch or ArrowRecordBatch
    • deserializeMessageBatch

      public static ArrowMessage deserializeMessageBatch(ReadChannel in, BufferAllocator alloc) throws IOException
      Deserialize a message that is either an ArrowDictionaryBatch or ArrowRecordBatch.
      Parameters:
      in - ReadChannel to read messages from
      alloc - Allocator for message data
      Returns:
      the deserialized ArrowMessage, either an ArrowDictionaryBatch or an ArrowRecordBatch
      Throws:
      IOException - if the message is not an ArrowDictionaryBatch or ArrowRecordBatch
    • serializeMessage

      @Deprecated public static ByteBuffer serializeMessage(com.google.flatbuffers.FlatBufferBuilder builder, byte headerType, int headerOffset, long bodyLength)
      Deprecated.
    • serializeMessage

      public static ByteBuffer serializeMessage(com.google.flatbuffers.FlatBufferBuilder builder, byte headerType, int headerOffset, long bodyLength, IpcOption writeOption)
      Serializes a message header.
      Parameters:
      builder - to write the flatbuf to
      headerType - headerType field
      headerOffset - header offset field
      bodyLength - body length field
      writeOption - IPC write options
      Returns:
      the corresponding ByteBuffer
    • readMessage

      public static MessageMetadataResult readMessage(ReadChannel in) throws IOException
      Reads a Message from the input channel and returns a MessageMetadataResult that contains the Message metadata, the buffer holding the serialized Message metadata as read, and the length of the Message in bytes. Returns null if the end of the stream has been reached.
      Parameters:
      in - ReadChannel to read messages from
      Returns:
      MessageMetadataResult with deserialized Message metadata and message information if a valid Message was read, or null if end-of-stream
      Throws:
      IOException - on error
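A null return at end-of-stream lends itself to a simple read loop. The sketch below illustrates only that contract and is not the library's implementation: ReadLoopSketch is a hypothetical class, a ByteBuffer stands in for the ReadChannel, each message is assumed to be a 4-byte little-endian length followed by that many bytes, and running out of input is what the sketch treats as end-of-stream.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Arrow.
final class ReadLoopSketch {
  /** Returns the next length-prefixed message, or null once the input is exhausted. */
  static ByteBuffer readNext(ByteBuffer in) {
    if (in.remaining() < 4) {
      return null; // end of stream: no further prefix to read
    }
    int start = in.position();
    int length = in.duplicate().order(ByteOrder.LITTLE_ENDIAN).getInt();

    // Build a view over the message bytes, then advance the input past them.
    ByteBuffer message = in.duplicate();
    message.position(start + 4);
    message.limit(start + 4 + length);
    in.position(start + 4 + length);
    return message.slice();
  }

  /** Collects messages until readNext signals end-of-stream with null. */
  static List<ByteBuffer> readAll(ByteBuffer in) {
    List<ByteBuffer> messages = new ArrayList<>();
    ByteBuffer next;
    while ((next = readNext(in)) != null) {
      messages.add(next);
    }
    return messages;
  }

  public static void main(String[] args) {
    byte[] stream = {3, 0, 0, 0, 10, 11, 12, 1, 0, 0, 0, 13};
    System.out.println(readAll(ByteBuffer.wrap(stream)).size()); // 2
  }
}
```

Callers of the real method would check the result for null in the same way before using the returned MessageMetadataResult.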
    • readMessageBody

      public static ArrowBuf readMessageBody(ReadChannel in, long bodyLength, BufferAllocator allocator) throws IOException
      Read a Message body from the in channel into an ArrowBuf.
      Parameters:
      in - ReadChannel to read message body from
      bodyLength - Length in bytes of the message body to read
      allocator - BufferAllocator used to allocate the ArrowBuf that holds the message body data
      Returns:
      an ArrowBuf containing the message body data
      Throws:
      IOException - on error
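Reading a body of known length from a channel usually needs a loop, because a single read may return fewer bytes than requested. The following is a minimal sketch of that idea using java.nio only; ReadBodySketch is a hypothetical class, a heap ByteBuffer stands in for the ArrowBuf, the length is an int rather than a long, and failing with EOFException on a short stream is an assumption of the sketch rather than documented behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper for illustration only; not part of Arrow.
final class ReadBodySketch {
  /** Reads exactly bodyLength bytes from the channel, or fails if it ends early. */
  static ByteBuffer readBody(ReadableByteChannel in, int bodyLength) throws IOException {
    ByteBuffer body = ByteBuffer.allocate(bodyLength);
    while (body.hasRemaining()) {
      // A single read may return fewer bytes than requested, so loop until full.
      if (in.read(body) < 0) {
        throw new EOFException("channel ended after " + body.position() + " bytes");
      }
    }
    body.flip(); // make the bytes just read available to the caller
    return body;
  }

  public static void main(String[] args) throws IOException {
    ReadableByteChannel channel =
        Channels.newChannel(new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5}));
    System.out.println(readBody(channel, 3).remaining()); // 3
  }
}
```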