Class ArrowReader

java.lang.Object
org.apache.arrow.vector.ipc.ArrowReader
All Implemented Interfaces:
AutoCloseable, DictionaryProvider
Direct Known Subclasses:
ArrowFileReader, ArrowStreamReader

public abstract class ArrowReader extends Object implements DictionaryProvider, AutoCloseable
Abstract class to read Schema and ArrowRecordBatches.
  • Method Details

    • getVectorSchemaRoot

      public VectorSchemaRoot getVectorSchemaRoot() throws IOException
      Returns the vector schema root. This will be loaded with new values on every call to loadNextBatch.
      Returns:
      the vector schema root
      Throws:
      IOException - if reading of schema fails
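
      Because the root is reloaded on every call to loadNextBatch, a caller that needs values from an earlier batch has to copy them out before loading the next one. The sketch below models that contract with plain Java collections only. ReusedRootSketch, getRoot and collectCopies are hypothetical stand-ins invented for illustration, not Arrow classes or methods. With a concrete subclass such as ArrowStreamReader, the same loop would be written against getVectorSchemaRoot() and loadNextBatch().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a reader that reloads one shared root per batch. */
final class ReusedRootSketch {
    private final List<int[]> batches;                     // pending batches, in read order
    private final List<Integer> root = new ArrayList<>();  // stands in for the VectorSchemaRoot
    private int next = 0;

    ReusedRootSketch(List<int[]> batches) {
        this.batches = batches;
    }

    /** Always returns the same object, mirroring the documented reuse of the root. */
    List<Integer> getRoot() {
        return root;
    }

    /** Overwrites the shared root with the next batch; false once the batches are exhausted. */
    boolean loadNextBatch() {
        root.clear();
        if (next >= batches.size()) {
            return false;
        }
        for (int v : batches.get(next++)) {
            root.add(v);
        }
        return true;
    }

    /** Copies every batch out of the shared root before the next load replaces it. */
    static List<List<Integer>> collectCopies(List<int[]> batches) {
        ReusedRootSketch reader = new ReusedRootSketch(batches);
        List<Integer> root = reader.getRoot();   // fetched once, reused for every batch
        List<List<Integer>> copies = new ArrayList<>();
        while (reader.loadNextBatch()) {
            copies.add(new ArrayList<>(root));   // snapshot; the root itself will be overwritten
        }
        return copies;
    }
}
```

      Holding on to the list returned by getRoot() without copying would leave the caller with only the most recently loaded batch, which is the situation the snapshot avoids.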
    • getDictionaryVectors

      public Map<Long,Dictionary> getDictionaryVectors() throws IOException
      Returns any dictionaries that were loaded along with ArrowRecordBatches.
      Returns:
      Map of dictionary id to dictionary, empty if no dictionaries were loaded
      Throws:
      IOException - if reading of schema fails
    • lookup

      public Dictionary lookup(long id)
      Look up a loaded dictionary by its dictionary id.
      Specified by:
      lookup in interface DictionaryProvider
      Parameters:
      id - Unique identifier for a dictionary
      Returns:
      the requested dictionary or null if not found
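
      Since lookup returns null rather than throwing when the id is unknown, callers should check the result before decoding with it. The following is a minimal sketch of that check, assuming a dictionary can be modelled as a list of strings indexed by position. DictionaryLookupSketch, load and decode are hypothetical names used only for illustration and are not part of Arrow.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a dictionary provider keyed by dictionary id. */
final class DictionaryLookupSketch {
    private final Map<Long, List<String>> dictionaries = new HashMap<>();

    /** Registers a dictionary under the given id. */
    void load(long id, List<String> values) {
        dictionaries.put(id, values);
    }

    /** Returns the dictionary for the id, or null if none was loaded. */
    List<String> lookup(long id) {
        return dictionaries.get(id);
    }

    /** Decodes dictionary indices into values, guarding against a missing dictionary. */
    String[] decode(long id, int[] indices) {
        List<String> dictionary = lookup(id);
        if (dictionary == null) {
            throw new IllegalArgumentException("No dictionary loaded for id " + id);
        }
        String[] decoded = new String[indices.length];
        for (int i = 0; i < indices.length; i++) {
            decoded[i] = dictionary.get(indices[i]);
        }
        return decoded;
    }
}
```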
    • getDictionaryIds

      public Set<Long> getDictionaryIds()
      Description copied from interface: DictionaryProvider
      Get all dictionary IDs.
      Specified by:
      getDictionaryIds in interface DictionaryProvider
    • loadNextBatch

      public abstract boolean loadNextBatch() throws IOException
      Load the next ArrowRecordBatch to the vector schema root if available.
      Returns:
      true if a batch was read, false at end of stream (EOS)
      Throws:
      IOException - on error
    • bytesRead

      public abstract long bytesRead()
      Return the number of bytes read from the ReadChannel.
      Returns:
      number of bytes read
    • close

      public void close() throws IOException
      Close resources, including vector schema root and dictionary vectors, and the underlying read source.
      Specified by:
      close in interface AutoCloseable
      Throws:
      IOException - on error
    • close

      public void close(boolean closeReadSource) throws IOException
      Close resources, including vector schema root and dictionary vectors. If closeReadSource is true, the underlying read source is closed as well; otherwise it is left open.
      Parameters:
      closeReadSource - whether to also close the underlying read source
      Throws:
      IOException - on error
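
      The two close variants differ only in what happens to the read source. The sketch below models that difference with two boolean fields. CloseFlagSketch and its fields are hypothetical, and delegating close() to close(true) is an assumption drawn from the descriptions above, not a statement about the actual implementation.

```java
/** Hypothetical stand-in modelling the two close variants described above. */
final class CloseFlagSketch implements AutoCloseable {
    boolean vectorsClosed = false;  // stands in for the root and dictionary vectors
    boolean sourceClosed = false;   // stands in for the underlying read source

    /** Releases everything, including the read source. */
    @Override
    public void close() {
        close(true);
    }

    /** Always releases the vectors; closes the read source only when asked to. */
    public void close(boolean closeReadSource) {
        vectorsClosed = true;
        if (closeReadSource) {
            sourceClosed = true;
        }
    }
}
```

      Passing false is useful when the caller owns the source and wants to keep using it after the reader's vectors have been released.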
    • closeReadSource

      protected abstract void closeReadSource() throws IOException
      Close the underlying read source.
      Throws:
      IOException - on error
    • readSchema

      protected abstract Schema readSchema() throws IOException
      Read the Schema from the source. This is invoked at the beginning of initialization.
      Returns:
      the read Schema
      Throws:
      IOException - on error
    • ensureInitialized

      protected void ensureInitialized() throws IOException
      Initialize if not done previously.
      Throws:
      IOException - on error
    • initialize

      protected void initialize() throws IOException
      Reads the schema and initializes the vectors.
      Throws:
      IOException - on error
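
      Taken together, readSchema, initialize and ensureInitialized describe an initialize-once lifecycle: the subclass supplies the schema, and the base class reads it a single time on first use. The sketch below is a minimal model of that pattern, with a String standing in for the Schema. LazyInitSketch, getSchema and countSchemaReads are hypothetical names, and the guard logic is an assumption based on the descriptions above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for the initialize-once lifecycle described above. */
abstract class LazyInitSketch {
    private boolean initialized = false;
    private String schema;  // stands in for the Schema

    /** Subclasses supply the schema, as a concrete reader would in readSchema(). */
    protected abstract String readSchema() throws IOException;

    /** Reads the schema and prepares state derived from it. */
    protected void initialize() throws IOException {
        schema = readSchema();
    }

    /** Runs initialize() on first use only. */
    protected void ensureInitialized() throws IOException {
        if (!initialized) {
            initialize();
            initialized = true;
        }
    }

    /** Public accessor that triggers initialization lazily. */
    public String getSchema() throws IOException {
        ensureInitialized();
        return schema;
    }

    /** Calls getSchema() twice and reports how often readSchema() actually ran. */
    static int countSchemaReads() {
        final int[] reads = {0};
        LazyInitSketch reader = new LazyInitSketch() {
            @Override
            protected String readSchema() {
                reads[0]++;
                return "placeholder schema";
            }
        };
        try {
            reader.getSchema();
            reader.getSchema();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reads[0];
    }
}
```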
    • prepareLoadNextBatch

      protected void prepareLoadNextBatch() throws IOException
      Ensure the reader has been initialized and reset the VectorSchemaRoot row count to 0.
      Throws:
      IOException - on error
    • loadRecordBatch

      protected void loadRecordBatch(ArrowRecordBatch batch)
      Load an ArrowRecordBatch to the reader's VectorSchemaRoot.
      Parameters:
      batch - the record batch to load
    • loadDictionary

      protected void loadDictionary(ArrowDictionaryBatch dictionaryBatch)
      Load an ArrowDictionaryBatch to the reader's dictionary vectors.
      Parameters:
      dictionaryBatch - dictionary batch to load