Class AbstractArrowResultChunk

  • Direct Known Subclasses:
    ArrowResultChunk, ArrowResultChunkV2

    public abstract class AbstractArrowResultChunk
    extends Object
    An abstract class that represents a chunk of a query result.

    This class provides methods for downloading, processing, and releasing the data in the chunk. It also manages the state of the chunk and provides access to the data as Arrow record batches.

    • Field Detail

      • SECONDS_BUFFER_FOR_EXPIRY

        protected static final Integer SECONDS_BUFFER_FOR_EXPIRY
      • numRows

        protected final long numRows
      • rowOffset

        protected final long rowOffset
      • chunkIndex

        protected final long chunkIndex
      • rootAllocator

        protected final org.apache.arrow.memory.BufferAllocator rootAllocator
      • chunkReadyFuture

        protected final CompletableFuture<Void> chunkReadyFuture
        Future to track when the chunk becomes ready for consumption. This includes both the download and processing phases. The state of the Future is updated by the ChunkDownloadTask and indicates when the chunk's data is fully processed and available for use.
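        As a rough illustration of how a consumer might wait on such a future, the sketch below blocks until the future completes or a timeout elapses. The class ChunkReadySketch and the method awaitReady are hypothetical names used only for this example. Tying the timeout to chunkReadyTimeoutSeconds is an assumption, not something this page confirms.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the driver.
final class ChunkReadySketch {

    /**
     * Waits for the future to complete.
     * Returns true on normal completion, false on timeout, failure, or interruption.
     */
    static boolean awaitReady(CompletableFuture<Void> chunkReadyFuture, int timeoutSeconds) {
        try {
            chunkReadyFuture.get(timeoutSeconds, TimeUnit.SECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            // Timed out, or the producer completed the future exceptionally.
            return false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can observe it.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> ready = new CompletableFuture<>();
        ready.complete(null); // simulates a download task finishing successfully
        System.out.println("ready: " + awaitReady(ready, 1));

        CompletableFuture<Void> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("download failed"));
        System.out.println("failed: " + awaitReady(failed, 1));
    }
}
```

        A real consumer would likely surface the failure cause instead of collapsing it into a boolean; the boolean keeps the sketch short.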
      • recordBatchList

        protected List<List<org.apache.arrow.vector.ValueVector>> recordBatchList
      • expiryTime

        protected Instant expiryTime
      • errorMessage

        protected String errorMessage
      • arrowMetadata

        protected List<String> arrowMetadata
      • chunkReadyTimeoutSeconds

        protected int chunkReadyTimeoutSeconds
    • Constructor Detail

      • AbstractArrowResultChunk

        protected AbstractArrowResultChunk(long numRows,
                                           long rowOffset,
                                           long chunkIndex,
                                           StatementId statementId,
                                           ChunkStatus initialStatus,
                                           ExternalLink chunkLink,
                                           Instant expiryTime,
                                           int chunkReadyTimeoutSeconds)
    • Method Detail

      • getChunkIndex

        public Long getChunkIndex()
        Returns the index of this chunk.
        Returns:
        chunk index
      • isChunkLinkInvalid

        public boolean isChunkLinkInvalid()
        Checks if the chunk link is invalid or expired.
        Returns:
        true if the link is invalid or expired, false otherwise
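        One plausible way to implement an expiry check with a safety buffer is sketched below. This page does not say how expiryTime and SECONDS_BUFFER_FOR_EXPIRY are combined, so the logic here is an assumption. LinkExpirySketch, BUFFER_SECONDS, and its value of 60 are hypothetical.

```java
import java.time.Instant;

// Hypothetical helper, not part of the driver.
final class LinkExpirySketch {

    // Assumed buffer; the real value of SECONDS_BUFFER_FOR_EXPIRY is not documented here.
    static final int BUFFER_SECONDS = 60;

    /**
     * Treats the link as invalid when it has no expiry time, or when the current
     * time has reached the expiry time minus the buffer.
     */
    static boolean isLinkInvalid(Instant expiryTime, Instant now) {
        if (expiryTime == null) {
            return true; // assumption: a missing expiry means the link cannot be trusted
        }
        Instant effectiveExpiry = expiryTime.minusSeconds(BUFFER_SECONDS);
        return !now.isBefore(effectiveExpiry);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(isLinkInvalid(now.plusSeconds(3600), now));
        System.out.println(isLinkInvalid(now.plusSeconds(30), now));
    }
}
```

        Subtracting a buffer means a link that is about to expire is rejected early, before a download that would outlive it is started.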
      • releaseChunk

        public boolean releaseChunk()
        Releases all resources associated with this chunk.
        Returns:
        true if the chunk was released by this call, false if it was already released
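        The return contract above (true on the first release, false afterwards) matches a common idempotent-release pattern. The sketch below shows that pattern with an AtomicBoolean guard. ReleasableChunkSketch and its track method are hypothetical, and the driver's actual implementation may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a chunk that owns closeable resources.
final class ReleasableChunkSketch {

    private final AtomicBoolean released = new AtomicBoolean(false);
    private final List<AutoCloseable> resources = new ArrayList<>();

    /** Registers a resource to be closed on release (illustrative only). */
    void track(AutoCloseable resource) {
        resources.add(resource);
    }

    /** Returns true if this call performed the release, false if already released. */
    boolean releaseChunk() {
        // compareAndSet lets exactly one caller win, even under concurrent calls.
        if (!released.compareAndSet(false, true)) {
            return false;
        }
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                // Keep closing the remaining resources; a real implementation would log this.
            }
        }
        resources.clear();
        return true;
    }

    public static void main(String[] args) {
        ReleasableChunkSketch chunk = new ReleasableChunkSketch();
        chunk.track(() -> System.out.println("resource closed"));
        System.out.println(chunk.releaseChunk());
        System.out.println(chunk.releaseChunk());
    }
}
```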
      • setChunkLink

        public void setChunkLink(ExternalLink chunk)
        Sets the external link details for this chunk.
        Parameters:
        chunk - the external link information
      • getStatus

        public ChunkStatus getStatus()
        Returns the current status of the chunk.
        Returns:
        current ChunkStatus
      • downloadData

        protected abstract void downloadData(IDatabricksHttpClient httpClient,
                                             CompressionCodec compressionCodec,
                                             double speedThreshold)
                                      throws DatabricksParsingException,
                                             IOException
        Downloads and initializes data for this chunk using the provided HTTP client and compression codec.
        Parameters:
        httpClient - the HTTP client to use for downloading
        compressionCodec - the compression codec to use for decompression
        speedThreshold - the minimum expected download speed in MB/s for logging warnings
        Throws:
        DatabricksParsingException - if there is an error parsing the data
        IOException - if there is an error downloading or reading the data
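        To make the speedThreshold parameter concrete, the sketch below computes a download speed in MB/s and compares it with a threshold. DownloadSpeedSketch and its methods are hypothetical. Treating 1 MB as 1024 * 1024 bytes is an assumption, and the driver may measure speed differently.

```java
// Hypothetical helper, not part of the driver.
final class DownloadSpeedSketch {

    private static final double BYTES_PER_MB = 1024.0 * 1024.0; // assumption: binary megabytes
    private static final double NANOS_PER_SECOND = 1_000_000_000.0;

    /** Computes throughput in MB/s; elapsedNanos must be positive. */
    static double speedMBps(long bytesDownloaded, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsedNanos must be positive");
        }
        double megabytes = bytesDownloaded / BYTES_PER_MB;
        double seconds = elapsedNanos / NANOS_PER_SECOND;
        return megabytes / seconds;
    }

    /** Returns true when the measured speed falls below the threshold. */
    static boolean isBelowThreshold(long bytesDownloaded, long elapsedNanos, double speedThreshold) {
        return speedMBps(bytesDownloaded, elapsedNanos) < speedThreshold;
    }

    public static void main(String[] args) {
        long bytes = 10L * 1024 * 1024;  // 10 MB
        long elapsed = 2_000_000_000L;   // 2 seconds
        if (isBelowThreshold(bytes, elapsed, 6.0)) {
            System.out.println("warning: slow download, " + speedMBps(bytes, elapsed) + " MB/s");
        }
    }
}
```

        Per the parameter description, a slow download only triggers a logged warning; the sketch likewise reports the condition and does not abort.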
      • getRecordBatchCountInChunk

        protected int getRecordBatchCountInChunk()
        Returns the number of record batches in the chunk.
        Returns:
        number of record batches
      • getRecordBatchList

        protected List<List<org.apache.arrow.vector.ValueVector>> getRecordBatchList()
        Returns the list of record batches, where each record batch is a list of value vectors.
        Returns:
        list of record batches
      • getNumRows

        protected long getNumRows()
        Returns the total number of rows in the chunk.
        Returns:
        number of rows
      • getColumnVector

        protected org.apache.arrow.vector.ValueVector getColumnVector(int recordBatchIndex,
                                                                      int columnIndex)
        Returns the value vector for a specific record batch and column.
        Parameters:
        recordBatchIndex - index of the record batch
        columnIndex - index of the column
        Returns:
        ValueVector for the specified position
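        Since recordBatchList is a list of record batches and each batch is a list of column vectors, a lookup by (recordBatchIndex, columnIndex) presumably amounts to two nested list accesses. The sketch below shows that shape. It uses a generic type V in place of Arrow's ValueVector so it compiles without the Arrow library, and RecordBatchListSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in; V takes the place of org.apache.arrow.vector.ValueVector.
final class RecordBatchListSketch<V> {

    private final List<List<V>> recordBatchList;

    RecordBatchListSketch(List<List<V>> recordBatchList) {
        this.recordBatchList = recordBatchList;
    }

    /** Number of record batches held by the chunk. */
    int getRecordBatchCount() {
        return recordBatchList.size();
    }

    /** Outer index selects the record batch, inner index selects the column. */
    V getColumnVector(int recordBatchIndex, int columnIndex) {
        return recordBatchList.get(recordBatchIndex).get(columnIndex);
    }

    public static void main(String[] args) {
        RecordBatchListSketch<String> chunk = new RecordBatchListSketch<>(Arrays.asList(
                Arrays.asList("batch0-col0", "batch0-col1"),
                Arrays.asList("batch1-col0", "batch1-col1")));
        System.out.println(chunk.getRecordBatchCount());
        System.out.println(chunk.getColumnVector(1, 0));
    }
}
```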
      • setStatus

        protected void setStatus(ChunkStatus targetStatus)
        Updates the status of the chunk.
        Parameters:
        targetStatus - new status to set
      • getChunkIterator

        protected ArrowResultChunkIterator getChunkIterator()
        Returns an iterator for traversing the rows in this chunk.
        Returns:
        ArrowResultChunkIterator for this chunk
      • getArrowMetadata

        protected List<String> getArrowMetadata()