Class AbstractArrowResultChunk
- java.lang.Object
-
- com.databricks.jdbc.api.impl.arrow.AbstractArrowResultChunk
-
- Direct Known Subclasses:
ArrowResultChunk, ArrowResultChunkV2
public abstract class AbstractArrowResultChunk extends Object
An abstract class that represents a chunk of a query result. This class provides methods for downloading, processing, and releasing the data in the chunk. It also manages the state of the chunk and provides access to the data as Arrow record batches.
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
protected List<String> arrowMetadata
protected long chunkIndex
protected ExternalLink chunkLink
protected CompletableFuture<Void> chunkReadyFuture
    Future to track when the chunk becomes ready for consumption.
protected int chunkReadyTimeoutSeconds
protected String errorMessage
protected Instant expiryTime
protected long numRows
protected List<List<org.apache.arrow.vector.ValueVector>> recordBatchList
protected org.apache.arrow.memory.BufferAllocator rootAllocator
protected long rowOffset
protected static Integer SECONDS_BUFFER_FOR_EXPIRY
protected ArrowResultChunkStateMachine stateMachine
protected StatementId statementId
-
Constructor Summary
Constructors (Modifier, Constructor, Description)
protected AbstractArrowResultChunk(long numRows, long rowOffset, long chunkIndex, StatementId statementId, ChunkStatus initialStatus, ExternalLink chunkLink, Instant expiryTime, int chunkReadyTimeoutSeconds)
-
Method Summary
Methods (Modifier and Type, Method, Description)
protected abstract void downloadData(IDatabricksHttpClient httpClient, CompressionCodec compressionCodec, double speedThreshold)
    Downloads and initializes data for this chunk using the provided HTTP client and compression codec.
protected List<String> getArrowMetadata()
Long getChunkIndex()
    Returns the index of this chunk.
protected ArrowResultChunkIterator getChunkIterator()
    Returns an iterator for traversing the rows in this chunk.
protected CompletableFuture<Void> getChunkReadyFuture()
protected org.apache.arrow.vector.ValueVector getColumnVector(int recordBatchIndex, int columnIndex)
    Returns the value vector for a specific record batch and column.
protected long getNumRows()
    Returns the total number of rows in the chunk.
protected int getRecordBatchCountInChunk()
    Returns the number of record batches in the chunk.
protected List<List<org.apache.arrow.vector.ValueVector>> getRecordBatchList()
    Returns the list of record batches, where each record batch is a list of value vectors.
ChunkStatus getStatus()
    Returns the current status of the chunk.
protected abstract void handleFailure(Exception exception, ChunkStatus failedStatus)
    Handles a failure during the download or processing of this chunk.
protected void initializeData(InputStream inputStream)
    Decompresses the given InputStream and initializes recordBatchList from the decompressed stream.
boolean isChunkLinkInvalid()
    Checks if the chunk link is invalid or expired.
boolean releaseChunk()
    Releases all resources associated with this chunk.
void setChunkLink(ExternalLink chunk)
    Sets the external link details for this chunk.
protected void setStatus(ChunkStatus targetStatus)
    Updates the status of the chunk.
protected void waitForChunkReady()
    Waits for the chunk to be ready for consumption.
-
-
-
Field Detail
-
SECONDS_BUFFER_FOR_EXPIRY
protected static final Integer SECONDS_BUFFER_FOR_EXPIRY
-
numRows
protected final long numRows
-
rowOffset
protected final long rowOffset
-
chunkIndex
protected final long chunkIndex
-
statementId
protected final StatementId statementId
-
rootAllocator
protected final org.apache.arrow.memory.BufferAllocator rootAllocator
-
chunkReadyFuture
protected final CompletableFuture<Void> chunkReadyFuture
Future to track when the chunk becomes ready for consumption. This includes both the download and processing phases. The state of the Future is updated by the ChunkDownloadTask and indicates when the chunk's data is fully processed and available for use.
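The pattern described above, where a background task completes a `CompletableFuture<Void>` once download and processing have finished, can be sketched as follows. This is a minimal illustration only: `ChunkReadySketch` and `startTask` are hypothetical names, the task body is simulated, and nothing here reflects the actual `ChunkDownloadTask` implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for illustration; not the driver's ChunkDownloadTask.
class ChunkReadySketch {

    /** Runs a simulated download-and-process task and reports its outcome through the future. */
    static CompletableFuture<Void> startTask(ExecutorService executor, boolean shouldFail) {
        CompletableFuture<Void> chunkReadyFuture = new CompletableFuture<>();
        executor.submit(() -> {
            try {
                if (shouldFail) {
                    throw new IllegalStateException("simulated download failure");
                }
                // A real task would download and process the chunk data here.
                chunkReadyFuture.complete(null);
            } catch (Exception e) {
                // Waiters observe the failure as an ExecutionException.
                chunkReadyFuture.completeExceptionally(e);
            }
        });
        return chunkReadyFuture;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<Void> ready = startTask(executor, false);
            ready.get(5, TimeUnit.SECONDS); // blocks until the task completes the future
            System.out.println("done=" + ready.isDone());
        } finally {
            executor.shutdown();
        }
    }
}
```

Completing the future only after both phases means a consumer needs a single signal to know the data is usable, rather than tracking download and processing separately.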
-
stateMachine
protected final ArrowResultChunkStateMachine stateMachine
-
chunkLink
protected ExternalLink chunkLink
-
expiryTime
protected Instant expiryTime
-
errorMessage
protected String errorMessage
-
chunkReadyTimeoutSeconds
protected int chunkReadyTimeoutSeconds
-
-
Constructor Detail
-
AbstractArrowResultChunk
protected AbstractArrowResultChunk(long numRows, long rowOffset, long chunkIndex, StatementId statementId, ChunkStatus initialStatus, ExternalLink chunkLink, Instant expiryTime, int chunkReadyTimeoutSeconds)
-
-
Method Detail
-
getChunkIndex
public Long getChunkIndex()
Returns the index of this chunk.
Returns:
- chunk index
-
isChunkLinkInvalid
public boolean isChunkLinkInvalid()
Checks if the chunk link is invalid or expired.
Returns:
- true if link is invalid, false otherwise
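One plausible way to combine `expiryTime` with a safety buffer such as `SECONDS_BUFFER_FOR_EXPIRY` is sketched below. The buffer value of 60 seconds, the treatment of a missing expiry time as invalid, and the names `LinkExpirySketch` and `isLinkInvalid` are all assumptions made for illustration; this page does not state the actual value or logic.

```java
import java.time.Instant;

// Hypothetical sketch; the real check in isChunkLinkInvalid() is not shown in this page.
class LinkExpirySketch {
    // Assumed buffer; the actual SECONDS_BUFFER_FOR_EXPIRY value is not documented here.
    static final int BUFFER_SECONDS = 60;

    /** Treats a link as invalid when it is missing an expiry or expires within the buffer window. */
    static boolean isLinkInvalid(Instant expiryTime, Instant now) {
        if (expiryTime == null) {
            return true; // assumption: no expiry information means the link cannot be trusted
        }
        // Invalid unless the expiry, pulled forward by the buffer, is still in the future.
        return !expiryTime.minusSeconds(BUFFER_SECONDS).isAfter(now);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(isLinkInvalid(now.plusSeconds(30), now));
        System.out.println(isLinkInvalid(now.plusSeconds(600), now));
    }
}
```

Subtracting a buffer from the expiry time avoids starting a download with a link that would expire partway through the transfer.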
-
releaseChunk
public boolean releaseChunk()
Releases all resources associated with this chunk.
Returns:
- true if chunk was released, false if it was already released
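The return contract above (true on the first release, false on later calls) is a release-once pattern. A minimal sketch using an `AtomicBoolean` guard follows; the class `ReleaseOnceSketch` and its byte-array buffers are hypothetical, and the actual class may track release through its chunk status rather than a separate flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a release-once contract; not the driver's implementation.
class ReleaseOnceSketch {
    private final AtomicBoolean released = new AtomicBoolean(false);
    private final List<byte[]> buffers = new ArrayList<>();

    ReleaseOnceSketch() {
        buffers.add(new byte[1024]); // stand-in for memory held by the chunk
    }

    /** Frees held buffers; returns true only for the call that performs the release. */
    boolean release() {
        if (!released.compareAndSet(false, true)) {
            return false; // already released by an earlier call
        }
        buffers.clear();
        return true;
    }

    int bufferCount() {
        return buffers.size();
    }

    public static void main(String[] args) {
        ReleaseOnceSketch chunk = new ReleaseOnceSketch();
        System.out.println(chunk.release());
        System.out.println(chunk.release());
    }
}
```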
-
setChunkLink
public void setChunkLink(ExternalLink chunk)
Sets the external link details for this chunk.
Parameters:
- chunk - the external link information
-
getStatus
public ChunkStatus getStatus()
Returns the current status of the chunk.
Returns:
- current ChunkStatus
-
downloadData
protected abstract void downloadData(IDatabricksHttpClient httpClient, CompressionCodec compressionCodec, double speedThreshold) throws DatabricksParsingException, IOException
Downloads and initializes data for this chunk using the provided HTTP client and compression codec.
Parameters:
- httpClient - the HTTP client to use for downloading
- compressionCodec - the compression codec to use for decompression
- speedThreshold - the minimum expected download speed in MB/s for logging warnings
Throws:
- DatabricksParsingException - if there is an error parsing the data
- IOException - if there is an error downloading or reading the data
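To make the `speedThreshold` parameter concrete, the sketch below computes an observed download speed in MB/s and compares it with a threshold. The helper names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption; this page does not say how the driver measures speed or which unit definition it uses.

```java
// Hypothetical helper; the driver's own speed calculation is not shown in this page.
class DownloadSpeedSketch {

    /** Converts a byte count and elapsed time into MB/s (assuming 1 MB = 1024 * 1024 bytes). */
    static double megabytesPerSecond(long bytes, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            return Double.POSITIVE_INFINITY; // avoid division by zero for instantaneous reads
        }
        double seconds = elapsedNanos / 1_000_000_000.0;
        return (bytes / (1024.0 * 1024.0)) / seconds;
    }

    /** Returns true when the observed speed is below the threshold and a warning would be logged. */
    static boolean isBelowThreshold(long bytes, long elapsedNanos, double speedThreshold) {
        return megabytesPerSecond(bytes, elapsedNanos) < speedThreshold;
    }

    public static void main(String[] args) {
        long bytes = 10L * 1024 * 1024;      // 10 MB downloaded
        long elapsedNanos = 2_000_000_000L;  // in 2 seconds
        if (isBelowThreshold(bytes, elapsedNanos, 6.0)) {
            System.out.println("WARN: download slower than expected");
        }
    }
}
```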
-
handleFailure
protected abstract void handleFailure(Exception exception, ChunkStatus failedStatus) throws DatabricksParsingException
Handles a failure during the download or processing of this chunk.
Throws:
- DatabricksParsingException
-
getRecordBatchCountInChunk
protected int getRecordBatchCountInChunk()
Returns the number of record batches in the chunk.
Returns:
- number of record batches
-
getRecordBatchList
protected List<List<org.apache.arrow.vector.ValueVector>> getRecordBatchList()
Returns the list of record batches, where each record batch is a list of value vectors.
Returns:
- List of record batches
-
getNumRows
protected long getNumRows()
Returns the total number of rows in the chunk.
Returns:
- number of rows
-
getColumnVector
protected org.apache.arrow.vector.ValueVector getColumnVector(int recordBatchIndex, int columnIndex)
Returns the value vector for a specific record batch and column.
Parameters:
- recordBatchIndex - index of the record batch
- columnIndex - index of the column
Returns:
- ValueVector for the specified position
-
setStatus
protected void setStatus(ChunkStatus targetStatus)
Updates the status of the chunk.
Parameters:
- targetStatus - new status to set
-
getChunkIterator
protected ArrowResultChunkIterator getChunkIterator()
Returns an iterator for traversing the rows in this chunk.
Returns:
- ArrowResultChunkIterator for this chunk
-
getChunkReadyFuture
protected CompletableFuture<Void> getChunkReadyFuture()
-
waitForChunkReady
protected void waitForChunkReady() throws ExecutionException, InterruptedException, TimeoutException
Waits for the chunk to be ready for consumption.
Throws:
- ExecutionException - if the chunk download or processing throws an exception
- InterruptedException - if the thread is interrupted while waiting
- TimeoutException - if the chunk is not ready within the timeout
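The three exceptions listed above match those thrown by `CompletableFuture.get(long, TimeUnit)`, so a bounded wait on `chunkReadyFuture` using `chunkReadyTimeoutSeconds` is a natural reading of this method. The sketch below shows how a caller might distinguish the three outcomes; the class, the method `waitForReady`, and the returned labels are hypothetical, and whether the real method delegates to `get` in this way is an assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical caller-side sketch; not the driver's waitForChunkReady() implementation.
class WaitForReadySketch {

    /** Waits on the future with a bound and maps each outcome to a label. */
    static String waitForReady(CompletableFuture<Void> future, long timeout, TimeUnit unit) {
        try {
            future.get(timeout, unit);
            return "READY";
        } catch (ExecutionException e) {
            // The download or processing task failed; the cause carries the original error.
            return "FAILED: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag for callers
            return "INTERRUPTED";
        } catch (TimeoutException e) {
            return "TIMED_OUT";
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> neverReady = new CompletableFuture<>();
        System.out.println(waitForReady(neverReady, 100, TimeUnit.MILLISECONDS));
    }
}
```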
-
initializeData
protected void initializeData(InputStream inputStream) throws DatabricksSQLException, IOException
Decompresses the given InputStream and initializes recordBatchList from the decompressed stream.
Parameters:
- inputStream - the input stream to decompress
Throws:
- DatabricksSQLException - if decompression fails
- IOException - if reading from the stream fails
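The decompress-then-read flow can be illustrated with the JDK alone. The sketch below uses GZIP from `java.util.zip` purely as a stand-in codec, since this page does not name the codec the driver uses, and it stops at producing the decompressed bytes rather than parsing Arrow record batches. `DecompressSketch` and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; GZIP is a stand-in for whatever codec the driver actually applies.
class DecompressSketch {

    /** Compresses bytes so the example has realistic input to decompress. */
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    /** Wraps the stream in a decompressor and reads it fully into memory. */
    static byte[] decompress(InputStream compressed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(compressed);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = gz.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "record batch bytes".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(new ByteArrayInputStream(compress(payload)));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

In the real class, the decompressed stream would then be read as Arrow data to populate `recordBatchList`; that step is omitted here because it depends on the Arrow library.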
-
-