class ExternalBlockStoreClient extends BlockStoreClient
Client for reading both RDD blocks and shuffle blocks from an external server, that is, a server running outside the executor. This replaces reading blocks directly from other executors (via BlockTransferService), which has the downside of losing the data if those executors are lost.
Instance Constructors
- new ExternalBlockStoreClient(conf: TransportConf, secretKeyHolder: SecretKeyHolder, authEnabled: Boolean, registrationTimeoutMs: Long)
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def checkInit(): Unit
- Attributes
- protected[shuffle]
- Definition Classes
- BlockStoreClient
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- def close(): Unit
- Definition Classes
- ExternalBlockStoreClient → Closeable → AutoCloseable
- Annotations
- @Override()
- def diagnoseCorruption(host: String, port: Int, execId: String, shuffleId: Int, mapId: Long, reduceId: Int, checksum: Long, algorithm: String): Cause
Send the diagnosis request for the corrupted shuffle block to the server.
- host
the host of the remote node.
- port
the port of the remote node.
- execId
the executor id.
- shuffleId
the shuffleId of the corrupted shuffle block
- mapId
the mapId of the corrupted shuffle block
- reduceId
the reduceId of the corrupted shuffle block
- checksum
the shuffle checksum calculated on the client side for the corrupted shuffle block
- returns
The cause of the shuffle block corruption
- Definition Classes
- BlockStoreClient
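The checksum and algorithm parameters above imply that the caller computes a checksum over the block bytes before sending the diagnosis request. The following is a minimal sketch of that client-side step using only the JDK. The class name `ChecksumSketch`, the method `checksumOf`, and the accepted algorithm names ("ADLER32" and "CRC32") are assumptions made for illustration and are not taken from this page.

```java
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical helper, not part of Spark: computes a checksum over block bytes. */
final class ChecksumSketch {
  private ChecksumSketch() {}

  /**
   * Returns the checksum of blockBytes for the named algorithm.
   * The algorithm names accepted here are assumptions for this sketch.
   */
  static long checksumOf(byte[] blockBytes, String algorithm) {
    Checksum checksum;
    switch (algorithm) {
      case "ADLER32":
        checksum = new Adler32();
        break;
      case "CRC32":
        checksum = new CRC32();
        break;
      default:
        throw new IllegalArgumentException("Unsupported algorithm: " + algorithm);
    }
    // Feed the whole block in one call; large blocks could be fed in chunks instead.
    checksum.update(blockBytes, 0, blockBytes.length);
    return checksum.getValue();
  }
}
```

The resulting long and the algorithm name would then be the values passed as checksum and algorithm, so that the server can recompute the checksum the same way and compare.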
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def fetchBlocks(host: String, port: Int, execId: String, blockIds: Array[String], listener: BlockFetchingListener, downloadFileManager: DownloadFileManager): Unit
Fetch a sequence of blocks from a remote node asynchronously.
Note that this API takes a sequence so that the implementation can batch requests. It does not return a future, so the underlying implementation can invoke onBlockFetchSuccess as soon as the data of a block is fetched, rather than waiting for all blocks to be fetched.
- host
the host of the remote node.
- port
the port of the remote node.
- execId
the executor id.
- blockIds
block ids to fetch.
- listener
the listener to receive block fetching status.
- downloadFileManager
DownloadFileManager to create and clean temp files. If it's not null, the remote blocks will be streamed into temp shuffle files to reduce the memory usage; otherwise, they will be kept in memory.
- Definition Classes
- ExternalBlockStoreClient → BlockStoreClient
- Annotations
- @Override()
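Because fetchBlocks reports each block through the listener instead of returning a future, a caller that needs to know when the whole batch has finished has to track that itself. The sketch below shows one way to do so. `SketchFetchListener` is a hypothetical, simplified stand-in for BlockFetchingListener (it takes a byte array where the real listener receives a buffer object), and `CollectingFetchListener` is likewise invented for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified stand-in for BlockFetchingListener. */
interface SketchFetchListener {
  void onBlockFetchSuccess(String blockId, byte[] data);

  void onBlockFetchFailure(String blockId, Throwable cause);
}

/** Records each block's outcome as it arrives and lets a caller wait for the whole batch. */
final class CollectingFetchListener implements SketchFetchListener {
  final Map<String, byte[]> fetched = new ConcurrentHashMap<>();
  final Map<String, Throwable> failed = new ConcurrentHashMap<>();
  private final CountDownLatch remaining;

  CollectingFetchListener(int blockCount) {
    this.remaining = new CountDownLatch(blockCount);
  }

  @Override
  public void onBlockFetchSuccess(String blockId, byte[] data) {
    // Invoked once per block, possibly long before the rest of the batch arrives.
    fetched.put(blockId, data);
    remaining.countDown();
  }

  @Override
  public void onBlockFetchFailure(String blockId, Throwable cause) {
    failed.put(blockId, cause);
    remaining.countDown();
  }

  /** Returns true if every block reported success or failure before the timeout. */
  boolean awaitAll(long timeout, TimeUnit unit) {
    try {
      return remaining.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

A listener of this shape can start processing early blocks immediately, while awaitAll gives a simple barrier for code that needs the complete result.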
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- def finalizeShuffleMerge(host: String, port: Int, shuffleId: Int, shuffleMergeId: Int, listener: MergeFinalizerListener): Unit
Invoked by Spark driver to notify external shuffle services to finalize the shuffle merge for a given shuffle. This allows the driver to start the shuffle reducer stage after properly finishing the shuffle merge process associated with the shuffle mapper stage.
- host
host of shuffle server
- port
port of shuffle server.
- shuffleId
shuffle ID of the shuffle to be finalized
- shuffleMergeId
shuffleMergeId uniquely identifies the merge process of a shuffle for an indeterminate stage attempt.
- listener
the listener to receive MergeStatuses
- Definition Classes
- ExternalBlockStoreClient → BlockStoreClient
- Annotations
- @Override()
- Since
3.1.0
- def getAppAttemptId(): String
- Definition Classes
- BlockStoreClient
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def getHostLocalDirs(host: String, port: Int, execIds: Array[String], hostLocalDirsCompletable: CompletableFuture[Map[String, Array[String]]]): Unit
Request the local disk directories for executors located on the same host as the current BlockStoreClient (which can be an ExternalBlockStoreClient or a NettyBlockTransferService).
- host
the host of BlockManager or ExternalShuffleService. It should be the same host as the current BlockStoreClient.
- port
the port of BlockManager or ExternalShuffleService.
- execIds
a collection of executor Ids specifying the target executors whose local directories are requested. There can be multiple executor Ids if BlockStoreClient is implemented by ExternalBlockStoreClient, since the request handler, ExternalShuffleService, can serve multiple executors on the same node. There is only one executor Id if BlockStoreClient is implemented by NettyBlockTransferService.
- hostLocalDirsCompletable
a CompletableFuture which contains a map from executor Id to its local directories if the request handler replies successfully. Otherwise, it contains a specific error.
- Definition Classes
- BlockStoreClient
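Since the result of getHostLocalDirs is delivered through the hostLocalDirsCompletable argument rather than a return value, the caller creates the future, passes it in, and then reacts to its completion. A minimal sketch of the consuming side follows. `HostLocalDirsSketch` and `summarize` are hypothetical names, and the map type is the Java equivalent of the one in the signature above.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper, not part of Spark: reacts to a host-local-dirs reply. */
final class HostLocalDirsSketch {
  private HostLocalDirsSketch() {}

  /**
   * Derives a short summary from the future that would be passed as
   * hostLocalDirsCompletable. Both outcomes are handled: a map from executor
   * Id to local directories, or an error.
   */
  static CompletableFuture<String> summarize(CompletableFuture<Map<String, String[]>> dirs) {
    return dirs.handle((dirsByExecutor, error) -> {
      if (error != null) {
        return "failed: " + error.getMessage();
      }
      int totalDirs = 0;
      for (String[] executorDirs : dirsByExecutor.values()) {
        totalDirs += executorDirs.length;
      }
      return dirsByExecutor.size() + " executors, " + totalDirs + " dirs";
    });
  }
}
```

In real use the client would complete the future when the request handler replies; in a test it can be completed by hand with `complete` or `completeExceptionally`.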
- def getMergedBlockMeta(host: String, port: Int, shuffleId: Int, shuffleMergeId: Int, reduceId: Int, listener: MergedBlocksMetaListener): Unit
Get the meta information of a merged block from the remote shuffle service.
- host
the host of the remote node.
- port
the port of the remote node.
- shuffleId
shuffle id.
- shuffleMergeId
shuffleMergeId uniquely identifies the merge process of a shuffle for an indeterminate stage attempt.
- reduceId
reduce id.
- listener
the listener to receive chunk counts.
- Definition Classes
- ExternalBlockStoreClient → BlockStoreClient
- Annotations
- @Override()
- Since
3.2.0
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def init(appId: String): Unit
Initializes the BlockStoreClient, specifying this Executor's appId. Must be called before any other method on the BlockStoreClient.
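The requirement that init precede every other call is the kind of contract a guard method such as checkInit (listed above) can enforce. The sketch below illustrates the general pattern with a hypothetical class, `InitGuardSketch`. It is not the library's implementation, and the choice of IllegalStateException is an assumption of this example.

```java
import java.util.Objects;

/** Hypothetical class illustrating an init-before-use guard; not Spark code. */
final class InitGuardSketch {
  // Set exactly once by init(); null means the client is not yet usable.
  private volatile String appId;

  void init(String appId) {
    this.appId = Objects.requireNonNull(appId, "appId");
  }

  /** Fails fast when a method is called before init(). */
  private void checkInit() {
    if (appId == null) {
      throw new IllegalStateException("Called before init()");
    }
  }

  /** Stand-in for any operation that needs the application id. */
  String describe() {
    checkInit();
    return "client for " + appId;
  }
}
```

Calling the guard at the top of every public operation turns a forgotten init into an immediate, descriptive failure rather than a later null dereference.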
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- def pushBlocks(host: String, port: Int, blockIds: Array[String], buffers: Array[ManagedBuffer], listener: BlockPushingListener): Unit
Push a sequence of shuffle blocks in a best-effort manner to a remote node asynchronously. These shuffle blocks, along with blocks pushed by other clients, will be merged into per-shuffle partition merged shuffle files on the destination node.
- host
the host of the remote node.
- port
the port of the remote node.
- blockIds
block ids to be pushed
- buffers
buffers to be pushed
- listener
the listener to receive block push status.
- Definition Classes
- ExternalBlockStoreClient → BlockStoreClient
- Annotations
- @Override()
- Since
3.1.0
- def registerWithShuffleServer(host: String, port: Int, execId: String, executorInfo: ExecutorShuffleInfo): Unit
Registers this executor with an external shuffle server. This registration is required to inform the shuffle server about where and how we store our shuffle files.
- host
Host of shuffle server.
- port
Port of shuffle server.
- execId
This Executor's id.
- executorInfo
Contains all info necessary for the service to find our shuffle files.
- def removeBlocks(host: String, port: Int, execId: String, blockIds: Array[String]): Future[Integer]
- def setAppAttemptId(appAttemptId: String): Unit
- Definition Classes
- ExternalBlockStoreClient → BlockStoreClient
- Annotations
- @Override()
- def shuffleMetrics(): MetricSet
Get the shuffle MetricSet from the BlockStoreClient. MetricsSystem uses it to collect the shuffle-related metrics.
- Definition Classes
- ExternalBlockStoreClient → BlockStoreClient
- Annotations
- @Override()
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()