trait MergedShuffleFileManager extends AnyRef
The MergedShuffleFileManager processes push-based shuffle when it is enabled. It works
alongside ExternalBlockHandler and serves as an RPC handler for
org.apache.spark.network.server.RpcHandler#receiveStream, where it processes the
remotely pushed streams of shuffle blocks and merges them into merged shuffle files.
Currently, push-based shuffle is supported only for the external shuffle service in
YARN mode.
- Annotations
- @Evolving()
- Since
3.1.0
Abstract Value Members
- abstract def applicationRemoved(appId: String, cleanupLocalDirs: Boolean): Unit
Invoked when an application finishes. This cleans up any remaining metadata associated with the application and optionally deletes the application-specific directory path.
- appId
application ID
- cleanupLocalDirs
flag indicating whether MergedShuffleFileManager should handle deletion of local dirs itself.
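The cleanup contract described above can be illustrated with a minimal sketch. It does not use Spark classes: `AppCleanupSketch`, `register`, and `isTracked` are hypothetical names introduced for illustration, and the in-memory map is an assumed stand-in for whatever metadata a real implementation keeps.

```scala
import java.io.File
import scala.collection.concurrent.TrieMap

// Hypothetical sketch, not a Spark class: tracks local dirs per application.
object AppCleanupSketch {
  private val apps = TrieMap.empty[String, Seq[File]]

  // Remember which directories belong to an application.
  def register(appId: String, localDirs: Seq[File]): Unit =
    apps.put(appId, localDirs)

  def isTracked(appId: String): Boolean = apps.contains(appId)

  // Always drops the metadata; deletes directories only when asked to.
  def applicationRemoved(appId: String, cleanupLocalDirs: Boolean): Unit =
    apps.remove(appId).foreach { dirs =>
      if (cleanupLocalDirs) dirs.foreach(deleteRecursively)
    }

  // listFiles() returns null for non-directories, hence the Option wrapper.
  private def deleteRecursively(f: File): Unit = {
    Option(f.listFiles()).foreach(_.foreach(deleteRecursively))
    f.delete()
  }
}
```

Passing `cleanupLocalDirs = false` leaves the directories in place, which suits deployments where another component owns their deletion.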
- abstract def finalizeShuffleMerge(msg: FinalizeShuffleMerge): MergeStatuses
Handles the request to finalize shuffle merge for a given shuffle.
- msg
contains appId and shuffleId to uniquely identify a shuffle to be finalized
- returns
The statuses of the merged shuffle partitions for the given shuffle on this shuffle service
- Exceptions thrown
- abstract def getMergedBlockData(appId: String, shuffleId: Int, shuffleMergeId: Int, reduceId: Int, chunkId: Int): ManagedBuffer
Get the buffer for a given merged shuffle chunk when serving merged shuffle to reducers.
- appId
application ID
- shuffleId
shuffle ID
- shuffleMergeId
uniquely identifies the merge process of a shuffle for an indeterminate stage attempt.
- reduceId
reducer ID
- chunkId
merged shuffle file chunk ID
- returns
The ManagedBuffer for the given merged shuffle chunk
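To make the idea of serving a single chunk concrete, here is a minimal sketch of locating a chunk inside a merged file by its `chunkId`. The layout is an assumption for illustration, not something stated on this page: chunk boundaries are kept as a list of byte offsets, and `ChunkIndex` and `ChunkReadSketch` are hypothetical names.

```scala
// Hypothetical stand-in: offsets(i) is where chunk i starts, and
// offsets.last marks the end of the final chunk.
final case class ChunkIndex(offsets: Vector[Long]) {
  require(offsets.nonEmpty, "index needs at least one offset")
  require(offsets.zip(offsets.tail).forall { case (a, b) => a <= b },
    "offsets must be non-decreasing")

  def numChunks: Int = offsets.length - 1

  // Returns (start offset, length) of the requested chunk.
  def chunkRange(chunkId: Int): (Long, Long) = {
    require(chunkId >= 0 && chunkId < numChunks, s"chunkId $chunkId out of range")
    val start = offsets(chunkId)
    (start, offsets(chunkId + 1) - start)
  }
}

object ChunkReadSketch {
  // Copies one chunk out of an in-memory stand-in for the merged data file.
  def readChunk(data: Array[Byte], index: ChunkIndex, chunkId: Int): Array[Byte] = {
    val (offset, length) = index.chunkRange(chunkId)
    java.util.Arrays.copyOfRange(data, offset.toInt, (offset + length).toInt)
  }
}
```

A real implementation would return a buffer backed by a file segment rather than copying bytes; the copy here only keeps the sketch self-contained.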
- abstract def getMergedBlockDirs(appId: String): Array[String]
Get the local directories that store the merged shuffle files.
- appId
application ID
- abstract def getMergedBlockMeta(appId: String, shuffleId: Int, shuffleMergeId: Int, reduceId: Int): MergedBlockMeta
Get the meta information of a merged block.
- appId
application ID
- shuffleId
shuffle ID
- shuffleMergeId
uniquely identifies the merge process of a shuffle for an indeterminate stage attempt.
- reduceId
reducer ID
- returns
meta information of a merged block
- abstract def receiveBlockDataAsStream(msg: PushBlockStream): StreamCallbackWithID
Provides the stream callback used to process a remotely pushed block. The callback is used by the org.apache.spark.network.client.StreamInterceptor installed on the channel to process the block data in the channel outside of the message frame.
- msg
metadata of the remotely pushed blocks. This is processed inside the message frame
- returns
A stream callback to process the block data in streaming fashion as it arrives
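The streaming behaviour described above can be sketched without Spark on the classpath. `StreamCallbackSketch` is a local stand-in with the general shape of a stream callback that carries an ID (data, completion, and failure hooks); it is not the Spark interface itself, and `BufferingCallback` and its `commit` parameter are hypothetical. The sketch assumes a block should become visible only once its stream completes.

```scala
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer

// Local stand-in for a stream callback with an ID; not the Spark interface.
trait StreamCallbackSketch {
  def getID: String
  def onData(streamId: String, buf: ByteBuffer): Unit
  def onComplete(streamId: String): Unit
  def onFailure(streamId: String, cause: Throwable): Unit
}

// Buffers one pushed block and hands it to `commit` only on completion.
final class BufferingCallback(id: String, commit: Array[Byte] => Unit)
    extends StreamCallbackSketch {
  private val pending = new ByteArrayOutputStream()

  def getID: String = id

  // Called once per arriving slice of the block.
  def onData(streamId: String, buf: ByteBuffer): Unit = {
    val bytes = new Array[Byte](buf.remaining())
    buf.get(bytes)
    pending.write(bytes, 0, bytes.length)
  }

  // The whole block has arrived, so it is safe to publish it.
  def onComplete(streamId: String): Unit = commit(pending.toByteArray)

  // Discard partial data so a failed push leaves nothing behind.
  def onFailure(streamId: String, cause: Throwable): Unit = pending.reset()
}
```

Buffering in memory is a simplification; an implementation that merges into files would more likely append to the target file and roll back to the last good position on failure.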
- abstract def registerExecutor(appId: String, executorInfo: ExecutorShuffleInfo): Unit
Registers an executor with MergedShuffleFileManager. The executor info provides the directories and the number of sub-dirs per dir so that MergedShuffleFileManager knows where to store and look for shuffle data for a given application. It is invoked by the RPC call made when an executor tries to register with the local shuffle service.
- appId
application ID
- executorInfo
The list of local dirs that this executor gets granted from NodeManager
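As an illustration of how a list of directories plus a sub-dir count can determine where a file lives, the sketch below hashes a file name to pick a local dir and a sub-dir. The hashing scheme is an assumption for illustration, and `ExecutorDirs`, `DirLayoutSketch`, and the file name used in the usage note are hypothetical.

```scala
import java.io.File

// Hypothetical stand-in carrying the two pieces of information described above.
final case class ExecutorDirs(localDirs: Vector[String], subDirsPerLocalDir: Int) {
  require(localDirs.nonEmpty, "at least one local dir is required")
  require(subDirsPerLocalDir > 0, "subDirsPerLocalDir must be positive")
}

object DirLayoutSketch {
  // Non-negative hash so the modulo below never yields a negative index.
  private def nonNegativeHash(s: String): Int = {
    val h = s.hashCode
    if (h == Int.MinValue) 0 else math.abs(h)
  }

  // Deterministically maps a file name to <localDir>/<two-digit hex subdir>/<fileName>.
  def resolve(info: ExecutorDirs, fileName: String): File = {
    val hash = nonNegativeHash(fileName)
    val localDir = info.localDirs(hash % info.localDirs.length)
    val subDir = (hash / info.localDirs.length) % info.subDirsPerLocalDir
    new File(new File(localDir, f"$subDir%02x"), fileName)
  }
}
```

Because the mapping depends only on the file name and the registered layout, a writer and a later reader resolve the same path, for example with `DirLayoutSketch.resolve(ExecutorDirs(Vector("/tmp/a", "/tmp/b"), 64), "merged_0.data")`.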
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()