package shuffle

Package Members

  1. package checksum
  2. package protocol

Type Members

  1. trait BlockFetchingListener extends BlockTransferListener
  2. trait BlockPushingListener extends BlockTransferListener

    Callback to handle block push success and failure. This interface and BlockFetchingListener are unified under BlockTransferListener to allow code reuse for handling block push and fetch retry.

  3. abstract class BlockStoreClient extends Closeable

    Provides an interface for reading both shuffle files and RDD blocks, either from an Executor or external service.

  4. trait BlockTransferListener extends EventListener

    This interface unifies both BlockFetchingListener and BlockPushingListener under a single interface to allow code reuse, while also keeping the existing public interface to facilitate backward compatibility.

  5. class Constants extends AnyRef
  6. trait DownloadFile extends AnyRef

    A handle on the file used when fetching remote data to disk. Used to ensure the lifecycle of writing the data, reading it back, and then cleaning it up is followed. Specific implementations may also handle encryption. The data can be read only via DownloadFileWritableChannel, which ensures data is not read until after the writer is closed.

  7. trait DownloadFileManager extends AnyRef

    A manager that creates temporary block files used when fetching remote data, to reduce memory usage. It cleans up these files once they are no longer needed.

  8. trait DownloadFileWritableChannel extends WritableByteChannel

    A channel for writing fetched data to disk that allows access to the written data only after the writer has been closed. Used with DownloadFile and DownloadFileManager.
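
    The write, close, read, and clean-up lifecycle described for DownloadFile, DownloadFileManager, and DownloadFileWritableChannel can be sketched with plain JDK classes. This is a minimal illustration, not the Spark implementation: DownloadChannelSketch, closeAndRead, and roundTrip are hypothetical names chosen for this example, and encryption is ignored.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a write-then-read channel; not a Spark class.
final class DownloadChannelSketch implements WritableByteChannel {
    private final Path file;
    private final FileChannel channel;

    DownloadChannelSketch(Path file) throws IOException {
        this.file = file;
        this.channel = FileChannel.open(file,
            StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
    }

    @Override public int write(ByteBuffer src) throws IOException { return channel.write(src); }
    @Override public boolean isOpen() { return channel.isOpen(); }
    @Override public void close() throws IOException { channel.close(); }

    // The only read path: the writer is closed before any byte is exposed.
    byte[] closeAndRead() throws IOException {
        close();
        return Files.readAllBytes(file);
    }

    // Full lifecycle: create a temp file, write, read back, then delete it.
    static String roundTrip(String payload) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("fetch-sketch", ".data");
            try (DownloadChannelSketch ch = new DownloadChannelSketch(tmp)) {
                ByteBuffer buf = ByteBuffer.wrap(payload.getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                return new String(ch.closeAndRead(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                try { Files.deleteIfExists(tmp); } catch (IOException ignored) { }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("block-bytes"));
    }
}
```

    Exposing the data only through closeAndRead is what prevents a reader from seeing a partially written file in this sketch.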

  9. trait ErrorHandler extends AnyRef

    Plugs into RetryingBlockTransferor to further control when an exception should be retried and logged.

    Note: RetryingBlockTransferor will delegate the exception to this handler only when:

      - remaining retries < max retries
      - the exception is an IOException

    Annotations
    @Evolving()
    Since

    3.1.0

  10. class ExecutorDiskUtils extends AnyRef
  11. class ExternalBlockHandler extends RpcHandler with MergedBlockMetaReqHandler

    RPC Handler for a server which can serve both RDD blocks and shuffle blocks from outside of an Executor process.

    Handles registering executors and opening shuffle or disk-persisted RDD blocks from them. Blocks are registered with the "one-for-one" strategy, meaning each Transport-layer Chunk is equivalent to one block.

  12. class ExternalBlockStoreClient extends BlockStoreClient

    Client for reading both RDD blocks and shuffle blocks from an external server (one outside of any executor). It is used instead of reading blocks directly from other executors (via BlockTransferService), which has the downside of losing the data if the executors are lost.

  13. class ExternalShuffleBlockResolver extends AnyRef

    Manages converting shuffle BlockIds into physical segments of local files, from a process outside of Executors. Each Executor must register its own configuration about where it stores its files (local dirs) and how (shuffle manager). The logic for retrieval of individual files is replicated from Spark's IndexShuffleBlockResolver.

  14. trait MergeFinalizerListener extends EventListener

    :: DeveloperApi ::

    Listener providing a callback function to invoke when the driver receives the response for the finalize shuffle merge request sent to the remote shuffle service.

    Since

    3.1.0

  15. class MergedBlockMeta extends AnyRef

    Contains meta information for a merged block. Currently this information constitutes:

      1. Number of chunks in a merged shuffle block.
      2. Bitmaps for each chunk in the merged block. A chunk bitmap contains all the mapIds that were merged to that merged block chunk.

    Since

    3.1.0
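
    The two pieces of information held by MergedBlockMeta (a chunk count and one bitmap of mapIds per chunk) can be modelled as follows. This is a sketch under assumptions: java.util.BitSet stands in for whatever bitmap type the real class uses, and MergedBlockMetaSketch, chunkContaining, and bitmapOf are hypothetical names introduced for illustration only.

```java
import java.util.BitSet;

// Hypothetical model of merged-block metadata; not the Spark class.
final class MergedBlockMetaSketch {
    // One bitmap per chunk; bit m is set when mapId m was merged into that chunk.
    private final BitSet[] chunkBitmaps;

    MergedBlockMetaSketch(BitSet... chunkBitmaps) {
        this.chunkBitmaps = chunkBitmaps;
    }

    int numChunks() {
        return chunkBitmaps.length;
    }

    // Returns the index of the first chunk containing mapId, or -1 if none does.
    int chunkContaining(int mapId) {
        for (int i = 0; i < chunkBitmaps.length; i++) {
            if (chunkBitmaps[i].get(mapId)) {
                return i;
            }
        }
        return -1;
    }

    // Convenience builder for a bitmap with the given mapIds set.
    static BitSet bitmapOf(int... mapIds) {
        BitSet bits = new BitSet();
        for (int id : mapIds) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        MergedBlockMetaSketch meta =
            new MergedBlockMetaSketch(bitmapOf(0, 2), bitmapOf(1, 3));
        System.out.println("chunks: " + meta.numChunks());
        System.out.println("mapId 3 is in chunk " + meta.chunkContaining(3));
    }
}
```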

  16. trait MergedBlocksMetaListener extends EventListener

    Listener for receiving success or failure events when fetching meta of merged blocks.

    Since

    3.2.0

  17. trait MergedShuffleFileManager extends AnyRef

    The MergedShuffleFileManager is used to process push-based shuffle when enabled. It works alongside ExternalBlockHandler and serves as an RPCHandler for org.apache.spark.network.server.RpcHandler#receiveStream, where it processes the remotely pushed streams of shuffle blocks to merge them into merged shuffle files. Right now, support for push-based shuffle is only implemented for the external shuffle service in YARN mode.

    Annotations
    @Evolving()
    Since

    3.1.0

  18. class NoOpMergedShuffleFileManager extends MergedShuffleFileManager

    Dummy implementation of merged shuffle file manager. Suitable for when push-based shuffle is not enabled.

    Since

    3.1.0

  19. class OneForOneBlockFetcher extends AnyRef

    Simple wrapper on top of a TransportClient which interprets each chunk as a whole block, and invokes the BlockFetchingListener appropriately. This class is agnostic to the actual RPC handler, as long as there is a single "open blocks" message which returns a ShuffleStreamHandle, and Java serialization is used.

    Note that this typically corresponds to a org.apache.spark.network.server.OneForOneStreamManager on the server side.
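
    The "each chunk is a whole block" convention used by OneForOneBlockFetcher amounts to indexing the requested block IDs by chunk position. The sketch below illustrates that mapping only. OneForOneSketch, its Listener interface, and the block ID strings are hypothetical stand-ins, not the Spark types, and no network transport is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the one-for-one chunk-to-block mapping.
final class OneForOneSketch {
    // Simplified stand-in for a fetch listener callback.
    interface Listener {
        void onSuccess(String blockId, byte[] data);
    }

    // Chunk i of the stream is treated as the complete contents of blockIds[i].
    static void onChunk(String[] blockIds, int chunkIndex, byte[] chunk, Listener listener) {
        listener.onSuccess(blockIds[chunkIndex], chunk);
    }

    // Feeds every chunk through onChunk and records the size reported per block.
    static Map<String, Integer> fetchSizes(String[] blockIds, byte[][] chunks) {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        for (int i = 0; i < chunks.length; i++) {
            onChunk(blockIds, i, chunks[i], (id, data) -> sizes.put(id, data.length));
        }
        return sizes;
    }

    public static void main(String[] args) {
        String[] ids = {"shuffle_0_0_0", "shuffle_0_1_0"};
        byte[][] chunks = {new byte[3], new byte[5]};
        System.out.println(fetchSizes(ids, chunks));
    }
}
```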

  20. class OneForOneBlockPusher extends AnyRef

    Similar to OneForOneBlockFetcher, but for pushing blocks to a remote shuffle service to be merged instead of fetching them from remote shuffle services. This is used by ShuffleWriter when the block push process is initiated. The supplied BlockFetchingListener is used to handle the success or failure of pushing each block.

    Since

    3.1.0

  21. class RemoteBlockPushResolver extends MergedShuffleFileManager

    An implementation of MergedShuffleFileManager that provides the most essential shuffle service processing logic to support push-based shuffle.

    Since

    3.1.0

  22. class RetryingBlockTransferor extends AnyRef

    Wraps another BlockFetcher or BlockPusher with the ability to automatically retry block transfers which fail due to IOExceptions, which we hope are due to transient network conditions.

    This transferor provides stronger guarantees regarding the parent BlockTransferListener. In particular, the listener will be invoked exactly once per blockId, with a success or failure.
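
    The retry behaviour and the exactly-once listener guarantee can be sketched for a single block as below. This is a simplified model, not the Spark implementation: RetrySketch, Attempt, RetryPolicy, and flaky are hypothetical names, RetryPolicy plays the role that ErrorHandler is described as playing, and the retry condition is read here as "retries used so far is below the maximum", which is an interpretation. The real class also handles many blocks; this sketch retries immediately and covers only one.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-block model of retrying transfers; not the Spark class.
final class RetrySketch {
    // Stand-in for one fetch or push attempt.
    interface Attempt {
        void run() throws Exception;
    }

    // Stand-in for an error-handler hook that can veto a retry.
    interface RetryPolicy {
        boolean shouldRetry(Throwable t);
    }

    // Returns the listener events emitted; by construction exactly one per call.
    static List<String> transfer(String blockId, int maxRetries,
                                 Attempt attempt, RetryPolicy policy) {
        List<String> events = new ArrayList<>();
        int retries = 0;
        while (true) {
            try {
                attempt.run();
                events.add("success:" + blockId);
                return events;
            } catch (Exception e) {
                // The policy is consulted only for IOExceptions with retries left.
                boolean retryable = e instanceof IOException
                    && retries < maxRetries
                    && policy.shouldRetry(e);
                if (!retryable) {
                    events.add("failure:" + blockId);
                    return events;
                }
                retries++;
            }
        }
    }

    // Builds an attempt that throws IOException for its first `failures` calls.
    static Attempt flaky(int failures) {
        int[] calls = {0};
        return () -> {
            if (calls[0]++ < failures) {
                throw new IOException("simulated transient failure");
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(transfer("b0", 3, flaky(2), t -> true));
        System.out.println(transfer("b1", 1, flaky(2), t -> true));
    }
}
```

    Because every exit path adds one event and returns immediately, the caller sees a single success or failure per block regardless of how many attempts were made.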

  23. class ShuffleIndexInformation extends AnyRef

    Keeps the index information for a particular map output as an in-memory LongBuffer.

  24. class ShuffleIndexRecord extends AnyRef

    Contains offset and length of the shuffle block data.
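
    How an in-memory LongBuffer index yields an offset and length per block, as described for ShuffleIndexInformation and ShuffleIndexRecord, can be sketched as follows. The layout is an assumption made for this example: the buffer is taken to hold one cumulative offset per block plus a final end offset, so a block's length is the difference between adjacent entries. ShuffleIndexSketch, Record, indexOf, and recordFor are hypothetical names, not the Spark API.

```java
import java.nio.LongBuffer;

// Hypothetical model of a shuffle index lookup; not the Spark classes.
final class ShuffleIndexSketch {
    // Stand-in for an index record: where a block starts and how long it is.
    static final class Record {
        final long offset;
        final long length;

        Record(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Builds an index of cumulative offsets from per-block lengths.
    // Assumed layout: lengths.length + 1 entries, starting at 0.
    static LongBuffer indexOf(long... lengths) {
        LongBuffer offsets = LongBuffer.allocate(lengths.length + 1);
        long running = 0L;
        offsets.put(running);
        for (long len : lengths) {
            running += len;
            offsets.put(running);
        }
        return offsets;
    }

    // Looks up block `index` using absolute reads, so buffer position is irrelevant.
    static Record recordFor(LongBuffer offsets, int index) {
        long start = offsets.get(index);
        long end = offsets.get(index + 1);
        return new Record(start, end - start);
    }

    public static void main(String[] args) {
        LongBuffer index = indexOf(10L, 25L, 5L);
        Record r = recordFor(index, 1);
        System.out.println("offset=" + r.offset + " length=" + r.length);
    }
}
```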

  25. class SimpleDownloadFile extends DownloadFile

    A DownloadFile that does not take any encryption settings into account for reading and writing data.

    This does *not* mean the data in the file is unencrypted -- it could be that the data is already encrypted when it's written, and a subsequent layer is responsible for decrypting it.
