org.apache.hadoop.hdfs.server.blockmanagement
Class BlockPlacementPolicyWithNodeGroup

java.lang.Object
  extended by org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
      extended by org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
          extended by org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithNodeGroup

public class BlockPlacementPolicyWithNodeGroup
extends org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault

This class is responsible for choosing the desired number of targets for placing block replicas in an environment with a node-group layer. The replica placement strategy is adjusted as follows: if the writer is on a datanode, the 1st replica is placed on the local node (or local node-group); otherwise it is placed on a random datanode. The 2nd replica is placed on a datanode on a different rack from the 1st replica's node. The 3rd replica is placed on a datanode in a different node-group but on the same rack as the 2nd replica's node.
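This policy is not active by default. In clusters with a node-group layer (for example, virtualized hosts) it is typically enabled in hdfs-site.xml together with a node-group-aware topology implementation. The property names below are taken from the Hadoop configuration; verify them against your release, as they are a hedged sketch rather than a complete deployment guide:

```xml
<!-- hdfs-site.xml: enable node-group-aware block placement (sketch) -->
<property>
  <name>dfs.block.replicator.classname</name>
  <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithNodeGroup</value>
</property>
<!-- core-site.xml: use a topology that understands the node-group layer -->
<property>
  <name>net.topology.impl</name>
  <value>org.apache.hadoop.net.NetworkTopologyWithNodeGroup</value>
</property>
<property>
  <name>net.topology.nodegroup.aware</name>
  <value>true</value>
</property>
```

With this configuration, the topology script must emit four-layer paths of the form /rack/nodegroup rather than the default /rack.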


Field Summary
 
Fields inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
clusterMap, considerLoad, heartbeatInterval, host2datanodeMap, tolerateHeartbeatMultiplier
 
Constructor Summary
protected BlockPlacementPolicyWithNodeGroup()
           
protected BlockPlacementPolicyWithNodeGroup(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.hdfs.server.namenode.FSClusterStats stats, org.apache.hadoop.net.NetworkTopology clusterMap, org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager datanodeManager)
           
 
Method Summary
protected  int addToExcludedNodes(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor chosenNode, Set<org.apache.hadoop.net.Node> excludedNodes)
          Find the other nodes in the same node group as the chosen node and add them to excludedNodes, since replicas should not be duplicated on nodes within the same node group.
protected  DatanodeStorageInfo chooseLocalRack(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<StorageType,Integer> storageTypes)
          Choose one node from the rack that localMachine is on.
protected  DatanodeStorageInfo chooseLocalStorage(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<StorageType,Integer> storageTypes, boolean fallbackToLocalRack)
          Choose the local node of localMachine as the target. If localMachine is not available, choose a node in the same node group or on the same rack instead.
protected  void chooseRemoteRack(int numOfReplicas, org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxReplicasPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<StorageType,Integer> storageTypes)
          Choose numOfReplicas nodes from the racks that localMachine is NOT on.
protected  String getRack(org.apache.hadoop.hdfs.protocol.DatanodeInfo cur)
          Get rack string from a data node
 void initialize(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.hdfs.server.namenode.FSClusterStats stats, org.apache.hadoop.net.NetworkTopology clusterMap, org.apache.hadoop.hdfs.server.blockmanagement.Host2NodesMap host2datanodeMap)
          Used to setup a BlockPlacementPolicy object.
 Collection<DatanodeStorageInfo> pickupReplicaSet(Collection<DatanodeStorageInfo> first, Collection<DatanodeStorageInfo> second)
          Pick the replica node set from which an over-replicated replica will be deleted.
 
Methods inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
chooseRandom, chooseRandom, chooseReplicaToDelete, chooseTarget, verifyBlockPlacement
 
Methods inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
adjustSetsWithChosenReplica, getInstance, splitNodesWithRack
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BlockPlacementPolicyWithNodeGroup

protected BlockPlacementPolicyWithNodeGroup(org.apache.hadoop.conf.Configuration conf,
                                            org.apache.hadoop.hdfs.server.namenode.FSClusterStats stats,
                                            org.apache.hadoop.net.NetworkTopology clusterMap,
                                            org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager datanodeManager)

BlockPlacementPolicyWithNodeGroup

protected BlockPlacementPolicyWithNodeGroup()
Method Detail

initialize

public void initialize(org.apache.hadoop.conf.Configuration conf,
                       org.apache.hadoop.hdfs.server.namenode.FSClusterStats stats,
                       org.apache.hadoop.net.NetworkTopology clusterMap,
                       org.apache.hadoop.hdfs.server.blockmanagement.Host2NodesMap host2datanodeMap)
Description copied from class: org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
Used to setup a BlockPlacementPolicy object. This should be defined by all implementations of a BlockPlacementPolicy.

Overrides:
initialize in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Parameters:
conf - the configuration object
stats - retrieve cluster status from here
clusterMap - cluster topology
host2datanodeMap - map from hosts to their datanode descriptors

chooseLocalStorage

protected DatanodeStorageInfo chooseLocalStorage(org.apache.hadoop.net.Node localMachine,
                                                 Set<org.apache.hadoop.net.Node> excludedNodes,
                                                 long blocksize,
                                                 int maxNodesPerRack,
                                                 List<DatanodeStorageInfo> results,
                                                 boolean avoidStaleNodes,
                                                 EnumMap<StorageType,Integer> storageTypes,
                                                 boolean fallbackToLocalRack)
                                          throws org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException
Choose the local node of localMachine as the target. If localMachine is not available, choose a node in the same node group or on the same rack instead.

Overrides:
chooseLocalStorage in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Returns:
the chosen node
Throws:
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException

chooseLocalRack

protected DatanodeStorageInfo chooseLocalRack(org.apache.hadoop.net.Node localMachine,
                                              Set<org.apache.hadoop.net.Node> excludedNodes,
                                              long blocksize,
                                              int maxNodesPerRack,
                                              List<DatanodeStorageInfo> results,
                                              boolean avoidStaleNodes,
                                              EnumMap<StorageType,Integer> storageTypes)
                                       throws org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException
Description copied from class: org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Choose one node from the rack that localMachine is on. If no such node is available, choose one node from the rack where a second replica resides. If still no such node is available, choose a random node in the cluster.

Overrides:
chooseLocalRack in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Returns:
the chosen node
Throws:
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException

chooseRemoteRack

protected void chooseRemoteRack(int numOfReplicas,
                                org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine,
                                Set<org.apache.hadoop.net.Node> excludedNodes,
                                long blocksize,
                                int maxReplicasPerRack,
                                List<DatanodeStorageInfo> results,
                                boolean avoidStaleNodes,
                                EnumMap<StorageType,Integer> storageTypes)
                         throws org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException
Choose numOfReplicas nodes from the racks that localMachine is NOT on. If not enough nodes are available, choose the remaining ones from the local rack.

Overrides:
chooseRemoteRack in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Throws:
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException

getRack

protected String getRack(org.apache.hadoop.hdfs.protocol.DatanodeInfo cur)
Description copied from class: org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
Get rack string from a data node

Overrides:
getRack in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
Returns:
rack of data node

addToExcludedNodes

protected int addToExcludedNodes(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor chosenNode,
                                 Set<org.apache.hadoop.net.Node> excludedNodes)
Find the other nodes in the same node group as the chosen node and add them to excludedNodes, since replicas should not be duplicated on nodes within the same node group.

Overrides:
addToExcludedNodes in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Returns:
number of new excluded nodes

pickupReplicaSet

public Collection<DatanodeStorageInfo> pickupReplicaSet(Collection<DatanodeStorageInfo> first,
                                                        Collection<DatanodeStorageInfo> second)
Pick the replica node set from which an over-replicated replica will be deleted. The first set contains replica nodes on racks that hold more than one replica; the second set contains the remaining replica nodes. If first is not empty, divide it into two subsets: moreThanOne, the nodes whose node group holds more than one replica, and exactlyOne, the remaining nodes in first; then pick moreThanOne if it is not empty, otherwise exactlyOne. If first is empty, pick from second.

Overrides:
pickupReplicaSet in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
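The selection rule above can be sketched in plain Java. This is a simplified, hypothetical illustration, not the Hadoop implementation: replicas are represented by their node-group strings instead of DatanodeStorageInfo objects, and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch of the pickupReplicaSet selection rule.
public class PickupReplicaSetSketch {
    /**
     * first:  replicas on racks that hold more than one replica
     * second: the remaining replicas
     * Returns the candidate set from which an over-replicated copy may be deleted.
     */
    public static Collection<String> pickup(Collection<String> first,
                                            Collection<String> second) {
        if (first.isEmpty()) {
            // No rack holds two replicas: delete from the remaining replicas.
            return second;
        }
        // Count replicas per node group within `first`.
        Map<String, Integer> perGroup = new HashMap<>();
        for (String nodeGroup : first) {
            perGroup.merge(nodeGroup, 1, Integer::sum);
        }
        // moreThanOne: replicas whose node group holds more than one replica;
        // exactlyOne:  the remaining nodes in `first`.
        List<String> moreThanOne = new ArrayList<>();
        List<String> exactlyOne = new ArrayList<>();
        for (String nodeGroup : first) {
            (perGroup.get(nodeGroup) > 1 ? moreThanOne : exactlyOne).add(nodeGroup);
        }
        // Prefer removing a replica from a node group that already holds two.
        return moreThanOne.isEmpty() ? exactlyOne : moreThanOne;
    }
}
```

The design intent mirrors the placement strategy: since a node group is a single failure/locality domain, a node group carrying two replicas is the safest place from which to remove one.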


Copyright © 2014 Apache Software Foundation. All Rights Reserved.