[
https://issues.apache.org/jira/browse/YARN-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823902#comment-13823902
]
gaurav gupta commented on YARN-1412:
------------------------------------
Here is a synopsis of the various combinations:

Node set   Rack set   Relax locality   Result
Yes        No         FALSE            I get the container back on the node, but then fallback doesn't work
Yes        No         TRUE             I don't get back the correct node
Yes        Yes        T/F              I don't get back the correct node
I am attaching the logs for the case where Node is Yes, Rack is No, and Relax locality is TRUE.
The containers for which it is not working are
container_1384534729839_0001_01_000002 and
container_1384534729839_0001_01_000004.
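To make the mismatch easier to spot in the log excerpt that follows, here is a minimal sketch (not part of the attached logs) that pulls the node and locality type out of each LeafQueue "assignContainers" debug entry. The class name AssignmentScan and the method scan are hypothetical helpers; the sketch assumes the entry layout shown in these logs and uses only the Java standard library. For the two containers above it would be expected to report an OFF_SWITCH assignment rather than a node-local one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: scans RM debug log text for
// LeafQueue "assignContainers" entries and reports node and locality type.
class AssignmentScan {
    // Assumes the layout "assignContainers: node=<host> ... type=<TYPE>".
    // DOTALL lets one entry span several wrapped lines, as in a mail body.
    private static final Pattern ASSIGN = Pattern.compile(
        "assignContainers:\\s+node=(\\S+).*?type=(\\w+)", Pattern.DOTALL);

    /** Returns one "host -> TYPE" string per assignContainers entry found. */
    static List<String> scan(String logText) {
        List<String> out = new ArrayList<>();
        Matcher m = ASSIGN.matcher(logText);
        while (m.find()) {
            out.add(m.group(1) + " -> " + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two entries copied from the logs in this comment, joined onto single lines.
        String log =
            "assignContainers: node=node8.morado.com application=1 priority=0 "
            + "request={Priority: 0, Capability: <memory:8192, vCores:1>, "
            + "# Containers: 1, Location: *, Relax Locality: true} type=OFF_SWITCH\n"
            + "assignContainers: node=node7.morado.com application=1 priority=2 "
            + "request={Priority: 2, Capability: <memory:8192, vCores:1>, "
            + "# Containers: 1, Location: *, Relax Locality: true} type=OFF_SWITCH\n";
        for (String line : scan(log)) {
            System.out.println(line);
        }
    }
}
```

Comparing the reported hosts with the "Location:" values in the showRequests entries (node10.morado.com at priority 0, node18.morado.com at priority 2) shows which requests were satisfied away from the requested node.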
2013-11-15 09:00:38,116 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(335)) - showRequests:
application=application_1384534729839_0001 headRoom=<memory:9091072, vCores:0>
currentConsumption=2048
2013-11-15 09:00:38,116 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(339)) - showRequests:
application=application_1384534729839_0001 request={Priority: 0, Capability:
<memory:8192, vCores:1>, # Containers: 1, Location: /default-rack, Relax
Locality: true}
2013-11-15 09:00:38,116 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(339)) - showRequests:
application=application_1384534729839_0001 request={Priority: 0, Capability:
<memory:8192, vCores:1>, # Containers: 1, Location: *, Relax Locality: true}
2013-11-15 09:00:38,116 IPC Server handler 43 on 8031 DEBUG
security.UserGroupInformation
(UserGroupInformation.java:logPrivilegedAction(1513)) - PrivilegedAction
as:hadoop (auth:SIMPLE)
from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
2013-11-15 09:00:38,116 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(339)) - showRequests:
application=application_1384534729839_0001 request={Priority: 0, Capability:
<memory:8192, vCores:1>, # Containers: 1, Location: node10.morado.com, Relax
Locality: true}
2013-11-15 09:00:38,117 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(335)) - showRequests:
application=application_1384534729839_0001 headRoom=<memory:9091072, vCores:0>
currentConsumption=2048
2013-11-15 09:00:38,117 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(339)) - showRequests:
application=application_1384534729839_0001 request={Priority: 1, Capability:
<memory:8192, vCores:1>, # Containers: 1, Location: *, Relax Locality: true}
2013-11-15 09:00:38,117 AsyncDispatcher event handler DEBUG
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(125)) - Dispatching the
event
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeStatusEvent.EventType:
STATUS_UPDATE
2013-11-15 09:00:38,117 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(335)) - showRequests:
application=application_1384534729839_0001 headRoom=<memory:9091072, vCores:0>
currentConsumption=2048
2013-11-15 09:00:38,117 AsyncDispatcher event handler DEBUG rmnode.RMNodeImpl
(RMNodeImpl.java:handle(354)) - Processing node6.morado.com:39327 of type
STATUS_UPDATE
2013-11-15 09:00:38,117 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(339)) - showRequests:
application=application_1384534729839_0001 request={Priority: 2, Capability:
<memory:8192, vCores:1>, # Containers: 1, Location: /default-rack, Relax
Locality: true}
2013-11-15 09:00:38,117 AsyncDispatcher event handler DEBUG
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(125)) - Dispatching the
event
org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeUpdateSchedulerEvent.EventType:
NODE_UPDATE
2013-11-15 09:00:38,118 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(339)) - showRequests:
application=application_1384534729839_0001 request={Priority: 2, Capability:
<memory:8192, vCores:1>, # Containers: 1, Location: *, Relax Locality: true}
2013-11-15 09:00:38,118 ResourceManager Event Processor DEBUG
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:showRequests(339)) - showRequests:
application=application_1384534729839_0001 request={Priority: 2, Capability:
<memory:8192, vCores:1>, # Containers: 1, Location: node18.morado.com, Relax
Locality: true}
2013-11-15 09:00:38,118 ResourceManager Event Processor DEBUG
capacity.LeafQueue (LeafQueue.java:computeUserLimit(1056)) - User limit
computation for gaurav in queue default userLimit=100 userLimitFactor=1.0
required: <memory:8192, vCores:1> consumed: <memory:2048, vCores:1> limit:
<memory:9093120, vCores:1> queueCapacity: <memory:9093120, vCores:1> qconsumed:
<memory:2048, vCores:1> currentCapacity: <memory:9093120, vCores:1>
activeUsers: 1 clusterCapacity: <memory:9093120, vCores:296>
2013-11-15 09:00:38,118 ResourceManager Event Processor DEBUG
capacity.LeafQueue (LeafQueue.java:computeUserLimitAndSetHeadroom(989)) -
Headroom calculation for user gaurav: userLimit=<memory:9093120, vCores:1>
queueMaxCap=<memory:9093120, vCores:1> consumed=<memory:2048, vCores:1>
headroom=<memory:9091072, vCores:0>
2013-11-15 09:00:38,118 ResourceManager Event Processor DEBUG
capacity.LeafQueue (LeafQueue.java:assignContainer(1306)) - assignContainers:
node=node8.morado.com application=1 priority=0 request={Priority: 0,
Capability: <memory:8192, vCores:1>, # Containers: 1, Location: *, Relax
Locality: true} type=OFF_SWITCH
2013-11-15 09:00:38,119 ResourceManager Event Processor DEBUG
security.BaseContainerTokenSecretManager
(BaseContainerTokenSecretManager.java:createPassword(90)) - Creating password
for container_1384534729839_0001_01_000002 for user
container_1384534729839_0001_01_000002 (auth:SIMPLE) to be run on NM
node8.morado.com:51530
2013-11-15 09:00:38,119 ResourceManager Event Processor DEBUG
security.ContainerTokenIdentifier (ContainerTokenIdentifier.java:write(112)) -
Writing ContainerTokenIdentifier to RPC layer:
org.apache.hadoop.yarn.security.ContainerTokenIdentifier@77c5b2de
2013-11-15 09:00:38,120 ResourceManager Event Processor DEBUG
security.ContainerTokenIdentifier (ContainerTokenIdentifier.java:write(112)) -
Writing ContainerTokenIdentifier to RPC layer:
org.apache.hadoop.yarn.security.ContainerTokenIdentifier@77c5b2de
2013-11-15 09:00:38,120 ResourceManager Event Processor DEBUG
scheduler.AppSchedulingInfo (AppSchedulingInfo.java:allocate(377)) - allocate:
applicationId=application_1384534729839_0001
container=container_1384534729839_0001_01_000002 host=node8.morado.com:51530
2013-11-15 09:00:38,120 ResourceManager Event Processor DEBUG
scheduler.AppSchedulingInfo (AppSchedulingInfo.java:allocate(265)) - allocate:
user: gaurav, memory: <memory:8192, vCores:1>
2013-11-15 09:00:38,120 ResourceManager Event Processor DEBUG
rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(208)) - Processing
container_1384534729839_0001_01_000002 of type START
2013-11-15 09:00:38,120 ResourceManager Event Processor INFO
rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(220)) -
container_1384534729839_0001_01_000002 Container Transitioned from NEW to
ALLOCATED
2013-11-15 09:00:38,146 ResourceManager Event Processor DEBUG
capacity.LeafQueue (LeafQueue.java:computeUserLimit(1056)) - User limit
computation for gaurav in queue default userLimit=100 userLimitFactor=1.0
required: <memory:8192, vCores:1> consumed: <memory:18432, vCores:3> limit:
<memory:9093120, vCores:1> queueCapacity: <memory:9093120, vCores:1> qconsumed:
<memory:18432, vCores:3> currentCapacity: <memory:9093120, vCores:1>
activeUsers: 1 clusterCapacity: <memory:9093120, vCores:296>
2013-11-15 09:00:38,146 ResourceManager Event Processor DEBUG
capacity.LeafQueue (LeafQueue.java:computeUserLimitAndSetHeadroom(989)) -
Headroom calculation for user gaurav: userLimit=<memory:9093120, vCores:1>
queueMaxCap=<memory:9093120, vCores:1> consumed=<memory:18432, vCores:3>
headroom=<memory:9074688, vCores:-2>
2013-11-15 09:00:38,146 ResourceManager Event Processor DEBUG
capacity.LeafQueue (LeafQueue.java:assignContainer(1306)) - assignContainers:
node=node7.morado.com application=1 priority=2 request={Priority: 2,
Capability: <memory:8192, vCores:1>, # Containers: 1, Location: *, Relax
Locality: true} type=OFF_SWITCH
2013-11-15 09:00:38,147 ResourceManager Event Processor DEBUG
security.BaseContainerTokenSecretManager
(BaseContainerTokenSecretManager.java:createPassword(90)) - Creating password
for container_1384534729839_0001_01_000004 for user
container_1384534729839_0001_01_000004 (auth:SIMPLE) to be run on NM
node7.morado.com:36087
2013-11-15 09:00:38,147 ResourceManager Event Processor DEBUG
security.ContainerTokenIdentifier (ContainerTokenIdentifier.java:write(112)) -
Writing ContainerTokenIdentifier to RPC layer:
org.apache.hadoop.yarn.security.ContainerTokenIdentifier@2f566b7d
2013-11-15 09:00:38,147 ResourceManager Event Processor DEBUG
security.ContainerTokenIdentifier (ContainerTokenIdentifier.java:write(112)) -
Writing ContainerTokenIdentifier to RPC layer:
org.apache.hadoop.yarn.security.ContainerTokenIdentifier@2f566b7d
2013-11-15 09:00:38,147 ResourceManager Event Processor DEBUG
scheduler.AppSchedulingInfo (AppSchedulingInfo.java:allocate(377)) - allocate:
applicationId=application_1384534729839_0001
container=container_1384534729839_0001_01_000004 host=node7.morado.com:36087
2013-11-15 09:00:38,148 ResourceManager Event Processor DEBUG
scheduler.ActiveUsersManager
(ActiveUsersManager.java:deactivateApplication(94)) - User gaurav removed from
activeUsers, currently: 0
2013-11-15 09:00:38,148 ResourceManager Event Processor DEBUG
scheduler.AppSchedulingInfo (AppSchedulingInfo.java:allocate(265)) - allocate:
user: gaurav, memory: <memory:8192, vCores:1>
2013-11-15 09:00:38,148 ResourceManager Event Processor DEBUG
rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(208)) - Processing
container_1384534729839_0001_01_000004 of type START
2013-11-15 09:00:38,148 ResourceManager Event Processor INFO
rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(220)) -
container_1384534729839_0001_01_000004 Container Transitioned from NEW to
ALLOCATED
> Allocating Containers on a particular Node in Yarn
> --------------------------------------------------
>
> Key: YARN-1412
> URL: https://issues.apache.org/jira/browse/YARN-1412
> Project: Hadoop YARN
> Issue Type: Bug
> Environment: centos, Hadoop 2.2.0
> Reporter: gaurav gupta
>
> I am trying to allocate containers on a particular node in Yarn, but Yarn is
> returning containers on a different node even though the requested node has
> resources available.
> Here is the snippet of code that I am using:
> AMRMClient<ContainerRequest> amRmClient = AMRMClient.createAMRMClient();
> String host = "h1";
> Resource capability = Records.newRecord(Resource.class);
> capability.setMemory(memory);
> nodes = new String[] {host};
> // in order to request a host, we also have to request the rack
> racks = new String[] {"/default-rack"};
> List<ContainerRequest> containerRequests = new ArrayList<ContainerRequest>();
> List<ContainerId> releasedContainers = new ArrayList<ContainerId>();
> containerRequests.add(new ContainerRequest(capability, nodes, racks,
>     Priority.newInstance(priority)));
> if (containerRequests.size() > 0) {
>   LOG.info("Asking RM for containers: " + containerRequests);
>   for (ContainerRequest cr : containerRequests) {
>     LOG.info("Requested container: {}", cr.toString());
>     amRmClient.addContainerRequest(cr);
>   }
> }
> for (ContainerId containerId : releasedContainers) {
>   LOG.info("Released container, id={}", containerId.getId());
>   amRmClient.releaseAssignedContainer(containerId);
> }
> return amRmClient.allocate(0);