Bibin A Chundatt created YARN-4140:
--------------------------------------
Summary: RM container allocation delayed in case of app submitted
to Nodelabel partition
Key: YARN-4140
URL: https://issues.apache.org/jira/browse/YARN-4140
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
While running an application on a Nodelabel partition, I found that application
execution is delayed by 5 to 10 minutes for 500 containers. The cluster has 3 machines
in total; 2 machines were in the same partition, and the app was submitted to that partition.
After enabling debug logging, I found the following:
# From the AM, the container ask is for OFF_SWITCH.
# The RM allocates all containers as NODE_LOCAL, as shown in the logs below.
# With about 500 containers, it took about 6 minutes
to allocate the map containers after the AM allocation.
# Tested with about 1K maps using the PI job: it took 17 minutes to allocate the next
container after the AM allocation.
Once the 500 NODE_LOCAL container allocations are done, the next container
allocation is done on OFF_SWITCH.
{code}
2015-09-09 15:21:58,954 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
showRequests: application=application_1441791998224_0001 request={Priority:
20, Capability: <memory:512, vCores:1>, # Containers: 500, Location:
/default-rack, Relax Locality: true, Node Label Expression: }
2015-09-09 15:21:58,954 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
showRequests: application=application_1441791998224_0001 request={Priority:
20, Capability: <memory:512, vCores:1>, # Containers: 500, Location: *, Relax
Locality: true, Node Label Expression: 3}
2015-09-09 15:21:58,954 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
showRequests: application=application_1441791998224_0001 request={Priority:
20, Capability: <memory:512, vCores:1>, # Containers: 500, Location:
host-10-19-92-143, Relax Locality: true, Node Label Expression: }
2015-09-09 15:21:58,954 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
showRequests: application=application_1441791998224_0001 request={Priority:
20, Capability: <memory:512, vCores:1>, # Containers: 500, Location:
host-10-19-92-117, Relax Locality: true, Node Label Expression: }
2015-09-09 15:21:58,954 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0,
numApps=1, numContainers=1 --> <memory:0, vCores:0>, NODE_LOCAL
{code}
{code}
2015-09-09 14:35:45,467 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0,
numApps=1, numContainers=1 --> <memory:0, vCores:0>, NODE_LOCAL
2015-09-09 14:35:45,831 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0,
numApps=1, numContainers=1 --> <memory:0, vCores:0>, NODE_LOCAL
2015-09-09 14:35:46,469 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0,
numApps=1, numContainers=1 --> <memory:0, vCores:0>, NODE_LOCAL
2015-09-09 14:35:46,832 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0,
numApps=1, numContainers=1 --> <memory:0, vCores:0>, NODE_LOCAL
{code}
{code}
dsperf@host-127:/opt/bibin/dsperf/HAINSTALL/install/hadoop/resourcemanager/logs1>
cat hadoop-dsperf-resourcemanager-host-127.log | grep "NODE_LOCAL" | grep
"root.b.b1" | wc -l
500
{code}
(Consumes about 6 minutes)
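The count from grep can be paired with timing taken from the same log lines. Below is a minimal Python sketch of that, not part of Hadoop: the function name summarize_assignments and its parameters are hypothetical. It assumes each RM log entry sits on a single line that begins with a timestamp such as 2015-09-09 14:35:45,467 (the excerpts above appear wrapped across lines, so they would need to be read from the original log file).

```python
import re
from datetime import datetime

# Assumption: one log entry per line, starting with "YYYY-MM-DD HH:MM:SS,mmm".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def summarize_assignments(lines, queue="root.b.b1", locality="NODE_LOCAL"):
    """Return (count, elapsed_seconds, mean_gap_seconds) for matching entries.

    A line matches when it contains both the queue name and the locality
    string, which mirrors the two grep filters used in the report.
    """
    stamps = []
    for line in lines:
        if queue not in line or locality not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            )
    if not stamps:
        return 0, 0.0, 0.0
    # Time between the earliest and latest matching entry.
    elapsed = (max(stamps) - min(stamps)).total_seconds()
    # Average spacing between consecutive matching entries.
    mean_gap = elapsed / (len(stamps) - 1) if len(stamps) > 1 else 0.0
    return len(stamps), elapsed, mean_gap
```

It could be called with an open file handle for the RM log, for example summarize_assignments(open(path)), where path points at the hadoop-dsperf-resourcemanager-host-127.log file used in the grep command. The elapsed value then gives the span between the first and last NODE_LOCAL assignment for the queue, which can be compared against the roughly 6 minutes observed.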