[ https://issues.apache.org/jira/browse/YARN-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951561#comment-14951561 ]
Hudson commented on YARN-4140:
------------------------------
FAILURE: Integrated in Hadoop-Mapreduce-trunk #2452 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2452/])
YARN-4140. RM container allocation delayed incase of app submitted to (wangda:
rev def374e666ed0c1d665aeb1b7307e09769448138)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestNodeLabelContainerAllocation.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
> RM container allocation delayed incase of app submitted to Nodelabel partition
> ------------------------------------------------------------------------------
>
> Key: YARN-4140
> URL: https://issues.apache.org/jira/browse/YARN-4140
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: scheduler
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4140.patch, 0002-YARN-4140.patch,
> 0003-YARN-4140.patch, 0004-YARN-4140.patch, 0005-YARN-4140.patch,
> 0006-YARN-4140.patch, 0007-YARN-4140.patch, 0008-YARN-4140.patch,
> 0009-YARN-4140.patch, 0010-YARN-4140.patch, 0011-YARN-4140.patch,
> 0012-YARN-4140.patch, 0013-YARN-4140.patch, 0014-YARN-4140.patch
>
>
> While running an application on a Nodelabel partition, I found that
> application execution was delayed by 5 to 10 minutes for 500 containers.
> The cluster had 3 machines in total; 2 of them were in the same partition,
> and the app was submitted to that partition.
> After enabling debug logging, I found the following:
> # The container ask from the AM is for OFF_SWITCH.
> # The RM allocates all containers as NODE_LOCAL, as shown in the logs below.
> # With about 500 containers, it took about 6 minutes to allocate the first
> map after AM allocation.
> # A test with about 1K maps using the PI job took 17 minutes to allocate the
> next container after AM allocation.
> Once the 500 container allocations on NODE_LOCAL are done, the next container
> allocation is done on OFF_SWITCH.
> {code}
> 2015-09-09 15:21:58,954 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
> showRequests: application=application_1441791998224_0001 request={Priority:
> 20, Capability: <memory:512, vCores:1>, # Containers: 500, Location:
> /default-rack, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
> showRequests: application=application_1441791998224_0001 request={Priority:
> 20, Capability: <memory:512, vCores:1>, # Containers: 500, Location: *, Relax
> Locality: true, Node Label Expression: 3}
> 2015-09-09 15:21:58,954 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
> showRequests: application=application_1441791998224_0001 request={Priority:
> 20, Capability: <memory:512, vCores:1>, # Containers: 500, Location:
> host-10-19-92-143, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
> showRequests: application=application_1441791998224_0001 request={Priority:
> 20, Capability: <memory:512, vCores:1>, # Containers: 500, Location:
> host-10-19-92-117, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
> usedResources=<memory:0, vCores:0>, usedCapacity=0.0,
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> <memory:0,
> vCores:0>, NODE_LOCAL
> {code}
>
> {code}
> 2015-09-09 14:35:45,467 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
> usedResources=<memory:0, vCores:0>, usedCapacity=0.0,
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> <memory:0,
> vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:45,831 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
> usedResources=<memory:0, vCores:0>, usedCapacity=0.0,
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> <memory:0,
> vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:46,469 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
> usedResources=<memory:0, vCores:0>, usedCapacity=0.0,
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> <memory:0,
> vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:46,832 DEBUG
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5,
> usedResources=<memory:0, vCores:0>, usedCapacity=0.0,
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> <memory:0,
> vCores:0>, NODE_LOCAL
> {code}
> {code}
> dsperf@host-127:/opt/bibin/dsperf/HAINSTALL/install/hadoop/resourcemanager/logs1>
> cat hadoop-dsperf-resourcemanager-host-127.log | grep "NODE_LOCAL" | grep
> "root.b.b1" | wc -l
> 500
> {code}
>
> (These 500 NODE_LOCAL assignments take about 6 minutes.)
>
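The grep count in the description can be extended to also report how long the NODE_LOCAL assignments took. The following is only a minimal sketch, not part of the patch: it assumes each log entry sits on a single line in the ResourceManager log (the entries quoted above are wrapped by the mailer) and starts with a timestamp in the format shown there. The function name summarize_node_local and the shortened sample lines are hypothetical.

```python
from datetime import datetime

# Timestamp layout used at the start of each log entry,
# e.g. "2015-09-09 14:35:45,467".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def summarize_node_local(lines, queue="root.b.b1"):
    """Return (count, elapsed_seconds) for NODE_LOCAL entries of a queue."""
    stamps = []
    for line in lines:
        if "NODE_LOCAL" in line and queue in line:
            # The first 23 characters hold the timestamp.
            stamps.append(datetime.strptime(line[:23], TS_FORMAT))
    if not stamps:
        return 0, 0.0
    return len(stamps), (max(stamps) - min(stamps)).total_seconds()


# Shortened stand-ins for real log entries (hypothetical sample data).
sample = [
    "2015-09-09 14:35:45,467 DEBUG ParentQueue: Assigned to queue: root.b.b1 NODE_LOCAL",
    "2015-09-09 14:35:46,832 DEBUG ParentQueue: Assigned to queue: root.b.b1 NODE_LOCAL",
    "2015-09-09 14:35:47,000 DEBUG ParentQueue: Assigned to queue: root.a.a1 OFF_SWITCH",
]
count, seconds = summarize_node_local(sample)
print(count, seconds)
```

To run it against a real log, pass an open file object instead of the sample list, for example summarize_node_local(open(path)) with path pointing at the ResourceManager log.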