[
https://issues.apache.org/jira/browse/YARN-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433534#comment-15433534
]
Arun Suresh commented on YARN-5540:
-----------------------------------
Thanks for the patch [~jlowe]
Minor nits:
# You can remove the {{TODO: Shouldn't we activate even if numContainers = 0}}
since you are now taking care of it.
# You do not really need to pass the schedulerKey around, since you can extract
it from the request using {{SchedulerRequestKey::create(ResourceRequest)}}
(rough sketch after this list), but since some of the existing methods still
pass it around, it's not a must-fix for me.
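For the second point, something along these lines is what I had in mind. This is
only a sketch; the method name {{updateResourceRequest}} is illustrative rather
than an existing signature:
{code:java}
// Sketch only: derive the key from the request itself instead of threading it
// through the call chain. Assumes the SchedulerRequestKey.create(ResourceRequest)
// static factory mentioned above; the enclosing method name is illustrative.
void updateResourceRequest(ResourceRequest request) {
  SchedulerRequestKey schedulerKey = SchedulerRequestKey.create(request);
  // ... use schedulerKey exactly as the current code does ...
}
{code}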
Thinking out loud here... shouldn't we merge the 2 data structures
(the resourceRequestMap ConcurrentHashMap and the schedulerKeys TreeSet) into a
single ConcurrentSkipListMap and just return its keySet() when getSchedulerKeys()
is called?
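Roughly like the following; just a sketch, with illustrative field and method
names, and assuming SchedulerRequestKey is Comparable so it can key a
ConcurrentSkipListMap:
{code:java}
// Sketch only -- names are illustrative, not the exact ones in the current code.
// One sorted, concurrent map replaces both the ConcurrentHashMap and the TreeSet.
private final ConcurrentSkipListMap<SchedulerRequestKey, Map<String, ResourceRequest>>
    resourceRequestMap = new ConcurrentSkipListMap<>();

// getSchedulerKeys() just exposes the (already sorted) key set:
public Collection<SchedulerRequestKey> getSchedulerKeys() {
  return resourceRequestMap.keySet();
}

// Request lookup for a key stays a single map access:
public Map<String, ResourceRequest> getResourceRequests(SchedulerRequestKey key) {
  return resourceRequestMap.get(key);
}
{code}
Removing a key once its request map drains to empty would then automatically drop
it from the iteration order as well.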
> scheduler spends too much time looking at empty priorities
> ----------------------------------------------------------
>
> Key: YARN-5540
> URL: https://issues.apache.org/jira/browse/YARN-5540
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler, fairscheduler, resourcemanager
> Affects Versions: 2.7.2
> Reporter: Nathan Roberts
> Assignee: Jason Lowe
> Attachments: YARN-5540.001.patch
>
>
> We're starting to see the capacity scheduler run out of scheduling horsepower
> when running 500-1000 applications on clusters with 4K nodes or so.
> This seems to be amplified by TEZ applications. TEZ applications have many
> more priorities (sometimes in the hundreds) than typical MR applications, and
> therefore the loop in the scheduler that examines every priority within
> every running application starts to be a hotspot. The priorities appear to
> stay around forever, even when there is no remaining resource request at that
> priority, causing us to spend a lot of time looking at nothing.
> jstack snippet:
> {noformat}
> "ResourceManager Event Processor" #28 prio=5 os_prio=0 tid=0x00007fc2d453e800
> nid=0x22f3 runnable [0x00007fc2a8be2000]
> java.lang.Thread.State: RUNNABLE
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequest(SchedulerApplicationAttempt.java:210)
> - eliminated <0x00000005e73e5dc0> (a
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:852)
> - locked <0x00000005e73e5dc0> (a
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp)
> - locked <0x00000003006fcf60> (a
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:527)
> - locked <0x00000003001b22f8> (a
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:415)
> - locked <0x00000003001b22f8> (a
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1224)
> - locked <0x0000000300041e40> (a
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]