Sandy Ryza commented on YARN-2176:

Without the ActivationCallback, the ActiveUsersManager would need to call into 
the leaf queue, which it currently doesn't even have a reference to.  It seems 
weirder to me to have an edge from the ActiveUsersManager to the leaf queue 
than to have an edge from the AppSchedulingInfo to the leaf queue - tracing 
what's going on would require more hops.  What do you think about one of these:
* Have both the ActiveUsersManager and the leaf queue register for the callback
* Have only the leaf queue register for the callback, and then be in charge of 
notifying the ActiveUsersManager (which it already has a reference to) 
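
To make the second option concrete, here is a rough sketch of how the wiring 
could look. Everything in it is a hypothetical stand-in written for this 
comment (the ActivationCallback method names, the *Sketch classes, the 
pending-request counter); it is not the patch's code or the real YARN classes, 
just an illustration of the leaf queue being the only registrant and forwarding 
to the ActiveUsersManager it already references.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical callback shape; the method names are assumptions.
interface ActivationCallback {
    void activated(String appId, String user);
    void deactivated(String appId, String user);
}

// Simplified stand-in for ActiveUsersManager: counts active apps per user.
class ActiveUsersManagerSketch {
    private final Map<String, Integer> activeAppsPerUser = new HashMap<>();

    void activateApplication(String user) {
        activeAppsPerUser.merge(user, 1, Integer::sum);
    }

    void deactivateApplication(String user) {
        // Returning null removes the user once its last active app is gone.
        activeAppsPerUser.computeIfPresent(user, (u, n) -> n > 1 ? n - 1 : null);
    }

    int getNumActiveUsers() {
        return activeAppsPerUser.size();
    }
}

// Simplified stand-in for the leaf queue: the only callback registrant.
class LeafQueueSketch implements ActivationCallback {
    private final ActiveUsersManagerSketch activeUsersManager =
        new ActiveUsersManagerSketch();
    // Only apps that are currently asking for resources live here.
    private final Set<String> activeApps = new LinkedHashSet<>();

    @Override
    public void activated(String appId, String user) {
        if (activeApps.add(appId)) {
            activeUsersManager.activateApplication(user);
        }
    }

    @Override
    public void deactivated(String appId, String user) {
        if (activeApps.remove(appId)) {
            activeUsersManager.deactivateApplication(user);
        }
    }

    Set<String> getActiveApps() {
        return Collections.unmodifiableSet(activeApps);
    }

    ActiveUsersManagerSketch getActiveUsersManager() {
        return activeUsersManager;
    }
}

// Simplified stand-in for AppSchedulingInfo: fires the callback only when
// the app crosses between "no pending requests" and "some pending requests".
class AppSchedulingInfoSketch {
    private final String appId;
    private final String user;
    private final ActivationCallback callback;
    private int pendingRequests = 0;

    AppSchedulingInfoSketch(String appId, String user, ActivationCallback callback) {
        this.appId = appId;
        this.user = user;
        this.callback = callback;
    }

    void updatePendingRequests(int newPending) {
        boolean wasActive = pendingRequests > 0;
        pendingRequests = newPending;
        boolean isActive = pendingRequests > 0;
        if (!wasActive && isActive) {
            callback.activated(appId, user);
        } else if (wasActive && !isActive) {
            callback.deactivated(appId, user);
        }
    }
}
```

With this shape, the only edges are AppSchedulingInfo -> leaf queue -> 
ActiveUsersManager, and assignContainers could iterate over getActiveApps() 
instead of over every running application.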

Sorry to be nitpicky on this pretty small thing - I've just ended up confused 
by this code multiple times and think it's worth getting right.

> CapacityScheduler loops over all running applications rather than actively 
> requesting apps
> ------------------------------------------------------------------------------------------
>                 Key: YARN-2176
>                 URL: https://issues.apache.org/jira/browse/YARN-2176
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>
> The capacity scheduler performance is primarily dominated by 
> LeafQueue.assignContainers, and that currently loops over all applications 
> that are running in the queue.  It would be more efficient if we looped over 
> just the applications that are actively asking for resources rather than all 
> applications, as there could be thousands of applications running but only a 
> few hundred that are currently asking for resources.

This message was sent by Atlassian JIRA