[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035902#comment-14035902 ]
Sandy Ryza commented on YARN-2176:
----------------------------------

The Fair Scheduler should probably avoid this as well.

Would a derived ActiveUsersManager be necessary? I've always found the degree to which AppSchedulingInfo talks to ActiveUsersManager kind of weird. Could we just expose a hasPendingRequests() method in AppSchedulingInfo? The leaf queue would check it after making an allocation and then make any necessary adjustments. I suppose if an application cancels requests, this wouldn't get reflected immediately in the leaf queue's bookkeeping, but the leaf queue could make the adjustment as soon as it observes this, which would lead to equivalent run time.

> CapacityScheduler loops over all running applications rather than actively requesting apps
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-2176
>                 URL: https://issues.apache.org/jira/browse/YARN-2176
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>
> The capacity scheduler performance is primarily dominated by
> LeafQueue.assignContainers, and that currently loops over all applications
> that are running in the queue. It would be more efficient if we looped over
> just the applications that are actively asking for resources rather than all
> applications, as there could be thousands of applications running but only a
> few hundred that are currently asking for resources.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
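
For illustration only, here is a minimal Java sketch of the hasPendingRequests() idea from the comment above. It is not taken from the Hadoop code base: AppInfoSketch and LeafQueueSketch are hypothetical stand-ins for AppSchedulingInfo and the leaf queue, and every method other than hasPendingRequests() is an assumption made to keep the example self-contained. The queue keeps a set of apps that are actively asking for resources, visits only that set on each pass, and drops an app once it observes that the app has nothing pending, including after a cancellation it was never told about.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for AppSchedulingInfo; only the proposed accessor is modeled.
class AppInfoSketch {
    final String appId;
    private int pendingRequests;

    AppInfoSketch(String appId, int pendingRequests) {
        this.appId = appId;
        this.pendingRequests = pendingRequests;
    }

    // The accessor proposed in the comment.
    boolean hasPendingRequests() {
        return pendingRequests > 0;
    }

    // Cancelling does not notify the queue; the queue notices on its next pass.
    void cancelAllRequests() {
        pendingRequests = 0;
    }

    // Satisfy one outstanding request, if there is one.
    boolean allocateOne() {
        if (pendingRequests == 0) {
            return false;
        }
        pendingRequests--;
        return true;
    }
}

// Hypothetical stand-in for the leaf queue's bookkeeping.
class LeafQueueSketch {
    // Only apps believed to be asking for resources; insertion order is preserved.
    private final Set<AppInfoSketch> activeApps = new LinkedHashSet<>();

    // Called when an app submits or updates its requests (assumed hook).
    void onRequestsUpdated(AppInfoSketch app) {
        if (app.hasPendingRequests()) {
            activeApps.add(app);
        }
    }

    // One scheduling pass over the active apps only. After each attempt the
    // queue checks hasPendingRequests() and removes apps with nothing left.
    int assignContainers() {
        int allocated = 0;
        Iterator<AppInfoSketch> it = activeApps.iterator();
        while (it.hasNext()) {
            AppInfoSketch app = it.next();
            if (app.allocateOne()) {
                allocated++;
            }
            if (!app.hasPendingRequests()) {
                it.remove();
            }
        }
        return allocated;
    }

    int activeCount() {
        return activeApps.size();
    }
}
```

In this sketch an app that cancels its requests stays in the active set until the next pass, which costs one extra visit. That matches the comment's point that the adjustment can happen as soon as the queue observes the change.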