[ https://issues.apache.org/jira/browse/YARN-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492610#comment-13492610 ]
Robert Joseph Evans commented on YARN-201:
------------------------------------------

From a quick look the patch looks good. I only have one comment.

{code}
// Reset scheduling opportunities if this was a localized request
if (assignment.getType() != NodeType.OFF_SWITCH) {
  application.resetSchedulingOpportunities(priority);
}
{code}

{quote}Reset scheduling opportunities if this was a localized request{quote}
seems unnecessary. The code is simple enough that no comment is really needed, but I would like to see a comment about why we don't reset them for off-switch assignments, so that when someone else reads through the code they can understand why we are doing this and update it accordingly.

> CapacityScheduler can take a very long time to schedule containers if requests are off cluster
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-201
>                 URL: https://issues.apache.org/jira/browse/YARN-201
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 0.23.3, 2.0.1-alpha
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-201.patch
>
>
> When a user runs a job where one of the input files is a large file on
> another cluster, the job can create many splits on nodes which are
> unreachable for computation from the current cluster. The off-switch delay
> logic in LeafQueue can cause the ResourceManager to allocate containers for
> the job very slowly. In one case the job was only getting one container
> every 23 seconds, and the queue had plenty of spare capacity.