Jason Lowe created YARN-201:
-------------------------------
Summary: CapacityScheduler can take a very long time to schedule
containers if requests are off cluster
Key: YARN-201
URL: https://issues.apache.org/jira/browse/YARN-201
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 2.0.1-alpha, 0.23.3
Reporter: Jason Lowe
When a user runs a job in which one of the input files is a large file on another
cluster, the job can create many splits located on nodes that are unreachable for
computation from the current cluster. The off-switch delay logic in LeafQueue
can then cause the ResourceManager to allocate containers for the job very slowly.
In one case the job was getting only one container every 23 seconds, even though
the queue had plenty of spare capacity.
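
To illustrate why a delay rule of this kind can throttle a job whose requests never match a local node, here is a minimal sketch of a generic delay-scheduling check. It is a simplified model, not the LeafQueue source: the class name, method names, the threshold rule (pending containers scaled by a wait factor), and the sample numbers are all assumptions made for illustration.

```java
public class OffSwitchDelaySketch {

    // Hypothetical rule: an off-switch assignment is allowed only once the
    // number of missed scheduling opportunities exceeds the number of
    // pending containers scaled by a locality wait factor in [0, 1].
    static boolean canAssignOffSwitch(long pendingContainers,
                                      double localityWaitFactor,
                                      long missedOpportunities) {
        return pendingContainers * localityWaitFactor < missedOpportunities;
    }

    // Counts how many scheduling opportunities are skipped before the rule
    // above first permits an off-switch container, assuming no opportunity
    // ever satisfies node-local or rack-local placement (as happens when
    // every requested host is outside the cluster).
    static long missedBeforeOffSwitch(long pendingContainers,
                                      double localityWaitFactor) {
        long missed = 0;
        while (!canAssignOffSwitch(pendingContainers, localityWaitFactor, missed)) {
            missed++;
        }
        return missed;
    }

    public static void main(String[] args) {
        // Illustrative values only; not taken from the reported incident.
        long pending = 1000;
        double waitFactor = 1.0;
        long missed = missedBeforeOffSwitch(pending, waitFactor);
        System.out.println("Opportunities skipped before one off-switch container: " + missed);
    }
}
```

Under this model the wait grows with the number of pending requests, so a job with many unsatisfiable locality preferences would pay the full delay for each container even when the queue has free capacity.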