[
https://issues.apache.org/jira/browse/YARN-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Graves updated YARN-270:
-------------------------------
Priority: Critical (was: Blocker)
Target Version/s: 2.0.3-alpha, 3.0.0, 0.23.7 (was: 3.0.0, 2.0.3-alpha,
0.23.7)
Changing this to no longer be a blocker, since we have worked around the issue
and some of the subtasks are complete.
> RM scheduler event handler thread gets behind
> ---------------------------------------------
>
> Key: YARN-270
> URL: https://issues.apache.org/jira/browse/YARN-270
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 0.23.5
> Reporter: Thomas Graves
> Assignee: Thomas Graves
> Priority: Critical
>
> We had a couple of incidents on a 2800-node cluster where the RM scheduler
> event handler thread got behind processing events and basically became
> unusable. It was still processing apps, but taking a long time (1 hr 45
> minutes) to accept new apps. This actually happened twice within 5 days.
> We are using the capacity scheduler and at the time had between 400 and 500
> applications running. There were another 250 apps that were in the SUBMITTED
> state in the RM, but the scheduler hadn't yet processed those to put them in
> the pending state. We had about 15 queues, none of them hierarchical. We also
> had plenty of space left on the cluster.