[jira] [Commented] (YARN-270) RM scheduler event handler thread gets behind

Vinod Kumar Vavilapalli (JIRA) Mon, 17 Dec 2012 19:18:16 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534586#comment-13534586
 ]


Vinod Kumar Vavilapalli commented on YARN-270:
----------------------------------------------

Nathan, unfortunately, the dispatcher framework cannot exert back pressure in 
general; each event producer needs to throttle itself.

OTOH, YARN-275 is indeed a long-term fix. NMs back off just like the TTs do in 
1.*.
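
To illustrate what "each producer controls itself" could look like, here is a rough sketch. It is not code from YARN or from the YARN-275 patch: the class name, method name, thresholds and the assumption that a producer can observe (or be told) the event backlog are all hypothetical. It only shows that an unbounded queue never blocks its producers, so any throttling has to come from the producer stretching its own send interval.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from the YARN code base.
public class ProducerBackoffSketch {

    // Unbounded queue: add() never blocks, so the queue itself exerts no back pressure.
    static final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();

    /**
     * Producer-side throttle: keep the base interval while the backlog is at or
     * below the threshold, otherwise stretch it in proportion to the backlog,
     * capped at maxMs. Assumes threshold > 0.
     */
    static long nextIntervalMs(int queueDepth, int threshold, long baseMs, long maxMs) {
        if (queueDepth <= threshold) {
            return baseMs;
        }
        // Scale the interval by how far the backlog exceeds the threshold.
        long scaled = baseMs * queueDepth / threshold;
        return Math.min(scaled, maxMs);
    }

    public static void main(String[] args) {
        // Simulate a backlog of unprocessed events (event names are made up).
        for (int i = 0; i < 5000; i++) {
            eventQueue.add("NODE_UPDATE-" + i);
        }
        long interval = nextIntervalMs(eventQueue.size(), 1000, 1000L, 60_000L);
        System.out.println("backlog=" + eventQueue.size() + " nextIntervalMs=" + interval);
    }
}
```

With a backlog of 5000 events, a threshold of 1000 and a 1 s base interval, the sketch stretches the producer's interval to 5 s. The numbers are arbitrary and only meant to show the shape of the idea.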
                
> RM scheduler event handler thread gets behind
> ---------------------------------------------
>
>                 Key: YARN-270
>                 URL: https://issues.apache.org/jira/browse/YARN-270
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 0.23.5
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>
> We had a couple of incidents on a 2800 node cluster where the RM scheduler 
> event handler thread got behind processing events and basically became 
> unusable.  It was still processing apps, but taking a long time (1 hr 45 
> minutes) to accept new apps.  This actually happened twice within 5 days.
> We are using the capacity scheduler and at the time had between 400 and 500 
> applications running.  There were another 250 apps that were in the SUBMITTED 
> state in the RM but the scheduler hadn't processed those to put in pending 
> state yet.  We had about 15 queues, none of them hierarchical.  We also had 
> plenty of space left on the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
