[ 
https://issues.apache.org/jira/browse/YARN-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719448#comment-13719448
 ] 

qus-jiawei commented on YARN-964:
---------------------------------

There are 27 nodes in our Hadoop cluster.

If a disk on a NodeManager breaks, any AM assigned to that NodeManager fails,
because the AM needs to create the temp dir for the application. This lasts
until the disk checker removes the broken disk from the local dirs, and the
disk checker interval is 2 minutes.
I think it is quite possible for the AM to be assigned to this NodeManager
again within those 2 minutes.
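
To put rough numbers on this, here is a minimal sketch using the values from
this issue (4 AM attempts, a retry about every 3 seconds, a 2 minute disk
checker interval). The function names are made up for illustration, and the
model assumes the worst case: the disk breaks right after a check and every
attempt fails immediately.

```python
# Values taken from this issue; the timing model itself is a simplification.
AM_MAX_ATTEMPTS = 4          # "Our am retry number is 4"
RETRY_INTERVAL_S = 3         # the RM retries about every 3 seconds
DISK_CHECK_INTERVAL_S = 120  # the disk checker interval is 2 minutes


def time_to_exhaust_attempts(max_attempts, retry_interval_s):
    """Seconds from the first attempt to the last one (hypothetical helper)."""
    return (max_attempts - 1) * retry_interval_s


def attempts_before_disk_check(max_attempts, retry_interval_s, disk_check_interval_s):
    """Count attempts launched before the disk checker runs again (hypothetical helper)."""
    launched = 0
    for i in range(max_attempts):
        # Attempt i starts at i * retry_interval_s seconds after the first one.
        if i * retry_interval_s < disk_check_interval_s:
            launched += 1
    return launched


elapsed = time_to_exhaust_attempts(AM_MAX_ATTEMPTS, RETRY_INTERVAL_S)
wasted = attempts_before_disk_check(
    AM_MAX_ATTEMPTS, RETRY_INTERVAL_S, DISK_CHECK_INTERVAL_S
)
print(f"last attempt starts after {elapsed}s; "
      f"{wasted} of {AM_MAX_ATTEMPTS} attempts fall inside the {DISK_CHECK_INTERVAL_S}s window")
```

Under these assumptions all attempts are used up long before the disk checker
has a chance to remove the broken disk.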

Also, when the cluster is full of jobs, this bad NodeManager is the only one
that has memory free to run a container.
So it is easy for the AM to be reassigned to this bad NodeManager.
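
A small simulation of that situation, as a sketch only: it assumes the bad
node is the only one with free memory, so every attempt lands on it, and that
an attempt succeeds once the disk checker has removed the broken disk. The
function name and the 60 second interval are hypothetical and are not existing
YARN settings.

```python
def run_attempts(max_attempts, retry_interval_s, disk_removed_at_s):
    """Return a list of (start_time_s, succeeded) for each AM attempt.

    Hypothetical model: all attempts go to the single bad node, and an
    attempt succeeds only if it starts at or after disk_removed_at_s.
    """
    results = []
    for i in range(max_attempts):
        start = i * retry_interval_s
        ok = start >= disk_removed_at_s
        results.append((start, ok))
        if ok:
            break  # the AM is running, no further attempts are needed
    return results


# Today: a retry about every 3 seconds, disk removed after 120 seconds.
short = run_attempts(max_attempts=4, retry_interval_s=3, disk_removed_at_s=120)

# With a configurable retry interval (60 seconds chosen only as an example).
longer = run_attempts(max_attempts=4, retry_interval_s=60, disk_removed_at_s=120)

print("3s interval: ", short)
print("60s interval:", longer)
```

In this model the 3 second interval burns all 4 attempts on the broken disk,
while the longer interval lets a later attempt start after the disk has been
removed from the local dirs.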


                
> Give a parameter that can set  AM retry interval
> ------------------------------------------------
>
>                 Key: YARN-964
>                 URL: https://issues.apache.org/jira/browse/YARN-964
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: qus-jiawei
>
> Our AM retry count is 4.
> Because one NodeManager's disk is full, the AM container could not be
> allocated on this NodeManager. But the RM retries this AM on the same NM
> every 3 seconds.
> I think there should be a parameter to set the AM retry interval.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
