[
https://issues.apache.org/jira/browse/YARN-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719448#comment-13719448
]
qus-jiawei commented on YARN-964:
---------------------------------
There are 27 nodes in our Hadoop cluster.
If a disk on a NodeManager breaks, any AM assigned to that NodeManager fails,
because the AM needs to create the temp dir for the application on that disk.
This keeps happening until the disk checker removes the broken disk from the
local dirs, and the disk checker interval is 2 minutes.
I think it is quite possible for the AM to be assigned to this NodeManager
within those 2 minutes.
Also, when the cluster is full of jobs, this bad NodeManager can be the only
one that has memory available to run a container.
So it is easy for the AM to be reassigned to this bad NodeManager.
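To put rough numbers on this, here is a small sketch using the figures from
this issue (4 AM attempts, a retry about every 3 seconds, a 2 minute disk
checker interval). It assumes the worst case: the disk breaks right after a
check, every attempt fails immediately, and every attempt lands on the same
NodeManager. The function and variable names are illustrative only and are not
YARN APIs or configuration keys.

```python
# Figures taken from this thread; names are hypothetical, not YARN settings.
AM_MAX_ATTEMPTS = 4            # "Our am retry number is 4"
RETRY_INTERVAL_S = 3           # RM retries the AM about every 3 seconds
DISK_CHECK_INTERVAL_S = 120    # disk checker interval is 2 minutes


def attempt_times(max_attempts, retry_interval_s):
    """Launch times in seconds, assuming each attempt fails immediately."""
    return [i * retry_interval_s for i in range(max_attempts)]


def attempts_before_disk_check(max_attempts, retry_interval_s,
                               disk_check_interval_s):
    """Count attempts launched before the next disk check (worst case)."""
    times = attempt_times(max_attempts, retry_interval_s)
    return sum(1 for t in times if t < disk_check_interval_s)


def min_interval_to_outlast_check(max_attempts, disk_check_interval_s):
    """Smallest whole-second retry interval so that the last attempt
    starts at or after the next disk check."""
    gaps = max_attempts - 1
    return -(-disk_check_interval_s // gaps)  # ceiling division


if __name__ == "__main__":
    print(attempt_times(AM_MAX_ATTEMPTS, RETRY_INTERVAL_S))
    print(attempts_before_disk_check(
        AM_MAX_ATTEMPTS, RETRY_INTERVAL_S, DISK_CHECK_INTERVAL_S))
    print(min_interval_to_outlast_check(
        AM_MAX_ATTEMPTS, DISK_CHECK_INTERVAL_S))
```

Under these assumptions all attempts are used up long before the disk checker
can remove the broken disk, which is why a configurable retry interval would
help.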
> Give a parameter that can set AM retry interval
> ------------------------------------------------
>
> Key: YARN-964
> URL: https://issues.apache.org/jira/browse/YARN-964
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Reporter: qus-jiawei
>
> Our AM retry number is 4.
> Because one NodeManager's disk is full, the AM container cannot be allocated
> on this NodeManager. But the RM retries this AM on the same NM every 3
> seconds.
> I think there should be a parameter to set the AM retry interval.