[ https://issues.apache.org/jira/browse/YARN-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990120#comment-13990120 ]
Jian He commented on YARN-2001:
-------------------------------
If the RM starts accepting application requests before the NMs sync back, we
may, for example, run into a condition where the resource usage and capacity
limits (e.g. headroom, queue capacity) in the scheduler are not yet correct.
Until all the nodes sync back all the running containers belonging to the
app, applications/queues can potentially go beyond their limits.
It would definitely be good if we could think of a way to avoid making the RM
wait without hitting race conditions.
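For illustration only, here is a minimal sketch of the "wait for all NMs"
gate being discussed. It is not the actual RM code: the class NodeSyncGate,
its methods, and the use of plain strings as node ids are all hypothetical
names invented for this example. It assumes the RM has persisted the set of
previously known NMs and reloads it on restart.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not part of YARN: tracks which previously known NMs
// have not yet re-registered after an RM restart.
class NodeSyncGate {
    // Ids of NMs persisted before the restart that have not synced back yet.
    private final Set<String> pendingNodes;

    NodeSyncGate(Set<String> previouslyKnownNodes) {
        this.pendingNodes = new HashSet<>(previouslyKnownNodes);
    }

    // Called when an NM re-registers and reports its running containers.
    synchronized void nodeRegistered(String nodeId) {
        pendingNodes.remove(nodeId);
    }

    // True only once every remembered NM has synced back, i.e. once the
    // scheduler's usage and headroom figures can be trusted again.
    synchronized boolean canAcceptAllocate() {
        return pendingNodes.isEmpty();
    }
}
```

In this sketch the allocate path would check canAcceptAllocate() and turn
AMs away until it returns true, which is exactly the waiting that the
comment above would like to avoid if a race-free alternative exists.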
> Persist NMs info for RM restart
> -------------------------------
>
> Key: YARN-2001
> URL: https://issues.apache.org/jira/browse/YARN-2001
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Jian He
>
> RM should not accept allocate requests from AMs until all the NMs have
> registered with RM. For that, RM needs to remember the previous NMs and wait
> for all the NMs to register.
> This is also useful for remembering decommissioned nodes across restarts.
--
This message was sent by Atlassian JIRA
(v6.2#6252)