[ https://issues.apache.org/jira/browse/YARN-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202069#comment-14202069 ]
Jason Lowe commented on YARN-2632:
----------------------------------
Thanks for taking this up, Junping, and for the reviews, Varun! Further
comments on the latest patch:
In Preconditions, the wording of the first sentence is a bit off, and we're
inconsistent in the use of "nodemanager" vs. "NodeManager". I suggest
something like the following:
{noformat}
Ephemeral ports (port 0, which is the default) cannot be used for the
NodeManager address because the NodeManager may restart with a different
address.
{noformat}
"that is waitting" should be "that are waiting"
I'm wondering whether we should simply move what's currently in the
Preconditions section down into the steps for enabling NM restart. Arguably
it's just the third step in the config, and we can put a line or two of
explanation next to the instructions. That way, someone who just scans down to
the steps to enable it will also see that they have to set not only
yarn.nodemanager.recovery.enabled and yarn.nodemanager.recovery.dir but also
yarn.nodemanager.address.
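As a rough sketch of what the three settings could look like together in
yarn-site.xml (the recovery path and port 45454 below are illustrative
assumptions, not values taken from the patch):
{noformat}
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.recovery.dir</name>
  <!-- Assumed example path: a local directory the NM can write to -->
  <value>/var/lib/hadoop-yarn/nm-recovery</value>
</property>
<property>
  <name>yarn.nodemanager.address</name>
  <!-- Assumed example port: what matters is that it is fixed, not 0 -->
  <value>0.0.0.0:45454</value>
</property>
{noformat}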
The mapreduce_shuffle auxiliary service and any other auxiliary services also
need to be configured to support NM restart (e.g., avoid using ephemeral ports
or otherwise support recovering with the same address). For example,
mapreduce_shuffle uses mapreduce.shuffle.port for the shuffle port, and it,
too, doesn't support ephemeral ports for restart.
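A minimal sketch of pinning the shuffle port, assuming the property is set in
the configuration the NodeManager loads (13562 is, as far as I know, the
shipped default; any fixed non-zero port illustrates the point):
{noformat}
<property>
  <name>mapreduce.shuffle.port</name>
  <!-- Assumed example value: a fixed port rather than 0 (ephemeral) -->
  <value>13562</value>
</property>
{noformat}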
There should also be a caveat that configured auxiliary services must support
recovery; otherwise, NM functionality may be affected in those areas upon
restart.
"To enable NM Restart functionality, set ..." should just be "Set ..." because
the line just before this already says "Enabling NM Restart consists of ...".
> Document NM Restart feature
> ---------------------------
>
> Key: YARN-2632
> URL: https://issues.apache.org/jira/browse/YARN-2632
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Junping Du
> Assignee: Junping Du
> Priority: Blocker
> Attachments: YARN-2632-v2.patch, YARN-2632-v3.patch, YARN-2632.patch
>
>
> As a new feature of YARN, we should document this feature's behavior,
> configuration, and things to pay attention to.